Spatio-temporal coupled mode theory for nonlocal metasurfaces

Diffractive nonlocal metasurfaces have recently opened a broad range of exciting developments in nanophotonics research and applications, leveraging spatially extended—yet locally patterned—resonant modes to control light with new degrees of freedom. While conventional grating responses are elegantly captured by temporal coupled mode theory, current approaches are not well equipped to capture the arbitrary spatial response observed in the nascent field of nonlocal metasurfaces. Here, we introduce spatio-temporal coupled mode theory (STCMT), capable of elegantly capturing the key features of the resonant response of wavefront-shaping nonlocal metasurfaces. This framework can quantitatively guide nonlocal metasurface design while maintaining compatibility with local metasurface frameworks, making it a powerful tool to rationally design and optimize a broad class of ultrathin optical components. We validate this STCMT framework against full-wave simulations of various nonlocal metasurfaces, demonstrating that this tool offers a powerful semi-analytical framework to understand and model the physics and functionality of these devices, without the need for computationally intense full-wave simulations. We also discuss how this model may shed physical insights into nonlocal phenomena in photonics and the functionality of the resulting devices. As a relevant example, we showcase STCMT’s flexibility by applying it to study and rapidly prototype nonlocal metasurfaces that spatially shape thermal emission.


I. Introduction
Over the years, coupled mode theory (CMT) has become a powerful tool to model and understand the scattering response of complex resonant electromagnetic structures [1]- [2]. Central to CMT is the phenomenological identification of: (i) the resonant modes supported by the system, and (ii) ports, or channels, that carry energy to and from the resonant scatterer. In turn, CMT provides an intuitive and accurate framework to study a wide range of resonant devices. In the spatial domain [3]- [5], it is commonly used to study waveguide modes and their coupling; while in the time domain temporal coupled mode theory (TCMT) [6]- [7] has found marked success in modelling spectral features such as Fano resonances and the construction of scattering matrices. This technique can be applied both to localized resonances [8]- [11] and to extended resonances supported by metasurfaces, photonic crystal slabs and subwavelength gratings [12]- [17]. However, in TCMT the spatial distribution of the mode is ignored: the modal amplitude a is assumed to be function only of time. This feature in turn limits TCMT to studying the spectral features of the device, while detailed spectro-spatial information of the modes themselves is not captured. For this reason, TCMT is primarily effective at studying infinitely periodic grating structures, while finite and spatially varying gratings cannot be straightforwardly modeled.
The spectro-spatial properties of photonic resonant systems are important in various settings, particularly when the optical energy is correlated across distances larger than the wavelength, i.e., when nonlocality, also known as spatial dispersion, cannot be neglected. In turn, these systems exhibit strong dependence of the optical response on the momentum of light [18]- [19].
A sharp response in momentum space is referred to as nonlocal because the corresponding extent of the resonant modes in real space is necessarily broad. Nonlocality in the material permittivity has been studied in plasmonic systems for decades [20]- [23], associated with delocalized material responses that link the polarization field at a given point to the electromagnetic fields at neighboring points. More recently, epsilon-near-zero materials [24]- [26] and metamaterials [27]- [31] have been 4 nonlocal metalens with hyperbolic phase distribution across the aperture can selectively engage only fields originating from its focal spot, while it remains transparent for other wavefront shapes [55].
This spatially selective response, for which the metasurface is only resonant for specific incoming wavefront shapes that match its local pattern of perturbations, generalizes the angular selectivity well-known for resonant gratings, which is limited to plane waves. Since their response is spectrospatial in a nontrivial way, however, it cannot be captured by TCMT. Rather, a new theoretical framework featuring a spatially varying modal parameter ( ) , a x t is required [ Fig. 1(b)]. Figure 1(c) depicts an example nondiffractive nonlocal metasurface (based on Ref. [54]), whose meta-unit cells are subwavelength and transversely invariant, hence compatible with a TCMT description. Figure   1(d) on the contrary depicts an example diffractive nonlocal metasurface based on the same metaunit cells, but with a spatially varying orientation angle (from left to right), incompatible with a TCMT description and requiring a generalized CMT model.
To address this need, here we introduce a general spatio-temporal coupled mode theory (STCMT) that extends TCMT to capture spatially varying resonant modes. We show excellent agreement with full-wave simulations of spatially selective diffractive nonlocal metasurfaces [55], demonstrating the validity and relevance of this model. Our theory introduces a nonlocality length, which quantifies the degree of nonlocality of a given device, and it establishes how this parameter directly controls the degree of spatial selectivity. We clarify two distinct regimes of operation for nonlocal photonic devices, governed by the numerical aperture of the anomalous diffraction encoded into the device, and we demonstrate the relation between spatial selectivity and the supported eigenmodes. As such, STCMT elegantly captures the fundamental working mechanisms of spatially varying resonant photonic systems, and it provides physical insights into the nature of nonlocality in this emerging class of novel devices, enabling rational designs and rapid, computationally efficient analysis of next-generation photonic systems.

II. Spatio-temporal coupled mode theory
In the following, we develop STCMT by writing down the appropriate dynamical equations, determining the constraints stemming from time-reversal invariance, reciprocity and energy conservation, and then computing the scattering from a general nonlocal device by appropriately specifying its parameters.

II.A Dynamical equations
6 We obtain the spatio-temporal dynamical equations by appropriately manipulating and tailoring standard TCMT equations [6]. Under an i t e  − time convention, and focusing on a single resonance, these equations are Equation (1) describes the dynamics of the complex modal amplitude () at excited by an incoming () at is normalized as the energy stored in the q-BIC per unit area at time t and ss ++ is the incident intensity. The in-coupling vector  contains the coupling coefficients to the mode from free-space excitation, and 0 i  = − is a complex frequency, with 0  being the (resonant) modal frequency and  being the total scattering rate, which we take to be purely radiative throughout. Extension to include absorption loss is straightforward. Equation (2) describes the dynamics of the outgoing (scattered) waves s − (also normalized such that ss −− is the outgoing intensity) as the interference of the "background" scattering, described by the matrix C , and the mode leaking to free space, described by the out-coupling vector d (see Supplementary Section S1 for additional background). Notably, the first-order nature of the temporal dynamics in conventional TCMT is rooted in Maxwell's equations, regardless of the modal properties [1].
In contrast, the dynamical equations for a device with space-varying modal properties must inherently depend on the nature of the mode itself. In other words, we need at the outset to know the form of the dispersion of the mode in order to capture its spatial properties.
We expect, for instance, the phenomenology of a travelling wave to differ from the one of a standing wave, and hence we expect distinct dynamical equations. The most general approach to capture this dependence is a Taylor expansion for the resonance frequency in terms of its momentum k dependence: ( ) 22 For instance, a traveling wave is captured by the linear term with speed c , while a standing wave with parabolic dispersion, such as near a band edge, is captured by the coefficient b . As detailed in Supplementary Section S2, transforming Eqns. (1) and (2) Equations (4) and (5) are the dynamical equations for STCMT, including spatial dispersion in both the coupling properties (convolution terms on the right-hand sides) and the modal properties (spatial derivative terms on the left-hand sides). We note that this is can be readily extended to higher order expansions if needed, but in the most common scenarios the coefficients in Eqn. (4)  Next, we specialize our general STCMT to capture the novel functionalities demonstrated by recent diffractive nonlocal metasurfaces, as in Refs. [51]- [58]. These devices leverage q-BIC modes, whose Q-factors are sufficiently large such that the background varies slowly compared to the resonant features [6]. In their infinitely periodic implementation, they abide a TCMT description with 12 0 c  === near normal incidence. In other words, they are well captured by a parabolic band with a Q-factor that is not a function of the incident angle [62]. Their design assumes that the scattering of a q-BIC can be spatially tailored by locally changing a small symmetry-breaking perturbation, and that distant perturbations do not affect the local polarization or phase [52]. Hence, the nonlocality of these devices is assumed to be purely modal, or captured by the spatial derivatives in the left-hand side, while the coupling to and from the mode is spatially instantaneous (local): Together, these assumptions yield the dynamical equations where we simplify the notation to 0  = . Supplementary Section S2b derives the same set of equations using the leading order terms of a cosine series expansion.

II.B Scattering Matrix, Propagator and Green's Function
In TCMT, the scattering matrix is obtained by assuming time-harmonic solutions, solving Eqn. (1) for ( ) at , and then inserting the result into Eqn.
(2) to yield [6] ( ) ( ) ( ) In comparison, in STCMT we must solve Eqn. (7) before inserting it into Eqn. (8). To do so, we recognize that, when no source is present, we obtain a Schrödinger equation with complex potential 0 Vi  =+, recognizable by the rearrangement: Compared to the conventional Schrödinger equation, we have 1 = and 1/ bm = , consistent with the expectation that the band curvature plays the role of an inverse effective mass. We also note that non-Hermiticity (a complex valued potential) is expected of an open or damped system.
In absence of the potential V , we have the equation for a free particle, for which the solution is described by the propagator K and the initial condition ( The addition of a spatially constant V adds a phase factor independent of x . Hence, in our case the propagator is ( ) In addition to solving the homogenous equation, the propagator yields the Green's function t G for the inhomogeneous equation (7): 10 where ( ) t  is the Heaviside step function. Using these functions, we may apply physical constraints such as conservation of energy (which takes the form of a continuity equation), time-reversal invariance and reciprocity (see Section II.C and Supplementary Section S4).
In practice, we are interested in the response to monochromatic waves [ seek the solution to the scattering problem in the space-frequency domain. In this case, Eqn. (7) becomes and we seek the Green's function satisfying which yields the modal amplitude where the scattering kernel  correlates input positions x with output positions x , and where the closed form of the Green's function is parameterized by the complex coefficient

11
This quantity was originally introduced to capture the spatial coherence of thermal metasurfaces, derived with alternative methods [56]. Its real part takes an especially simple form at the band edge which we call the nonlocality length. This length is the characteristic distance that light travels inplane before scattering out when engaging with the band-edge mode; in other words, it is a quantitative measure of the nonlocality of the metasurface at the band-edge frequency. Since nonlocality is equivalently defined as a dependence on the incident wavevector, no optical interface is truly local. However, as a rule of thumb, if the nonlocality length is larger than the wavelength in the surrounding media, the resulting device may be considered nonlocal, since its nonlocality cannot be neglected (see Supplementary Section S3 for a comparison of a nonlocal metasurface to a bare interface between two media).

II.C Physical Constraints
Before proceeding with the application of STCMT to practical metasurface devices, we must determine the relevant physical constraints to our system, including time-reversal invariance, reciprocity and conservation of energy. In TCMT, these constraints impose the well-known requirements [6] 2 dd These results hold, for instance, in infinitely periodic devices supporting q-BICs (see Supplementary Section S1). As detailed in Supplementary Section S4, we find that in our STCMT all these conditions hold locally, i.e., We stress that this result is not obvious a priori and, indeed, it does not hold for the general case described by Eqns. (4) and (5). Future work may explore systems with nonlocal coupling (i.e., devices for which Eqns. (6) do not hold), in which case Eqns. (28) and (29) are no longer valid.
The significance of this result for modelling diffractive nonlocal metasurfaces should not go underappreciated. This property stems from the shift-invariance of the Green's function in Eqn. (21) , meaning that the spatial dependence of the nonlocal scattering is wholly contained in the coupling The assumption that the modal properties 0 , , b are space-invariant, i.e., that the local perturbations tailoring the spatial response of the metasurface do not affect them, and that the coupling is spatially instantaneous [Eqn. (6)], confer a major simplification to model this class of systems and enables the introduction of a semi-analytical model to describe them. In particular, each unit cell of a nonlocal metasurface may be modeled as if it were in an infinite array, allowing easy correlation between geometric and phenomenological degrees of freedom, compatible with TCMT in the momentumfrequency domain. Then, a given space-varying nonlocal metasurface may be modelled by simply specifying a spatial profile of the coupling coefficients, wherein the correlations across the surface are captured in STCMT by the shift-invariant Green's function. This design flow is compatible with the foundational design principles of local metasurfaces [63], wherein a library of pre-computed optical scatterers may be arrayed across the surface as if they operate independently of their nearest neighbors. This ubiquitous design approach is sometimes called the "local phase approximation". Yet, we find that, when Eqn. (6) holds, the same design principles are applicable even in the much more complex scenario discussed in this work, considering arbitrarily nonlocal responses due to a guided mode.

II.D Eigenwaves
STCMT enables us to rigorously define characteristic 'eigenwaves' for nonlocal metasurfaces with arbitrarily varying spatial profiles. An eigenwave can be defined as a spatially varying wavefront that a nonlocal metasurface is selective for: maximal reflectance is achieved when the eigenwave is incident, and the reflected wave is the eigenwave's time-reversed copy (preserving coordinate system handedness, in contrast to specular reflection) [55]. Using 11

 =
as the reflection kernel for light incident from side 1, the eigenwave from side 1 is solution to the relation or equivalently Combing Eqns. (30) and (31) we find suggesting that the eigenwave of interest is the eigenvector of that has the largest eigenvalue eig R , i.e., the eigenvector with maximal reflectance. Equation (33) implies the existence of higher-order eigenwaves that reflect to their time-reversed copy but with non-maximal reflectance. It also suggests the existence of eigenwaves for frequencies other than the band-edge frequency 0  .

III. Diffractive Nonlocal Metasurfaces
In this section, we apply the introduced STCMT to illustrative examples, confirming that our analytical framework reproduces the results of full-wave simulations, quantitatively captures the underlying physics and enables rational designs for relevant functionalities. Schematically depicted 14 in Figure 2(a), we explore systems based on Refs. [54] and [55], In particular, we use the chiral q-BIC system in Refs. [54] and [55], in which the response was shown to: (i) have zero background reflection, (ii) exclusively engage circularly polarized light of one handedness, and (iii) decay with a geometric phase of 2 for a meta-unit characterized by in-plane orientation angle  . After judiciously choosing the geometrical parameters defining the ellipses, we may use a scalar basis (i.e., ignore the opposite handedness of circularly polarized light, which does not engage the resonance), and up to a constant reference phase we have (see Supplementary Section S1) where the first element is the coefficient scattering to side 1 and the second element is the coefficient scattering to side 2. In diffractive nonlocal metasurfaces, we allow  to vary spatially: For a system based on the meta-units in Fig. 2 (and Refs. [54]- [55]), this yields [from Eqns. (20) and . (36) To improve clarity, in what follows we detail the behavior for light incident from side 1, in which case we may separate the complex scattering kernel into a complex reflection kernel 11

 =
and a complex transmission kernel The other kernels 12  and 22  are treated equivalently, but omitted here for brevity. Figures 2(g,h) show the magnitude and phase of 11  for the meta-unit corresponding to Figs

III.A Linear phase gradient: analytical model
In this sub-section, we focus on nonlocal metasurfaces with a linear phase gradient. In particular, we consider a gradient in the geometric angle following . (38).
In this case, the nonlocal kernels take the form  Schematic of the coupled mode theory approximation of the scattering in (a), wherein independent scattering events occur at the interfaces, each of which may generally couple to both side 1 (down) and side 2 (up). Reflectance (c) and reflection phase (d) in the momentum-frequency response of an example meta-unit, computed by Finite Difference Time Domain (FDTD). Reflectance (e) and reflection phase (f) in the momentum-frequency response of an example meta-unit, modeled using TCMT. Amplitude (g) and phase (h) of 11  in the space-frequency domain of the same metaunit, a key object for modeling this system using STCMT.
When 0 G k = , we recover the nonlocal reflection kernel without phase gradient, which we denote with the subscript 0 : Since a phase gradient provides a spatially constant momentum kick, it is natural to study these devices in momentum space. In particular, we seek the scattering matrix ) ,( ) , where k and k are the basis wavevectors of the incoming and outgoing wavevectors, respectively, and the indices refer to the top and bottom sides of the metasurface. We then need to relate this scattering matrix to the scattering kernel  , which, as shown in Supplementary Section S5, is the mixed Fourier transform Using this relation, the scattering components for a metasurface without phase gradient are Notably, this result is consistent with TCMT, with the explicit addition of a Dirac  term enforcing scattering only when kk  = (conservation of momentum in a specular process).
Meanwhile, the scenario with nonzero which implies, due to the basic properties of the Fourier transform The impact of a nonlocal phase gradient is to shift the scattering matrices in k-space, in both the input , and from side 2 only when 2 . Hence, our results confirm that the anomalous reflection is associated with the 2 m =  diffraction orders, and the direction of momentum kick imparted by the nonlocal metasurface depends on which side light is incident from.

III.B Linear phase gradient: numerical simulations
We now validate our analytical model numerically for 2 /  In contrast, in the presence of a phase gradient the device retroreflects light at the resonance frequency, as shown in Fig. 3(c). From side 1, the band edge mode is shifted in k-space by G k , meaning that the resonance only occurs for This shift, as well as the anomalous reflection, is encoded in  and  within the phase information phenomenon, requiring vertically asymmetry in the structure to achieve (see also the discussion in [48]).

III.C Nonlocal metalens
While the previous section confirmed that our STCMT extends conventional TCMT modeling to phase gradients, we now demonstrate how this accurate modeling is actually retained even when G k varies arbitrarily across the device [55]. We consider nonlocal metalenses as a canonical example of spatially varying nonlocal metasurfaces of broad relevance for applications. We first analytically study the eigenwaves of nonlocal metalenses, and then numerically demonstrate excellent agreement with the spatial selectivity demonstrated in Ref. [55]. Our validation with full-wave simulations then motivates us to use STCMT to more thoroughly study nonlocal metalenses with this quick yet accurate tool, shedding more light on their operation. In particular, we demonstrate that, as a function of the band structure b , the Q-factor 0 r Q  = , and the numerical aperture of the metalens NA , a nonlocal metalens can transition between two regimes of operation: wavefrontselective and wavefront-shaping. We also explore in Supplementary Section S10 the eigenwaves for frequencies off the band edge, showing their relation to spatial selectivity and the underlying band structure of the q-BIC.
In this section, we study nonlocal metalenses described by hyperbolic phase gradients of the form To find the eigenwave we use relation (33), which uses the terms ,, The eigenwave of the metalens is therefore the solution to By inspection, as W →  (i.e., ignoring boundary effects), we have because in this case which is identical to the spatially invariant q-BIC modeled by the subscripted kernels. For finite W and radiative boundaries, Eqn. (52) does not yield a perfectly unity result, as expected for a finite device and as discussed in Supplementary Section S8. Nevertheless, the idealized solution (51) captures the underlying idea: this nonlocal metalens is selective to a point source set at zf = , just as described in Ref. [55]; the eigenwave is the condition in which the phase profile is cancelled everywhere across the device, yielding the same reflectance analogous to a normal incident plane wave exciting a periodic grating.
However, our quantitative definition of an eigenwave introduced here allows us to numerically describe non-ideal eigenwaves for finite structures, including radiative boundary conditions. To do so, we employ again the discrete matrix form of our STCMT. The nonlocal reflection matrix for this scenario is seen in Fig. 4(a,b), from which the eigenwave eig E is computed, depicted in Fig. 6(c) overlaid with the reflected wave ( eig E  ), which is the time-reversed copy of the input as expected.
We also may numerically compute the reflectance due to waves incident from side 1 with Ref. [55], we numerically compute the reflectance as a function of wavelength and 0 z [ Fig. 4(d)], and find remarkable agreement with the results from full-wave simulation in Fig. 4(e). The peak reflectance in the STCMT is near-unity, while full-wave simulations predict peak reflectance near 24 70% . This shows that the metalens has room for optimization in its design. Regardless, our STCMT can clearly capture the response of the nonlocal metalens. Next, we explore the response of the system in Fig. 2 as a function of b , Q , and numerical aperture, , where n is the index of refraction of the surrounding medium. The NA characterizes the range of deflection angles encoded into the q-BIC, and hence it also characterizes the range of resonant wavelengths across the device for normally incident light. As the NA increases, the part of the nonlocal metalens resonating at a given frequency is reduced, and the reflectance also diminishes. In other words, as expected, spatial selectivity increases with the lens numerical aperture.
This feature is confirmed numerically in Figs Importantly, the transition from wavefront-selective to wavefront-shaping is also a function of the degree of locality. When the nonlocality length is substantially larger than the free-space wavelength, the device is highly wavefront-selective, i.e., it only acts as a wavefront-shaping device for very low NA . But when the nonlocality length is comparable to the free-space wavelength, i.e., it supports a large degree of a localization, the device is highly wavefront-shaping, i.e., it acts as a wavefront-shaping device for a large range of NA . We find (Supplementary Section S11) that the wavefront-shaping regime is satisfied when or equivalently We therefore may quantitatively conclude that the degree of spatial selectivity is due to the degree of nonlocality of the q-BIC. Supplementary Section S12 further quantifies this feature by discussing the expansion in the eigenbasis of the device, an analysis related to modal expansions in optics [64].
We now explore devices with varying b , Q , and NA to demonstrate the transition between wavefront-selective and wavefront-shaping recipes. Figure 5(e) shows the peak reflectance for a normally incident plane wave 0 z =  for a device with NA 0.1 = as a function of b and Q : when they are both large, i.e., a highly nonlocal device, the plane wave is transmitted unperturbed through the device, indicating wavefront-selectivity. When they are both small, i.e., a highly local device, the plane wave is anomalously reflected with near-unity efficiency, and the device is wavefront-shaping. The
Engineered nonlocalities have been recently shown to provide a powerful platform to shape thermal emission and photoluminescence from metasurfaces, and STCMT may enable rapid study and prototyping of focused incoherent light emission with respect to the defined nonlocality length and the design parameters. To tackle this problem, we need to extend the STCMT formulation in three ways: we (i) account for polarization ports in a reflection-and absorption-only scattering problem, , and we allow the background scattering to follow an idealized geometric phase metasurface implemented by pillars with orientation angles ( ) Meanwhile, for perturbations governing the q-BIC defined by an orientation angle  , the coupling coefficients are [66] ( ) Finally, to account for the presence of loss, we make the simple adjustment For clarity, we may also separate the scattering kernel into its local and nonlocal components where 28 Then, assuming that the power not reflected is absorbed by the device, the absorption of a given input wave can be simply computed as while the corresponding thermal emission is described by the time-reversed wave, following the universal modal radiation laws [67]. for three example Qfactors.
As a canonical example, Fig. 6 studies a thermal metalens that preferentially emits to a focal point f with a circular polarization, encoded using the profiles [56] ( ) for the chosen polarization   1 / 2 i . Figures 6(a,b) show Beyond specific examples, we may rapidly compute the full width at half maximum (FWHM), x  of the thermal metalens response as a function of Q-factor and NA . Figure 7(a) shows how 0  and the peak absorption peak depend on Q . Notably, we see a drop off in peak around the point where 0 W  = , i.e., when the nonlocality length is comparable to the aperture size (width) of the lens.
This naturally stems from incomplete interference: the finite size of the lens is too small, and the q-BIC leaks out the sides of the metalens instead of building up a complete Fano resonance through destructive interference. Figure 7(b) shows the absorption at the focal plane 0 zf = as a function of Q , showing a continuous trend consistent with Fig. 6(d) when 0 W   . However, when 0 W   we notice instead that the FWHM plateaus, as confirmed in Fig. 7(c) above a certain Q . Below that threshold Q-factor, the target NA , together with Q , determine x  ; above that threshold Q-factor, x  is determined only by the NA . From this, we learn that increasing the Q-factor does not linearly improve the performance of a thermal metalens. Instead, there is a threshold Q above which the focal spot does not continue to tighten; instead, increasing the Q-factor only reduces the intensity of the response. While the transition near 0 W  = is gradual, yielding a 'gray area' rather than a sharp threshold, the distinction between the two regimes is clear. Figure 7 suggests, intuitively, that to improve the response further (i.e., reduce x  while maintaining 1 peak  ) we must increase the metalens aperture (W ) while also increasing the coherence across the aperture ( 0  ).

V. Discussion and Conclusion
TCMT is the foundational model and framework for studying infinite, periodic resonant flat optics. In  In addition to its practical advantages, among the key benefits of CMT is the potential to add quantitative and qualitative understanding to the underlying physics of a resonant response. STCMT captures and provides insight into several novel aspect of recent efforts on nonlocal metasurfaces: directly stemming from STCMT, we clarified that: (i) the two hallmarks of a nonlocal phase gradient (shift of band structure in input angle, and anomalous reflection) are simple consequences of a mixed Fourier transformation in the context of STCMT; (ii) a symmetric response after such mixed Fourier transformation is enforced by reciprocity, revealing that a near-unity efficiency nonlocal phase gradient is a vertically asymmetric phenomenon, in turn revealing the requirement of vertical asymmetry in the structure; and (iii) that the spatial selectivity is intrinsically tied to the eigenwave encoded into the metasurface, to a degree controlled by the nonlocality length 0  of the q-BIC (a quantitative measure of the lateral interaction length, thereby setting the minimum metasurface size required to observe a Fano response, as seen in Fig. 7(c)).
While we demonstrated the utility of STCMT for understanding and studying diffractive nonlocal metasurfaces, we emphasize that this represents a unified framework capable of describing metasurfaces with or without the assumption of locality. For instance, the local assumption is recovered if only the main diagonal of the nonlocal matrices is populated, and the effects of nearest neighbors are included when more elements than just the main diagonal are nonzero. As discussed further in Supplementary Section S14, our framework may enable future work explicitly taking into account neighbor-neighbor (nonlocal) interactions-that is, to go beyond the so-called "local phase approximation" foundational to metasurface design (perhaps by extension to quasi-normal modes [65]). Additionally, our framework is readily adapted to include both local and nonlocal metasurface behaviors simultaneously, allowing rapid study and prototyping of thermal metasurfaces originally reported in Ref. [56] and recently experimentally demonstrated in Ref. [66]. Future directions may also extend our results to study the resonant dynamics of several spectrally overlapping nonlocal modes (as in Ref. [7]), leaky-wave metasurfaces based on travelling wave q-BICs [67], the effects due to partial coherence (i.e., study the effects of nonlocal scattering on the cross-spectral density [69]), and even nonreciprocal behavior [70]. Finally, Supplementary Section S15 discusses the limitations of the present study.
In summary, in this work we have extended the coupled mode theory formalism to capture the resonant physics of space-varying metasurfaces. We demonstrated excellent agreement with full-wave simulations of devices within the rapidly emerging paradigm of diffractive nonlocal metasurfaces, which are capable of wavefront and thermal emission manipulation via the physics of q-BICs. This emerging platform is capable of unprecedent control of light for applications such as augmented reality, secure optical communications, compact optical sources, and nonlinear and quantum optics. STCMT elegantly captures the essential physics of these systems, and enables rapid, analytical and numerical study and design of next-generational meta-optical system.  (1)

Spatio-temporal coupled mode theory for nonlocal metasurfaces
The coupling vector  contains the coupling coefficients from free-space excitation,  r is the (resonant) modal frequency of the q-BIC,  r is the radiative scattering rate, and  i is the nonradiative scattering rate (which we will take to be 0 throughout). We note that the scattering rates may be used interchangably with a corresponding lifetime, e.g.  = 1/ rr is the radiative lifetime of the mode. We also wish to know the outgoing (scattered) waves − s , which are also normalized so that −− ss is the outgoing intensity. We assume time- Then, if in the absence of the q-BIC the outgoing wave is given by where ) ( C is the background scattering matrix, the contribution from the decaying q-BIC means − s must be given by

43
Equations (3) and (5) form the general basis of the TCMT in the frequency domain [1], [6]. A more specific TCMT is then developed by (i) applying relevant physical constraints and (ii) parameterizing the system of interest with phenomenological parameters. In doing so, we will retrieve the scattering matrix S relating the incoming waves to the the outgoing waves, parameterized by quantities rationally controlled through the symmetry properties of q-BICs. Regarding (i), here we are interested in physical systems subject to the constraints of reciprocity, energy conservation, and time-reversal [6]. Reciprocity demands that the coupling into the q-BIC must be equivalent to coupling out, namely, Energy conservation means that the intensity coupling out of the q-BIC is constrained by the radiative decay rate, in particular,  = 2 r dd .
Finally, for time reversal invariant systems, the equations describing a decaying q-BIC with no driving (incoming) waves must, under time reversal, match the equations describing a q-BIC being populated at the radiative decay rate without scattering to any outgoing waves, which requires 44 We may now eliminate the modal amplitude in Eqns. (3) and (5), and then apply the reciprocity condition in Eqn. (7) to obtain the scattering matrix

S2a. TCMT for the device class of interest
We are interested in particular in the local scattering matrix of the form: where we include the background scattering phase factor for convenience, allowing us to write Eqn. (10) as The development of the TCMT now proceeds by (1) writing down a phenomenological form of d relevant to the geometry in Fig. S1 and then (2) applying the constraints of conservation of energy, coordinate system invariance, and time-reversal symmetry to determine the complex amplitudes in d . We begin with step (1) guided by the selection rules. Seen in Fig. S1(a), we study the q-BIC system from Refs. [3]- [55], wherein a symmetric photonic crystal slab, sandwiched between an identical superstrate and substrate, is perturbed distinctly at the top and bottom interfaces. In the absence of this perturbation, the system is two-port symmetric and the q-BIC is fully bound. The presence of a perturbation breaks this symmetry, meaning that scattering to free-space is introduced in such a way that the scattering to sides = and, without loss of generality, we may parameterize the perturbation strengths as where  0 is a measure of the overall perturbation strength and  is an angular parameter determining the relative contribution from each interface. By construction (or definition for a q-BIC), the Q-factor (or radiative lifetime) is controlled by the overall perturbation strength, and hence we expect   where the first two coefficients represent scattering directly from an interface to their respective ports and the latter two represent indirect scattering from an interface through the slab and to the opposite port. The direct scattering coefficients are not equal in magnitude because the perturbation strengths are different (parameterized by  ), but by the symmetry of the unperturbed system (and thereby, the mode profile), a symmetric mode will scatter with the same direct phase  . An anti-symmetric mode will scatter with opposite phase, which may accounted for with  . The indirect scattering coefficients [the latter two in Eqn. (18)] may be written without loss of generality in terms of their direct counterparts and a relative amplitude g and phase factor  . In our case, the phase factor is   = − /2, following the phase relation of reflected (direct) and transmitted (indirect) light in Eqn. (11) for the vertically symmetry (unperturbed) structure.
We now begin step (2) of developing the TCMT by applying physical constraints by applying the consequence of energy conservation, (19) and time-reversal symmetry, to determine the allowed forms the complex coefficients. From Eqn. (19) we have which, upon inserting Eqn. (17) and Eqn. (18), leads to  = + To simplify the final form, we now note that for the system at study here, we operate such that the resonant frequency  r is near the transmission peak of a Fabry-Perot background,  (26) And thereby, we arrive at the following form for the scattering coefficients, s ) )) )) )) ) That is, relative to the factor i , real part of the upward scattered state's Jones vector is given by the linear polarization state  with strength with our case specifically parameterized by  (30) where we note that the frequency difference  may be adjusted to account for nonradiative loss rate  i by the replacement ( ) Finally, we may simply extend this model to the momentum-frequency domain by the replacement  → () rr k and by taking the incoming and outgoing wave basis as planewaves with in-plane momenta k . Here, near normal incidence (i.e., a band edge) the resonant frequency may be approximated parabolically as where  0 is the resonant frequency at normal incidence (i.e., the band-edge frequency) and b is the Taylor expansion coefficient describing the curvature of the band near the band edge. Figure S1(b) shows the reflection calculated by the FDTD, and Figure S1(e) shows the reflection using the TCMT model fit to the response of this device. Similarly, the phase upon reflection is shown in Fig. S1(c) for FDTD and Fig. S1(f) for TCMT. This excellent agreement validates the use of the analytical form, and motivates its extension to cover the more interesting spatially varying cases shown in Ref. [2].

S1c. Circular polarization
In the main text, we are principally interested in the case where  = /4 and    =+ /2. In this case, Eqn. (28) becomes which may be written as  (36) as used in the main text.

S2a. Using a Taylor Expansion
Well within the purview of TCMT is modelling infinitely periodic structure defined by a band structure in momentum-frequency space. In this case, simple transformation yields: where we introduce the complex resonant frequency  = + 0 i , and assume that the time dependence of the broadband, background scattering is instantaneous, [6]). Next, we explicitly include the momentum dependence  (41) With Eqn. (41), Fourier transforming Eqns. (39) and (40) gives On the left-hand side, the spatial derivative terms are obtained from the well-known Fourier transform properties that the nth derivative satisfies is the Fourier transform of ( ) fx. While on the right-hand side we use the convolution theorem.

S2a. Using a cosine series expansion
Since we know ahead of time that the band structure of a q-BIC in an unperturbed nonlocal metasurface is periodic in momentum space, and is even by reciprocity, another natural choice for expanding the resonant frequency is the cosine series: where here we keep the loss constant for simplicity, n b are coefficients and  = 2 n P xn (46) for a lattice periodicity P . Then, after Fourier transformation as in the previous section, we now obtain (47) Now, keeping only the leading order terms = 0,1 n , we find Noticing that a well-known approximation for a second derivative is which becomes an equality in the limit that Given the Taylor series expansion of the cosine function, this result is well within expectation.

S3. Comparison to bare interface
At the band-edge frequency,  = 0 , the nonlocal kernel drops to a factor −1 e of its maximum value at a distance equal to This value, which we call the nonlocality length, is the characteristic distance across which the optical response is correlated for the band-edge mode. In general, if this value is sufficiently small, optical power is well-localized, and the metasurface may be considered local. For comparison, we consider scattering of s-polarized light from free-space to a bare interface of a dielectric with refractive index n , described by the Fresnel coefficient The nonlocal reflection kernel may be computed by taking the Fourier Transform of Eqn.
(52) (setting s r to 0 when  0 kk ). For comparison, Fig. S2(a,b) reproduce the Green's Function of the device from the main text, also shown in Fig. S1. Figures S2(c,d)

S4. Physical constraints
Here we show the derivation that the usual TCMT constraints, hold locally:

S4a. Equation (56)
We begin with Eqn. (56), which is derived using conservation of energy. In the case without incoming field, i.e., + = 0 s , the dynamical equation (7) in the main text becomes the Schrodinger equation (11) in the main text: (59) and the out-coupled wave is given by which carries a scattered power following Conservation of energy for such a PDE takes the form of a continuity equation x t s x dt j (62) where the "current" is given by 59 where we leave the ( ) , xt dependence implicit for brevity. Note: this is consistent with conventional quantum mechanics with the replacement = 1 and = 1/ bm .
By the chain rule, the first time in Eqn. (62) can be written Using Eqn. (59) and its complex conjugate, we obtain Insertion into Eqn. (62) gives Which, in conjunction with Eqn. (61) finally obtains

S4b. Equation (57)
Next, we obtain Eqn. (25) using time-reversal invariance and Eqn. (24). We again consider the case without an incident source, but this time consider the time evolution of the mode subject to the initial condition: ( ) ( ) where and which decays to an outgoing wave Using the well-known behavior of Eqn. (69) that ) ,, ) ( , , where  k and k are the basis wavevectors of the incoming and outgoing wavevectors, respectively, and subscripts delineate the two sides of the metasurface interface. For a lossless, reciprocal, system we have by reciprocity, )) In words, in a lossless system the sum of the reflectance and transmittance to all output momenta 1 k must be unity for any input momentum In the space frequency domain, we correspondingly have We now seek the relationship between ij S and  ij . Since we wish to compare this to the scattering matrix elements in a plane wave basis, we use the test incident plane wave in which case Eqn. (84) shows, for instance, the reflected field is simply the Fourier transform of the nonlocal reflection kernel. We may then decompose the reflected field into its constituent plane waves by an inverse Fourier transform which finally gives the reflection coefficient as That is to say, the elements of the scattering matrix may be computed by a mixed Fourier transform of the nonlocal kernels, confirming that the nonlocal kernels fully contain the information required to describe a nonlocal metasurface in the space-frequency domain.
Likewise, inverting this relation allows retrieval of the nonlocal kernel given a known scattering matrix.

S7. Matrix form on a discrete grid
Next, we show that the continuous form of the STCMT equations may be translated to discrete, or matrix form. This is done for convenience of numerical computation, and is in keeping with the tradition of metasurfaces as being built from discrete subwavelength building blocks (meta-units).
Consider, for instance, the scalar case wherein the modal parameters are invariant. For such a metasurface having meta-units at positions we are interested in the response to incident electric fields of the form where  = , after interacting with a nonlocal metasurface described by the discrete where  = () ii x and the positive (negative) sign applies to light coming from port 1 (2). For instance, in Sec. III.C the phase functions followed  x . Throughout we assume that the n meta-units are equally spaced such that The reflected and transmitted fields may be calculated by Therefore, the period a should be sufficiently small so as to shift these spurious modes out of the frequency range   of interest, which will be the case if Note the dependence on b : as the mode becomes more localized, a finer discretization is naturally required. The period a must also be small enough to sufficiently sample the maximum phase gradient of the device, as usual for metasurfaces: where  3 p N is the desired number of phase points to be sampled as phase evolves across  2 within a distance 0 P .

S8. Boundary conditions for finite and infinite metasurfaces
The results of the previous section are in practice incomplete: we must take care to specify the boundary conditions. We are interested here in two boundary conditions: (i) radiative boundaries and (ii) periodic (Bloch) boundaries. We will consider each in turn. (We note that reflective boundaries may also be of interest, for instance to model guided mode resonance gratings placed between distributed Bragg reflectors [5].) In case (i), we wish to study devices of finite size. For a finite metasurface, the energy in a q-BIC at position  x is not correlated with positions x that exist outside the metasurface.
Hence the value of  or  must vanish accordingly. Happily, this boundary condition happens naturally following the procedure of the previous section. For instance, we consider the matrix form  at the band-edge frequency for the device in Fig. S3 but having a finite width  = 100 W m . As depicted in Fig. S4(a), this device is truncated in-plane, having radiative boundaries. The absolute value of  is depicted in Fig. S4(b) [while the argument of  is shown in Fig. S4  In case (ii), we wish to capture infinite devices, in which case we should expect the deviations from a plane wave should vanish. In the periodic case we require that the energy be correlated according to contributions from every period of the device. For instance, the left edge of the period (near − /2 W ) must be closely correlated to the right edge of the period (near /2 W ), which is not captured in the above procedure or the matrix in Fig. S4(b).
Instead, we must alter the construction of  in order to include contributions from every period. For instance, for normally incident light, we sum instances of the nonlocal term shifted by integer multiples of the period W of the device:  Fig. S4(f), where we see that the main diagonal appears to 'wrap', accounting for the periodicity. Finally, Fig. S4(h) shows the normalized eigen-wave for this case, which is a plane wave with eigenvalue is Lastly, since the boundary conditions must be consistent with both the device and the illumination, for light incident with momentum  k we include the Bloch wave in each period of the summation: However, the mixed Fourier transform gives the scattering matrices as a function of both  k and k , allowing us to simply use the case without the Bloch wave seen in Eqns. (103).

S9. 2D nonlocal metasurfaces
In this section, we briefly demonstrate the extension of the STCMT to two-dimensional (2D) metasurfaces. This required extending the nonlocal kernels to be a function of both As an example, Fig. S4 For simplicity, we assume that the band structure is completely isotropic. Figure S4

S10. Nonlocal metalenses off the band edge frequency
Here, we extend our study of the nonlocal metalenses to frequencies off the band edge. We begin by computing the principle eigen-wave at three frequencies and studying how they propagate. Figure S6(a-c) shows the intensity of the reflected eigen-wave at the band edge The band-edge eigen-wave refocuses to a point =− zf , as expected, but at the two other frequencies, we see that this focal spot as split into two comatic focuses off the optical axis, where the split is larger for the case that is further off the band edge. In parallel, we compute the reflectance as a function of ( ) 00 , xz at the same three frequencies, and comparing the results [Figs. S6(d-f)] shows that the strong relationship between the eigen-wave and spatial selectivity extends off the optical axis and off the band edge. We also see that the selectivity increases drastically off the band edge, consistent with the fact that the nonlocality length increases off the band edge (due to reduced Bragg scattering).
To understand this behavior further, we study the reflectance spectra due to a modified From these results, we see that the q-BIC off the band edge is selective for waves having the characteristics of the eigen-wave at the band edge, but modified by the arbitrary wave by the appropriate sum of the inner products of the eigenvectors with the incident wave, weighted by the appropriate eigenvalues. With this perspective and in light of Fig. S7, it is immediately clear why the eigen-wave characterizes the spatial selectivity: only incident waves with large inner products (i.e., high degree of match) with the principle eigen-wave have substantial reflectance. Figure S7. Sorted eigenvalues j R for a nonlocal metalens with three different values of b , showing increased selectivity (i.e., an effectively smaller basis) to incoming waves as b increases.

S13. Thermal metasurfaces off the band-edge frequency
Here, we apply STCMT to study thermal metalenses as a function of frequency. Figure S8 depicts the computed response for three metalenses with varying Q-factors,  Figures S89(b,d,f) show spatial absorption maps at example wavelengths of interest, showing the splitting of the focal spot into two, as is Fig. S6. Interestingly, as the Q-factor grows, the off-band-edge 82 absorption diminishes rapidly. This implies having a high Q-factor not only decreases the FWHM at the band-edge frequency, but also limits the range of frequencies away from the band-edge contributing meaningful absorption.

S14. Applicability to local and nonlocal metasurface design
We identify a few key lessons applicable to the broader field of metasurfaces as follows. (i) The nonlocal kernels contain all the information of the functionality of the device, described in the space-frequency domain; (ii) the scattering matrix defined by input and output momenta contains this same information, related to the nonlocal kernels by a mixed Fourier transform; (iii) the nonlocal kernels of an aperiodic device may be approximately constructed by reference to a library of nonlocal kernels of individual structures; and (iv) the scattering matrix of the composite device is then retrievable by a suitable mixed Fourier transformation.
Additionally, we identify the lessons particular to aperiodic nonlocal metasurfaces as: (i) the nonlocal phase gradient requires vertical asymmetry in the case of unity efficiency, imparting equal and opposite momentum from above or below; (ii) the eigen-waves of the device are composed of a superposition of two waves that reflect into each upon interaction with a nonlocal metasurface; (iii) the eigen-waves closely characterize the point response function of the device, and thereby the spatial selectivity; (iv) the degree of nonlocality increases with lifetime and decreases as the band becomes flatter; (v) the spatial selectivity depends on the degree of nonlocality and the range of momenta encoded into the device; and (vi) a nonlocal metasurface may be classified as either wavefront-shaping or wavefrontselective.

S15. Limitations of the present study
Here, we discuss some limitations in these initial studies. First, we note that our analytical study, and indeed all the devices in Refs. [2], was limited to when the local reflection coefficient is near 0. This choice follows the finding in Refs. [3], [4] that this condition yields complete control over the q-BIC scattering phase. Future work is needed to go beyond this condition and will be aided by the insights of the STCMT. Second, we note that the parabolic approximation of the band structure fails when the spatial frequencies become too high, implying either more terms are necessary in the Taylor expansion and likely requiring purely numerical approaches. Third, we assumed throughout that the eigenpolarization is independent of incident angle; yet studies on BICs have demonstrated polarization vortices in the momentum-frequency domain [8]. These systems may be captured by modeling the polarization dependence and including its variance in the Fourier transforms. Fourth, we assumed that the local Fresnel coefficient was independent of incident angle (i.e., the reflection coefficient is 0 for all spatial frequencies); yet it is well known that all interfaces have unity reflectance at grazing incidence. The consequence of this assumption is the absence of highly localizes features such as those seen in Fig. S1 for a bare interface, and this simplification yielded the convenient and insightful analytical relations we found. However, if the nonlocality length is comparable or smaller than the characteristic extent of these features from the local scattering, and if we operate near grazing incidence, our approximation is invalid, and we must return to numerical methods. Finally, we remind the reader of the spurious numerical results born of sparse discretization, implying that the bandwidth of the STCMT is not unlimited and increasing it requires increasing the resolution and therefore size of the nonlocal matrices.