Single-shot ultrafast imaging attaining 70 trillion frames per second

Real-time imaging of countless femtosecond dynamics requires extreme speeds orders of magnitude beyond the limits of electronic sensors. Existing femtosecond imaging modalities either require event repetition or provide single-shot acquisition with no more than $10^{13}$ frames per second (fps) and $3 \times 10^{2}$ frames. Here, we report compressed ultrafast spectral photography (CUSP), which attains several new records in single-shot multi-dimensional imaging speeds. In active mode, CUSP achieves both $7 \times 10^{13}$ fps and $10^{3}$ frames simultaneously by synergizing spectral encoding, pulse splitting, temporal shearing, and compressed sensing, enabling unprecedented quantitative imaging of rapid nonlinear light-matter interaction. In passive mode, CUSP provides four-dimensional (4D) spectral imaging at $0.5 \times 10^{12}$ fps, allowing the first single-shot spectrally resolved fluorescence lifetime imaging microscopy (SR-FLIM). As a real-time multi-dimensional imaging technology with the highest speeds and most frames, CUSP is envisioned to play instrumental roles in numerous pivotal scientific studies without the need for event repetition.

Cameras' imaging speeds fundamentally limit humans' capability in discerning the physical world. Over the past decades, imaging technologies based on silicon sensors, such as CCD and CMOS, have been extensively improved to offer imaging speeds up to millions of frames per second (fps) [1]. However, they fall short in capturing a rich variety of extremely fast phenomena, such as ultrashort light propagation [2], radiative decay of molecules [3], soliton formation [4], shock wave propagation [5], nuclear fusion [6], photon transport in diffusive media [7], and morphologic transients in condensed matters [8]. Successful studies into these phenomena lay the foundations for modern physics, biology, chemistry, material science, and engineering. To observe these events, a frame rate well beyond a billion fps or even a trillion fps (Tfps) is required. Currently, the most widely implemented method is to trigger the desired event numerous times while observing it through a narrow time window at different time delays, which is termed the pump-probe method [9,10]. Unfortunately, it is unable to record the event in real time and is thus only applicable to phenomena that are highly repeatable. Here, real-time imaging is defined as multi-dimensional observation at the same time as the event occurs without event repetition. It has been a long-standing challenge for researchers to invent real-time ultrafast cameras [11].
Recently, a handful of groups presented several exciting single-shot trillion-fps imaging modalities, including sequentially-timed all-optical mapping photography [12-14], frequency-dividing imaging [15], non-collinear optical parametric amplifier [16], frequency-domain streak imaging [17], and compressed ultrafast photography (CUP) [18,19]. Nevertheless, none of them has imaging speeds beyond 10 Tfps. In addition, the first three methods have their sequence depths (i.e., the number of captured frames in each acquisition) limited to <10 frames because the complexity of their systems grows proportionally to the sequence depth. One promising approach is CUP [19], which combines a streak camera with compressed sensing [18]. A standard streak camera, which has a narrow entrance slit, is a one-dimensional (1D) ultrafast imaging device [20] that first converts photons to photoelectrons, then temporally shears the electrons by a fast sweeping voltage, and finally converts electrons back to photons before they are recorded by an internal camera (see the "Methods" section and Supplementary Fig. 1). In CUP, imaging two-dimensional (2D) transient events is enabled by a wide-open entrance slit and a scheme of 2D spatial encoding combined with temporal compression [19]. Unfortunately, CUP's frame rate relies on the streak camera's capability in deflecting electrons, and its sequence depth (300 frames) is tightly constrained by the number of sensor pixels in the shearing direction.
Here, we present CUSP as the fastest real-time imaging modality with the largest sequence depth, overcoming these barriers with the introduction of multiple advanced concepts. It breaks the limitation in speed by employing spectral dispersion in the direction orthogonal to temporal shearing, extending to spectrotemporal compression. Furthermore, CUSP sets a new milestone in sequence depth by exploiting pulse splitting. We experimentally demonstrated 70-Tfps real-time imaging of a spatiotemporally chirped pulse train and ultrashort light pulse propagation inside a solid nonlinear Kerr medium. With minimal modifications, CUSP can function as the fastest single-shot 4D spectral imager [i.e., $(x, y, t, \lambda)$ information], empowering single-shot spectrally resolved fluorescence lifetime imaging microscopy (SR-FLIM). We monitored the spectral evolution of fluorescence in real time and studied the unusual relation between fluorophore concentration and lifetime at high concentration.

Results
Principles of CUSP. The CUSP system (Fig. 1a) consists of an imaging section and an illumination section. It can work in either active or passive mode, depending on whether a specially engineered illumination beam is required for imaging [12-19]. The imaging section is shared by both modes, while the illumination section is for active mode only. In the imaging section, after a dynamic scene $I(x, y, t, \lambda)$ is imaged by an interchangeable lens system, the light path is split into two. In one path, an external camera captures a time-unsheared spectrum-undispersed image (defined as u-View). In the other path, the image is encoded by a digital micromirror device (DMD), displaying a static pseudo-random binary pattern, and is then relayed to the fully opened entrance port of a streak camera (see the "Methods" section). Spatial encoding by either pseudo-random [14,18,19,21] or designed [22-24] patterns is a technique extensively applied in compressed sensing. A diffraction grating, inserted in front of the streak camera, spectrally disperses the scene in the horizontal direction (see Fig. 1b). After being detected by the streak camera's photocathode, the spatially encoded and spectrally dispersed image first experiences temporal shearing in the vertical direction inside the streak tube and then spatiotemporal-spectrotemporal integration by an internal camera (see Fig. 1c). The streak camera, at the end, acquires a time-sheared spectrum-dispersed image (defined as s-View). See Supplementary Notes 1 and 2 for the characterizations of the streak camera and the imaging section, respectively. Retrieving $I$ from the raw images in u-View and s-View is an under-sampled inverse problem. Fortunately, the encoded recording allows us to reconstruct the scene by solving the minimization problem aided by regularization (detailed in the "Methods" section) [18,19].
In active mode, we encode time into spectrum via the illumination section, which first converts a broadband femtosecond pulse to a pulse train with neighboring sub-pulses separated by time $t_{sp}$, using a pair of high-reflectivity beamsplitters. In the following step, the pulse train is sent through a homogeneous glass rod to temporally stretch and chirp each sub-pulse. Since this chirp is linear, each wavelength in the pulse bandwidth carries a specific time fingerprint. Thereby, this pulse train is sequentially timed by $t(p, \lambda) = p\,t_{sp} + \eta(\lambda - \lambda_0)$, where $p = 0, 1, 2, \ldots, (P - 1)$ represents the sub-pulse sequence, $\eta$ is the overall chirp parameter, and $\lambda_0$ is the minimum wavelength in the pulse bandwidth. This timed pulse train then illuminates a dynamic scene $I(x, y, t) = I(x, y, t(p, \lambda))$, which is subsequently acquired by the imaging section. See Supplementary Note 3 for experimental details on the illumination section.
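As an illustration of this time-to-wavelength mapping, the short sketch below evaluates $t(p, \lambda)$ for a few sub-pulses. It assumes the values quoted for the first demonstration ($t_{sp}$ = 2 ps, $\eta$ = 52.6 fs nm$^{-1}$, and a used bandwidth starting at 785 nm); the function and constant names are hypothetical and not part of the CUSP software.

```python
# Illustrative sketch of t(p, lambda) = p * t_sp + eta * (lambda - lambda_0).
# Parameter values are those quoted for the first demonstration; the function
# name and module-level constants are hypothetical.

T_SP_FS = 2000.0       # sub-pulse separation t_sp (2 ps), in fs
ETA_FS_PER_NM = 52.6   # overall chirp parameter eta
LAMBDA_0_NM = 785.0    # shortest wavelength of the used bandwidth


def time_stamp_fs(p, wavelength_nm):
    """Time (fs) encoded by `wavelength_nm` within sub-pulse number `p`."""
    return p * T_SP_FS + ETA_FS_PER_NM * (wavelength_nm - LAMBDA_0_NM)


if __name__ == "__main__":
    for p in range(3):
        for wavelength in (785.0, 804.0, 823.0):
            t = time_stamp_fs(p, wavelength)
            print(f"p={p}  lambda={wavelength:.0f} nm  t={t:.1f} fs")
```

With these values, the 38-nm bandwidth spans 52.6 × 38 ≈ 1999 fs, so consecutive sub-pulses tile the time axis almost without gaps.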
In active CUSP, the imaging frame rate is determined by $R_a = |\mu| / (|\eta| d)$, where $\mu$ is the spectral dispersion parameter of the system, and $d$ is the streak camera's pixel size. The sequence depth is $N_{ta} = P B_i |\mu| / d$, where $P$ is the number of sub-pulses, and $B_i$ is the used spectral bandwidth of the illuminating light pulse (785 nm to 823 nm). Passive CUSP does not rely on engineered illumination, and therefore, it is well suited to image various luminescent objects [11]. The independence between $t$ and $\lambda$ allows a 4D transient, $I(x, y, t, \lambda)$, to be imaged using the same algorithm without translating wavelength into time. Instead of extracting $P$ discrete sub-pulses, passive CUSP needs to resolve $N_{tp}$ consecutive frames with a frame rate of $R_p = v/d$, where $v$ is the sweeping speed of the streak tube. This calculation is based on the fact that the scene is sheared by one pixel per frame in the vertical direction ($y_s$) so that the time interval between adjacent frames is $d/v$. The number of sampled wavelengths is $N_\lambda = B_e |\mu| / d$, where $B_e$ is the bandwidth of the emission spectrum from the object. Supplementary Notes 4 and 5 contain additional information on the data acquisition model and reconstruction algorithm.
Imaging an ultrafast linear optical phenomenon. Simultaneous characterization of an ultrashort light pulse spatially, temporally, and spectrally is essential for studies on laser dynamics [4] and multimode fibers [25]. Here, in the first demonstration, we created a spatially and temporally chirped pulse by a grating pair (Fig. 2a). Negative and positive temporal chirps from a 270-mm-long glass rod and the grating pair, respectively, were carefully balanced so that the combined temporal spread $t_d$ was close to the sub-pulse separation $t_{sp} = 2$ ps (detailed in Supplementary Note 6). The pulse train irradiated a sample of printed letters, which served as the dynamic scene (see its location in Fig. 1a). Exemplary frames from CUSP reconstruction with a field of view (FOV) of 12.13 mm × 9.03 mm are summarized in Fig. 2b. See the full movie in Supplementary Movie 1. It shows that each sub-pulse swiftly sweeps across the letters from left to right. Due to spatial chirping by the grating pair, the illumination wavelength also changes from short to long over time. The normalized light intensity at a selected spatial point, plotted in Fig. 2c, contains five peaks that correspond to five sub-pulses. Each peak represents one temporal point-spread-function (PSF) of the active CUSP system. The peaks have an average full width at half maximum (FWHM) of 240 fs, corresponding to 4.5 nm in the spectrum domain (i.e., spectral resolution of active CUSP). Fourier transforming the intensity in the spectrum domain to the time domain gives a pulse with a FWHM of 207 fs, indicating that our system operates at the optimal condition bounded by the time-bandwidth limit and temporal chirp [13]. In the high-spectral-resolution regime, the Fourier-transformation relation between pulse bandwidth and duration dominates in determining temporal resolution, and thus a finer spectral resolution broadens the temporal PSF. In the low-spectral-resolution regime, by contrast, temporal chirp takes over such that a poorer spectral resolution leads to a broader temporal PSF as well. Hence, there exists an optimal spectral resolution that enables the best temporal resolution [26].
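The existence of such an optimum can be illustrated with a simple model. The sketch below is not the analysis used in this work: it assumes Gaussian pulses (time-bandwidth product 0.441), a center wavelength of 800 nm, and that the chirp-limited and transform-limited widths add in quadrature. All function names are hypothetical.

```python
import math

# Simplified, assumed model: temporal PSF = quadrature sum of a chirp-limited
# term (eta * d_lambda) and a transform-limited term (0.441 / d_nu).
C_NM_PER_FS = 299.792458   # speed of light in nm per fs
ETA_FS_PER_NM = 52.6       # chirp parameter quoted in the paper
CENTER_NM = 800.0          # assumed center wavelength
TBP_GAUSSIAN = 0.441       # FWHM time-bandwidth product of a Gaussian pulse


def chirp_term_fs(resolution_nm):
    """Temporal spread caused by chirp across one spectral resolution element."""
    return ETA_FS_PER_NM * resolution_nm


def transform_term_fs(resolution_nm):
    """Transform-limited duration for a bandwidth equal to the spectral resolution."""
    bandwidth_per_fs = C_NM_PER_FS * resolution_nm / CENTER_NM ** 2
    return TBP_GAUSSIAN / bandwidth_per_fs


def temporal_psf_fs(resolution_nm):
    """Quadrature sum of both contributions (a modelling assumption)."""
    return math.hypot(chirp_term_fs(resolution_nm), transform_term_fs(resolution_nm))


def optimal_resolution_nm():
    """Spectral resolution at which the two terms are equal, minimizing the sum."""
    return math.sqrt(TBP_GAUSSIAN * CENTER_NM ** 2 / (C_NM_PER_FS * ETA_FS_PER_NM))


if __name__ == "__main__":
    best = optimal_resolution_nm()
    print(f"optimal spectral resolution ~ {best:.2f} nm")
    print(f"temporal PSF at optimum ~ {temporal_psf_fs(best):.0f} fs")
```

Under these assumptions the optimum falls near 4.2 nm, close to the measured 4.5-nm spectral resolution, and a 4.5-nm bandwidth corresponds to a transform-limited duration of about 209 fs, in line with the 207 fs quoted above.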
Using a dispersion parameter $\mu = 23.5$ μm nm$^{-1}$, a chirp parameter $\eta = 52.6$ fs nm$^{-1}$, and a pixel size $d = 6.45$ μm, our active CUSP offers a frame rate as high as 70 Tfps. A control experiment imaged the same scene using the state-of-the-art trillion-frame-per-second CUP (T-CUP) technique with 10 Tfps [18] (see Supplementary Movie 1). Our system design allows flexible and reliable transitions between CUSP and T-CUP, as explained in Supplementary Note 7 and Supplementary Fig. 10. T-CUP's reconstructed intensity evolution at the same point exhibits a temporal spread 3.2× wider than that of CUSP. In addition, within any time window, CUSP achieves a 7× increase in the number of frames compared with T-CUP (see the blue solid lines in the inset of Fig. 2c). Thus, CUSP surpasses the currently fastest single-shot imaging modality in terms of both temporal resolution and sequence depth. Figure 2d plots the reconstructed total light intensities of the five sub-pulses versus the illumination wavelength. Their profiles are close to the ground truth measured by a spectrometer.
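The quoted frame rate follows directly from $R_a = |\mu| / (|\eta| d)$. The sketch below plugs in the parameters stated above, together with the 785 nm to 823 nm bandwidth, to estimate the frame rate and the number of frames encoded per sub-pulse; the function names are hypothetical.

```python
# Worked check of R_a = |mu| / (|eta| * d) and of the frames per sub-pulse,
# B_i * |mu| / d, using the parameter values quoted in the text.
MU_UM_PER_NM = 23.5            # spectral dispersion parameter mu
ETA_FS_PER_NM = 52.6           # chirp parameter eta
PIXEL_UM = 6.45                # streak camera pixel size d
BANDWIDTH_NM = 823.0 - 785.0   # used illumination bandwidth B_i


def active_frame_rate_fps(mu_um_per_nm, eta_fs_per_nm, pixel_um):
    """Frame rate in frames per second."""
    eta_s_per_nm = eta_fs_per_nm * 1e-15
    return abs(mu_um_per_nm) / (abs(eta_s_per_nm) * pixel_um)


def frames_per_subpulse(bandwidth_nm, mu_um_per_nm, pixel_um):
    """Number of wavelength (and hence time) samples within one sub-pulse."""
    return bandwidth_nm * abs(mu_um_per_nm) / pixel_um


if __name__ == "__main__":
    rate = active_frame_rate_fps(MU_UM_PER_NM, ETA_FS_PER_NM, PIXEL_UM)
    frames = frames_per_subpulse(BANDWIDTH_NM, MU_UM_PER_NM, PIXEL_UM)
    print(f"R_a ~ {rate / 1e12:.1f} Tfps, frames per sub-pulse ~ {frames:.0f}")
```

These inputs give roughly 69 Tfps and about 138 frames per sub-pulse, consistent with the rounded figures of 70 Tfps and 140 frames used throughout the paper.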
Imaging an ultrafast nonlinear optical phenomenon. Nonlinear light-matter interactions are indispensable in optical communications [27] and quantum optics [28]. Optical-field-induced birefringence, as a result of third-order nonlinearity, has been widely utilized in mode-locked lasers [29] and ultrafast imaging [30,31]. Here, in our second demonstration, we focused a single 48-fs laser pulse (referred to as the 'gate' pulse), centered at 800 nm and linearly polarized along the $y$ direction, into a Bi$_4$Ge$_3$O$_{12}$ (BGO) slab to induce transient birefringence, as schematically illustrated in Fig. 3a. A second beam (referred to as the 'detection' pulse), a temporally chirped pulse train from the illumination section of the CUSP system, was incident on the slab from an orthogonal direction, going through a pair of linear polarizers that sandwich the BGO. This is a Kerr gate setup since the two polarizers have polarization axes aligned at +45° and −45°, respectively [31,32]. The Kerr gate has a finite transmittance of $T_{Kerr} = (1 - \cos\varphi)/2$ only where the gate pulse travels. Here, $\varphi$, proportional to the gate pulse intensity, represents the gate-induced phase delay between the two orthogonal polarization directions $x$ and $y$ (see Supplementary Note 8 for its definition and measurement).
CUSP imaged the gate pulse, with a peak power density of $5.6 \times 10^{14}$ mW cm$^{-2}$ at its focus, propagating in the BGO slab. In the first and second experiments, the gate focus was outside and inside the FOV (2.48 mm × 0.76 mm in size), respectively. Figures 3b and c contain 3D visualizations of the reconstructed dynamics, which are also shown in Supplementary Movie 2. Snapshots are shown in Figs. 3d and e. As the gate pulse travels and focuses, the accumulated phase delay $\varphi$ increases, and therefore $T_{Kerr}$ becomes larger. The centroid positions of the gate pulse (i.e., the transmission region in the Kerr medium) along the horizontal axis $x$ versus time $t$ are plotted at the bottom of Figs. 3b and c, matching well with the theoretical estimation based on a refractive index of 2.07. Note that seven sub-pulses were included in the illumination to provide a 14-ps-long observation window and capture a total of 980 frames with an imaging speed of 70 Tfps. Based on the definition of $N_{ta}$, each 2-ps sub-pulse encodes 140 frames in its spectrum, and seven sub-pulses arranged in sequence offer 980 frames in total.
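The frame count and the expected pulse displacement can be cross-checked with a few lines of arithmetic. The sketch below assumes that the gate pulse travels at $c/n$ with the quoted index $n$ = 2.07 (the text does not state whether a phase or group index underlies the theoretical estimate); the names are hypothetical.

```python
# Arithmetic check of the observation window and of the expected travel
# distance of the gate pulse, assuming propagation at c / n.
C_MM_PER_PS = 0.299792458     # speed of light in mm per ps
N_BGO = 2.07                  # refractive index quoted in the text
SUB_PULSES = 7                # number of sub-pulses P
FRAMES_PER_SUB_PULSE = 140    # frames encoded in each sub-pulse
SUB_PULSE_SPAN_PS = 2.0       # duration covered by each sub-pulse


def observation_summary():
    """Return (total frames, window in ps, frame rate in Tfps)."""
    frames = SUB_PULSES * FRAMES_PER_SUB_PULSE
    window_ps = SUB_PULSES * SUB_PULSE_SPAN_PS
    rate_tfps = frames / window_ps   # frames per ps is numerically Tfps
    return frames, window_ps, rate_tfps


def centroid_travel_mm(elapsed_ps, refractive_index=N_BGO):
    """Distance covered by the pulse centroid after `elapsed_ps`."""
    return C_MM_PER_PS / refractive_index * elapsed_ps


if __name__ == "__main__":
    print(observation_summary())
    print(f"travel in 14 ps ~ {centroid_travel_mm(14.0):.2f} mm")
```

Under this assumption the pulse covers about 2.03 mm in 14 ps, which fits within the 2.48-mm-wide FOV.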
Re-distribution of electronic charges in BGO lattices driven by an intense light pulse, like in other solids, serves as the dominant mechanism underlying the transient birefringence [30,33], which is much faster than re-orientation of anisotropic molecules in liquids, such as carbon disulfide [14,15]. To study this ultrafast process, one spatial location from Fig. 3d is chosen to show its locally normalized transmittance evolution (Fig. 3f). Its FWHM of 455 fs yields an estimated relaxation time of ~380 fs after deconvolution from the temporal PSF (Fig. 2c). This result is close to the BGO's relaxation time reported in the literature [33]. Note that T-CUP fails in quantifying this process due to its insufficient temporal resolution (see Supplementary Movie 3 and Supplementary Note 7). In stark contrast with the well-established pump-probe method, CUSP requires only one single laser pulse to observe the entire time course of its interaction with the material in 2D space. As shown in Supplementary Note 8, the Kerr gate in our experiment was designed to be highly sensitive to random fluctuations in the gate pulse intensity, a sensitivity that arises from the nonlinear relation between the Kerr gate transmittance and the gate pulse intensity. The experimental comparison in Supplementary Movie 3 reveals that the pump-probe measurement flickers conspicuously, due to shot-to-shot variations, while CUSP exhibits a smooth transmittance evolution, owing to single-shot acquisition. Supplementary Fig. 12 shows that the fractional fluctuation in intensity is amplified 11 times in transmittance. As detailed in Supplementary Note 8, the pump-probe method would require $>10^{5}$ image acquisitions to capture the dynamics in Fig. 3b with the same stability as CUSP's.
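The deconvolution step can be approximated by a quadrature subtraction. The sketch below assumes that both the measured transmittance trace and the temporal PSF are Gaussian, which may differ from the deconvolution actually used in this work; the function name is hypothetical.

```python
import math


def deconvolved_fwhm_fs(measured_fwhm_fs, psf_fwhm_fs):
    """Remove a Gaussian PSF from a Gaussian measurement by quadrature subtraction."""
    if measured_fwhm_fs < psf_fwhm_fs:
        raise ValueError("measured width must not be smaller than the PSF width")
    return math.sqrt(measured_fwhm_fs ** 2 - psf_fwhm_fs ** 2)


if __name__ == "__main__":
    # 455 fs measured FWHM (Fig. 3f) and 240 fs temporal PSF (Fig. 2c).
    print(f"{deconvolved_fwhm_fs(455.0, 240.0):.0f} fs")
```

Under the Gaussian assumption the result is about 387 fs, in line with the ~380 fs quoted above.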
SR-FLIM. One application of passive CUSP is SR-FLIM. Both the emission spectrum and lifetime are important properties of molecules, which have been broadly exploited by biologists and material scientists to investigate a variety of biological processes [34] and material characteristics [35]. Over the past decades, time-correlated single photon counting (TCSPC) has been the gold standard [36,37]. However, TCSPC typically takes tens of milliseconds to even seconds to acquire one dataset, since it depends on repeated measurements. To our knowledge, single-shot SR-FLIM has not been reported so far.
Our experimental implementation, illustrated in Fig. 4a, is a fluorescence microscope interfaced with the imaging section of the CUSP system (detailed in the "Methods" section and Supplementary Note 9). This system provides a spectral resolution of 13 nm over the 200-nm bandwidth. A single 532-nm picosecond pulse was deployed to excite fluorescence from the sample of Rhodamine 6G dye (Rh6G) in methanol. Three Rh6G concentrations (22, 36, and 40 mM) with three different spatial patterns were imaged and reconstructed at 0.5 Tfps. The final data has an FOV of 180 μm × 180 μm and contains $N_{tp} = 400$ frames over an exposure time of 0.8 ns and $N_\lambda = 100$ wavelength samples. Supplementary Movie 4 shows the reconstructed light intensity evolutions in three dimensions (2D space and 1D spectrum). Fluorescence lifetime can be readily extracted by single-exponential fitting. Figures 4b-d summarize the spatio-spectral distributions of lifetimes. Rh6G with a higher concentration has a shorter lifetime due to increased pathways for non-radiative relaxation [38]. The spatial intensity distributions (insets of Figs. 4b-d) show well-preserved spatial resolutions. Figure 4e plots the intensity distribution of the 22-mM sample in the $t$-$\lambda$ space, clearly revealing that the emission peaks at ~570 nm and decays rapidly after excitation. Finally, we show in Fig. 4f that lifetimes remain relatively constant over the entire emission spectra and exhibit minute variations over the spatial domain. These uniform spatial distributions are also confirmed by the spectrally averaged lifetime maps in Fig. 4g. See Supplementary Fig. 14 for more quantitative results from our SR-FLIM.
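Single-exponential fitting of one pixel's decay trace can be sketched as a straight-line fit to the logarithm of the intensity. The example below is a minimal illustration on synthetic, noise-free data with a hypothetical 300-ps lifetime; it is not the fitting routine used in this work, and only the 2-ps frame interval (0.5 Tfps) and the 400-frame length are taken from the text.

```python
import math


def fit_lifetime_ps(times_ps, intensities):
    """Estimate tau from I(t) = A * exp(-t / tau) via least squares on log(I)."""
    points = [(t, math.log(i)) for t, i in zip(times_ps, intensities) if i > 0]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_tt = sum((t - mean_t) ** 2 for t, _ in points)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = s_ty / s_tt          # slope of log-intensity equals -1 / tau
    return -1.0 / slope


FRAME_INTERVAL_PS = 2.0          # 0.5 Tfps corresponds to 2 ps per frame
TRUE_TAU_PS = 300.0              # hypothetical lifetime for the synthetic trace
times = [k * FRAME_INTERVAL_PS for k in range(400)]
decay = [math.exp(-t / TRUE_TAU_PS) for t in times]
estimated_tau_ps = fit_lifetime_ps(times, decay)

if __name__ == "__main__":
    print(f"estimated lifetime: {estimated_tau_ps:.1f} ps")
```

With real, noisy data a weighted or nonlinear fit would usually be preferable, since the logarithm amplifies noise in the low-intensity tail.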
Contrary to the common observation that fluorescence lifetime is independent of concentration [3,39], our experiments demonstrate that lifetime can actually decrease with an increased concentration. Such a phenomenon was also observed in a previous study [38], where it was attributed to increased non-radiative relaxation. As the sample becomes highly concentrated, the fluorophores tend to stay at the excited state for a shorter time since the formations of dimers and aggregates increase pathways for relaxation (see Supplementary Note 9) [38]. To directly characterize our samples' lifetimes, we conducted a reference experiment using traditional streak camera imaging [40-42]. A uniform sample is projected onto the streak camera with a narrow entrance slit, and then the lifetime is readily extracted from the temporal trace of the emission decay in the streak image. The results are plotted in Supplementary Figs. 14d and e. The difference in lifetime measurements between CUSP and the reference experiment is only 10 ps on average, much less than the lifetime.

Discussion
CUSP's superior real-time imaging speed of 70 Tfps in active mode is three orders of magnitude greater than the physical limit of semiconductor sensors [43]. Owing to this new speed, CUSP can quantify physical phenomena that are inaccessible using the previous record-holding system (see Supplementary Movie 3). Moreover, active CUSP captures data more than $10^{5}$ times faster than the pump-probe approach. When switching CUSP to passive mode for single-shot SR-FLIM, the total exposure time for one acquisition (<1 ns) is more than $10^{7}$ times shorter than that of TCSPC [36,37]. Additionally, CUSP is to date the only single-shot ultrafast imaging technique that can operate in either active or passive mode. As a generic hybrid imaging tool, CUSP's scope of application far exceeds the demonstrations above. The imaging speed and sequence depth are highly scalable via parameter tuning. CUSP's spectral coverage can extend from X-ray to NIR [20], and even to matter waves such as electron beams [24], given the availability of sources and sensing devices. In addition, CUSP is advantageous in photon throughput, compared with existing ultrafast imaging technologies [11]. Both the pump-probe and TCSPC methods require event repetition. Consequently, these techniques are not only slower than CUSP by orders of magnitude, as mentioned above, but are also inapplicable in imaging the following classes of phenomena: (1) high-energy radiations that cannot be readily pumped such as annihilation radiation (basis for PET) [44], Cherenkov radiation [45], and nuclear reaction radiation [6]; (2) self-luminescent phenomena that occur randomly in nature, such as sonoluminescence in snapping shrimps [46]; (3) astronomical events that are light-years away [44,47]; and (4) chaotic dynamics that cannot be repeated [48,49]. Yet, CUSP can, in principle, observe all of these phenomena. For randomly occurring phenomena, the beginning of the signal can be used to trigger CUSP. Large amounts of theoretical and technical efforts are required before CUSP can be widely adopted for these applications.
A trade-off between imaging speed and recording time is always found in single-shot ultrafast imaging modalities [7,8,12-19,24-26]. Due to the vast discrepancies in imaging speeds, it is inconvenient to compare different approaches using recording time. Therefore, we introduced a parameter for fair comparison: sequence depth [11], which is independent of imaging speeds. In CUSP with a fixed hardware configuration, the maximum sequence depth is solely determined by the FOV [see Equations (5) and (6) in the "Methods" section]. In a practical setting, active CUSP can acquire more than $10^{3}$ frames in one snapshot, which is several times [7,18,19] or even orders of magnitude [8,12-17] higher than those achievable by the state-of-the-art techniques (see the "Methods" section). Additionally, this interplay between sequence depth and FOV limits the maximum FOVs for scenarios where the desired sequence depths are given [see Equations (7) and (8) in the "Methods" section].
Compressed-sensing-enabled single-shot imaging typically has to pay the penalty of compromised spatial resolutions [14,18,21,22,50] that are caused by the multiplexing of multi-dimensional information and finite encoding pixel size. A characterization experiment (see Supplementary Note 10 and Supplementary Fig. 15) suggests a 2-3× degradation in spatial resolutions in CUSP. Advanced concepts, such as lossless encoding [7] and multi-view projection [51], are promising in compensating for this resolution loss. The success of compressed sensing relies on the premise that the unknown object is sparse in some space. For CUSP, this precondition is justified via calculations in Supplementary Table 1. All the scenes in this work have data sparsity >90%, which is sufficient to lead to satisfactory reconstructions, according to a former study [52]. This restriction on sparsity may be alleviated by optimizing the encoding mask [23,53] or the regularizer [54] in Equation (2).
CUSP's temporal resolution in active mode is essentially time-bandwidth limited [13]. Alternative encoding mechanisms, such as polarization encoding, may be explored to break this barrier and boost the imaging speed to the $10^{15}$-fps regime. The imaging speed of passive CUSP is confined by the streak camera. In the past, streak camera technology has been a powerful workhorse in science and engineering [20,40-42,55]. However, it suffers from low quantum efficiency, the space-charge effect, and intrinsic electronic jitter. Most importantly, any boost in its imaging speed is at the mercy of the development of faster electronics. We envision that optical shearing that avoids the complex photon-electron conversions could bring forth the next leap in ultrafast imaging. Recent progress in machine learning may facilitate image reconstruction [56,57]. All of these directions represent only a small portion of the overall efforts in pushing the boundaries of ultrafast imaging technologies.

Methods
Setups and samples. In the imaging section of the active CUSP system (Fig. 1a), the light path of s-View starts by routing the intermediate image, formed by the interchangeable imaging system, to the DMD by a 4f imaging system. A static pseudo-random binary pattern with a non-zero filling ratio of 35% is displayed on the DMD. The encoded dynamic scene is then relayed to the entrance port of the streak camera via the same 4f system. In order to maximize photon utilization efficiency of s-View, a dual-projection scheme was adopted instead in the passive CUSP system for SR-FLIM (see Supplementary Note 9 for details). In the experiment in Fig. 2a, the group of letters was printed on a piece of transparency film to impart complex spatial features. The BGO slab (MTI, BGO12b101005S2) used in the experiments in Fig. 3 has a thickness of 0.5 mm. Its edges were delicately polished by a series of polishing films (Thorlabs) to ensure minimal light scattering for optimal coupling of the gate pulse. In the fluorescence microscopy setup in Fig. 4a, a single excitation pulse was focused on the back focal plane of an objective to provide wide-field illumination in the FOV. The Rh6G solution was masked by a negative USAF target, placed at the sample plane. A dichroic mirror and a long-pass filter effectively blocked stray excitation light. After a tube lens, the image was directed into the imaging section of the passive CUSP system.
The imaging section in Fig. 1a includes a 50/50 non-polarizing beamsplitter (Thorlabs, BS014), a DMD (Texas Instruments, LightCrafter 3000), an external CCD camera (Point Grey, GS3-U3-28S4M), 150-mm-focal-length lenses (Thorlabs, AC254-150-B), a 300 lp mm$^{-1}$ transmissive diffraction grating (Thorlabs, GTI25-03), and a streak camera (Hamamatsu, C6138). In the illumination section, a 270-mm-long N-SF11 glass rod (Newlight Photonics, two SF11L1100-AR800, SF11G1500-AR800, combined with SF11G1200-AR800) was used for the experiment in Fig. 2, while a 95-mm-long N-SF11 glass rod (Newlight Photonics, SF11G1500-AR800, SF11G1400-AR800, combined with SF11G1050-AR800) was used for the experiments in Fig. 3. A femtosecond laser (Coherent, Libra HE) was used as the light source in active CUSP. A picosecond laser (Huaray, Olive-1064-1BW) was used in passive CUSP for fluorescence excitation.
The DMD spatially encodes the scene by turning each micromirror to either +12° (ON) or −12° (OFF) from the DMD's surface normal. Each micromirror has a flat metallic coating and reflects the incident light to one of the two directions. Therefore, we can collect the encoded scene either in a retro-reflection mode by tilting the DMD by 12°, which is used in active CUSP (Fig. 1a), or by using two separate sets of relay optics, which is used in passive CUSP (see Supplementary Fig. 13a). Here, one DMD code has a lateral dimension of 56.7 μm × 56.7 μm. Thus, the relay optics with ~0.08 NA has enough spatial resolution to image the DMD onto the streak camera.
As schematically detailed in Supplementary Fig. 1, in the streak camera, the image of the fully opened entrance slit is first relayed to the 3-mm-wide photocathode by the input optics. Then the photocathode converts photons into photoelectrons, which are accelerated through an accelerating mesh. The streak camera can operate in two modes: focus mode and streak mode. In the focus mode, no sweep voltage is applied so that an internal CCD camera (Hamamatsu, Orca R2) only captures time-unsheared images, akin to a conventional sensor. In the streak mode, these photoelectrons experience a temporal shearing on the vertical axis driven by an ultrafast linear sweep voltage. The highest sweeping speed is 100 fs per pixel, equivalent to 10 THz [18,20]. The photoelectron current is amplified by a microchannel plate via the generation of secondary electrons. After a phosphor screen converts the electrons back to photons, the internal CCD camera captures a single 2D image of the photons.
NATURE COMMUNICATIONS | (2020) 11:2091 | https://doi.org/10.1038/s41467-020-15745-4 | www.nature.com/naturecommunications
Image acquisition and reconstruction. We denote the optical energy distributions recorded in u-View and s-View as $E_u$ and $E_s$, which are related to the dynamic scene $I(x, y, t, \lambda)$ by
$$E_u = T Q_u F_u\, I(x, y, t, \lambda), \qquad E_s = T S_t Q_s S_\lambda D F_s C\, I(x, y, t, \lambda), \tag{1}$$
where $C$ represents the spatial encoding by the DMD; $F_u$ and $F_s$ describe the spatial low-pass filtering due to the optics in u-View and s-View, respectively; $D$ represents image distortion in s-View with respect to the u-View; $S_\lambda$ denotes the spectral dispersion in the horizontal direction; $Q_u$ and $Q_s$ are the quantum efficiencies of the external CCD and the photocathode of the streak camera, respectively; $S_t$ denotes the temporal shearing in the vertical direction; $T$ represents spatiotemporal-spectrotemporal integration over the exposure time of each CCD; and $\alpha$ is the experimentally calibrated energy ratio between the streak camera and the external CCD. Here, we generalize the intensity distributions of the dynamic scenes observed by both active CUSP and passive CUSP as $I(x, y, t, \lambda)$ for simplicity. The concatenated form of Eq. (1) is $E = O\,I(x, y, t, \lambda)$, where $E = [E_u, \alpha E_s]$ and $O$ stands for the joint operator.
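A toy version of this forward model helps to visualize how the two views are formed. The sketch below is a deliberately idealized illustration, not the calibrated operator $O$: it keeps only the encoding $C$, the dispersion $S_\lambda$ (one pixel per wavelength sample), the shearing $S_t$ (one pixel per frame), and the integration $T$, while treating $F_u$, $F_s$, $D$, $Q_u$, $Q_s$, and $\alpha$ as identity. The function name and data layout are hypothetical.

```python
def forward_views(scene, code):
    """Idealized u-View and s-View of a 4D scene.

    scene[y][x][t][l] is the intensity at pixel (x, y), frame t, wavelength l;
    code[y][x] is the binary DMD pattern (1 = ON, 0 = OFF).
    """
    n_y, n_x = len(scene), len(scene[0])
    n_t, n_l = len(scene[0][0]), len(scene[0][0][0])
    # u-View: plain integration over time and wavelength, no encoding.
    u_view = [[0.0] * n_x for _ in range(n_y)]
    # s-View: encoded, dispersed along x, sheared along y, then integrated.
    s_view = [[0.0] * (n_x + n_l - 1) for _ in range(n_y + n_t - 1)]
    for y in range(n_y):
        for x in range(n_x):
            for t in range(n_t):
                for l in range(n_l):
                    value = scene[y][x][t][l]
                    u_view[y][x] += value
                    s_view[y + t][x + l] += code[y][x] * value
    return u_view, s_view


if __name__ == "__main__":
    # A 2 x 2 pixel scene with 2 frames and 2 wavelengths, all of unit intensity.
    toy_scene = [[[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)] for _ in range(2)]
    toy_code = [[1, 0], [0, 1]]
    u, s = forward_views(toy_scene, toy_code)
    print(u)
    print(s)
```

Each encoded pixel is smeared diagonally across the sensor, which is why the s-View image is larger than the scene and why overlapping contributions have to be untangled by the reconstruction.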
By assuming the spatiotemporal-spectrotemporal sparsity of the scene and calibrating for $O$, $I(x, y, t, \lambda)$ can be retrieved by solving the inverse problem defined as [18,58]
$$\hat{I} = \arg\min_{I} \left\{ \tfrac{1}{2} \lVert E - OI \rVert_2^2 + \xi\,\phi(I) \right\}. \tag{2}$$
In Equation (2), argmin represents the argument that minimizes the function in the following bracket. The first term denotes the discrepancy between the solution $I$ and the measurement $E$ via the operator $O$, and $\lVert \cdot \rVert_2$ represents the $L_2$ norm. The second term enforces sparsity in the domain defined by the regularizer $\phi(I)$, while the regularization parameter $\xi$ balances these two terms. We opt to use total variation (TV) in the four-dimensional $x$-$y$-$t$-$\lambda$ space as our regularizer. For an accurate and stable reconstruction, a software program adapted from the two-step iterative shrinkage/thresholding (TwIST) algorithm [58] was utilized. More details on data acquisition and reconstruction can be found in Supplementary Notes 4 and 5. In addition, the assumption of data sparsity is justified in Supplementary Table 1. CUSP's reconstruction of a data matrix of dimensions $N_x \times N_y \times N_{ta}$ (active mode) or $N_x \times N_y \times N_{tp} \times N_\lambda$ (passive mode) requires a 2D image of $N_x \times N_y$ in u-View and a 2D image of $N_{col} \times N_{row}$ in s-View. In active mode,
$$N_{col} = N_x + (N_{ta}/P) - 1, \tag{3.1}$$
$$N_{row} = N_y + (v t_{sp}/d) P - 1; \tag{3.2}$$
in passive mode,
$$N_{col} = N_x + N_\lambda - 1, \tag{4.1}$$
$$N_{row} = N_y + N_{tp} - 1. \tag{4.2}$$
In Equations (3) and (4), $P$ is the number of sub-pulses, $v$ is the streak camera's shearing speed, $t_{sp}$ is the temporal separation between adjacent sub-pulses, and $d$ is the streak camera's pixel size. The finite pixel counts of the streak camera ($N^{sc}_x \times N^{sc}_y = 672 \times 512$ after 2 × 2 binning) physically restrict $N_{col} \le 672$ and $N_{row} \le 512$. In active CUSP imaging shown in Fig. 2, a raw streak camera image of 609 × 449 pixels was required to reconstruct a data matrix of $N_x \times N_y \times N_{ta} = 470 \times 350 \times 700$. Similarly, in Fig. 3, reconstruction of a data matrix of $N_x \times N_y \times N_{ta} = 310 \times 90 \times 980$ requires an image of 449 × 229 pixels from the streak camera. Note that here $N_{ta}/P$ equals the number of wavelength samples within one sub-pulse, which is 140 in our active CUSP system. In the passive-mode imaging shown in Fig. 4, an SR-FLIM data matrix of $N_x \times N_y \times N_{tp} \times N_\lambda = 110 \times 110 \times 400 \times 100$ is associated with a streak camera image of 209 × 509 pixels.
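The raw image sizes quoted for the three demonstrations can be reproduced from the relations for $N_{col}$ and $N_{row}$. The sketch below assumes that the streak camera shears by 20 pixels per sub-pulse, i.e., $v t_{sp}/d$ = 20 for a 2-ps separation at 100 fs per pixel; the function names are hypothetical.

```python
SHEAR_PIXELS_PER_SUB_PULSE = 20   # assumed v * t_sp / d (2 ps at 100 fs per pixel)


def active_raw_size(n_x, n_y, n_ta, p, shear_px=SHEAR_PIXELS_PER_SUB_PULSE):
    """(N_col, N_row) of the s-View image needed in active mode."""
    n_col = n_x + n_ta // p - 1
    n_row = n_y + shear_px * p - 1
    return n_col, n_row


def passive_raw_size(n_x, n_y, n_tp, n_lambda):
    """(N_col, N_row) of the s-View image needed in passive mode."""
    return n_x + n_lambda - 1, n_y + n_tp - 1


if __name__ == "__main__":
    print(active_raw_size(470, 350, 700, 5))      # demonstration in Fig. 2
    print(active_raw_size(310, 90, 980, 7))       # demonstration in Fig. 3
    print(passive_raw_size(110, 110, 400, 100))   # demonstration in Fig. 4
```

All three results stay within the 672 × 512 pixel budget of the binned streak camera sensor.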
Limits in sequence depth and FOV. Based on Equations (3) and (4) above, when assuming fixed FOVs, there are upper limits in the sequence depth achievable by CUSP. It is straightforward to derive this limit for active mode:
$$N_{ta}^{\max} = 140 \left\lfloor \frac{\left(N^{sc}_y + 1 - N_y\right) d}{v\, t_{sp}} \right\rfloor. \tag{5}$$
In Eq. (5), 140 represents the number of wavelength samples in one sub-pulse. $N_{ta}^{\max}$ is fundamentally determined by how many sub-pulses ($P$) can be accommodated in temporal shearing. For passive mode, the maximum sequence depth is simply
$$N_{tp}^{\max} = N^{sc}_y + 1 - N_y. \tag{6}$$
Taking practical numbers for example, when using the 70 Tfps active CUSP to image a scene of $N_x \times N_y = 350 \times 350$ pixels, we can obtain a maximum sequence depth $N_{ta}^{\max} = 1120$ frames. This number is more than 3 times the maximum sequence depth obtained in T-CUP [18] and more than 18 times that of the best single-shot femtosecond imaging modality other than CUP [17]. The recording time in this case is 16 ps. For the 0.5 Tfps passive CUSP, if the scene is $N_x \times N_y = 100 \times 100$, then $N_{tp}^{\max} = 413$ frames, which corresponds to a recording time of 826 ps.
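These limits can be verified numerically from the constraint $N_{row} \le N^{sc}_y$. The sketch below again assumes 20 sheared pixels per sub-pulse ($v t_{sp}/d$ = 20), 140 frames per sub-pulse, and a 512-row sensor; the function names are hypothetical.

```python
N_SC_Y = 512                  # sensor rows after 2 x 2 binning
FRAMES_PER_SUB_PULSE = 140    # wavelength samples in one sub-pulse
SHEAR_PX = 20                 # assumed v * t_sp / d


def max_active_depth(n_y, n_sc_y=N_SC_Y):
    """Largest N_ta such that N_y + SHEAR_PX * P - 1 <= n_sc_y."""
    max_sub_pulses = (n_sc_y + 1 - n_y) // SHEAR_PX
    return FRAMES_PER_SUB_PULSE * max_sub_pulses


def max_passive_depth(n_y, n_sc_y=N_SC_Y):
    """Largest N_tp such that N_y + N_tp - 1 <= n_sc_y."""
    return n_sc_y + 1 - n_y


def recording_time_ps(frames, rate_tfps):
    """Recording time in ps; frames divided by Tfps gives ps directly."""
    return frames / rate_tfps


if __name__ == "__main__":
    active = max_active_depth(350)
    passive = max_passive_depth(100)
    print(active, recording_time_ps(active, 70.0))
    print(passive, recording_time_ps(passive, 0.5))
```

For $N_y$ = 350 at most eight sub-pulses fit on the sensor, which gives the 1120 frames and 16 ps quoted above; the passive case gives 413 frames and 826 ps.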
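Similarly, for scenarios with both sequence depth and number of wavelength samples fixed, the streak camera limits the spatial pixel counts in the reconstructed data in active mode to be $N_x \le N^{sc}_x - (N_{ta}/P) + 1$ and $N_y \le N^{sc}_y - (v t_{sp}/d) P + 1$. In passive mode, these limits are $N_x \le N^{sc}_x - N_\lambda + 1$ and $N_y \le N^{sc}_y - N_{tp} + 1$. Considering the camera pixel size $d$ and the magnification of the imaging optics $M$, we can obtain the maximum FOV for active mode:
$$\mathrm{FOV}_a^{\max} = \frac{d}{M}\left[N^{sc}_x - \frac{N_{ta}}{P} + 1\right] \times \frac{d}{M}\left[N^{sc}_y - \frac{v t_{sp}}{d} P + 1\right], \tag{7}$$
and for passive mode:
$$\mathrm{FOV}_p^{\max} = \frac{d}{M}\left[N^{sc}_x - N_\lambda + 1\right] \times \frac{d}{M}\left[N^{sc}_y - N_{tp} + 1\right]. \tag{8}$$
Inserting practical numbers into Equations (7) and (8) and using the configurations of our three demonstrations (Figs. 2-4), we have the maximum FOVs of 13.75 mm × 10.66 mm, 4.30 mm × 3.00 mm, and 0.92 mm × 0.18 mm, respectively.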
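The pixel-count limits behind these FOVs can be checked with the sketch below. The object-plane pixel pitches ($d/M$) are not stated in the text; the values used here (25.8, 8.06, and 1.6 μm) are rough estimates inferred from the reported FOVs and pixel counts, so the resulting FOVs only approximate the quoted numbers. The shear of 20 pixels per sub-pulse is likewise an assumption, and all names are hypothetical.

```python
N_SC_X, N_SC_Y = 672, 512   # sensor size after 2 x 2 binning
SHEAR_PX = 20               # assumed v * t_sp / d


def max_pixels_active(n_ta, p):
    """Largest (N_x, N_y) for a given active-mode sequence depth."""
    return N_SC_X - n_ta // p + 1, N_SC_Y - SHEAR_PX * p + 1


def max_pixels_passive(n_tp, n_lambda):
    """Largest (N_x, N_y) for given passive-mode depth and spectral samples."""
    return N_SC_X - n_lambda + 1, N_SC_Y - n_tp + 1


def fov_mm(pixels, pitch_um):
    """Convert pixel counts to a field of view using an object-plane pixel pitch."""
    return tuple(n * pitch_um / 1000.0 for n in pixels)


if __name__ == "__main__":
    # Pitches are inferred estimates, not values given in the paper.
    cases = [
        ("Fig. 2", max_pixels_active(700, 5), 25.8),
        ("Fig. 3", max_pixels_active(980, 7), 8.06),
        ("Fig. 4", max_pixels_passive(400, 100), 1.6),
    ]
    for label, pixels, pitch in cases:
        width, height = fov_mm(pixels, pitch)
        print(f"{label}: {pixels} pixels -> {width:.2f} mm x {height:.2f} mm")
```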

Data availability
The data that support the findings of this study are available from the corresponding author on reasonable request.