## Introduction

It has become increasingly clear that understanding morphogenesis and disease requires three-dimensional (3D) tissue cultures and models1. Effective 3D imaging techniques, capable of reporting on subcellular as well as multicellular scales, in a time-resolved manner, are crucial for achieving this goal2. Although the light microscope has been the main tool of investigation in biomedicine for four centuries, the current requirements for 3D imaging pose new, difficult challenges. Owing to their insignificant absorption in the visible spectrum, most living cells exhibit very low contrast under light microscopy. As a result, fluorescence microscopy has become the main tool of investigation in cell biology3. Due to the significant progress in designing fluorescence tags, structures in the cell can now be imaged with high specificity.

More recently, super-resolution microscopy methods based on fluorescence have opened new directions of investigation, toward nanoscale subcellular structure4. However, fluorescence imaging is subject to several limitations. Absorption of the excitation light may cause the fluorophore to irreversibly alter its molecular structure and stop fluorescing. This process, known as photobleaching, limits the time interval over which continuous imaging can be performed5. The excitation light is typically toxic to cells, a phenomenon referred to as phototoxicity6. Overcoming these limitations becomes extremely challenging7,8 when imaging thick objects over an extended period of time9,10, as acquiring data over time and the axial dimension increases the exposure of the specimen to the excitation light, lowering its viability. Confocal11 and two-photon fluorescence microscopy12 have been the mainstay tools for imaging thick 3D specimens. Although these methods can provide excellent sectioning through tissue, owing to the focused, short-wavelength excitation, the amount of power required may be harmful. Thus, recent advances in light sheet microscopy were dedicated specifically to reducing phototoxicity and photobleaching13,14,15,16.

Label-free microscopy provides an alternative solution to overcoming these limitations, albeit at the expense of molecular specificity. Two classical methods are phase contrast (PC) microscopy17 and differential interference contrast (DIC) microscopy18. The contrast in these methods is generated by visualizing the modifications of the wavefront when light propagates through the sample. Unfortunately, both PC and DIC are qualitative: they do not measure the wavefront deformation quantitatively, and the image recorded on the detector is often substantially different from the scattering potential of the object. This deformation is characterized by a spatially dependent phase shift, defined as $$\phi({\bf r}) = (2\pi/\lambda_{\rm o})\,h({\bf r})\,\Delta n({\bf r})$$, where $$\lambda_{\rm o}$$ is the central wavelength of the illumination, and h(r) and Δn(r) are, respectively, the sample thickness and refractive index difference, both evaluated at the transverse coordinate r.

Quantitative phase imaging (QPI) is an approach focused precisely on quantifying this phase shift19. Along these lines, Cogswell et al.20 proposed DIC with geometrically-induced phase shifting, applied for two-dimensional (2D) imaging. DIC with two orthogonal shear directions has been used to obtain 2D quantitative phase images21, 22, whereas Mehta et al.23 reported a partially coherent model for DIC 2D imaging. Shribak et al.24 used liquid crystal modulators to change the polarization directions and phase shift modulation. This setup allows them to record the 2D phase-gradient information at two orthogonal directions and reconstruct the optical phase front.

QPI has recently gained significant scientific interest, especially in the biomedical field19, thanks to several advancements. For example, common-path interferometry replaced traditional interferometry for better stability and sensitivity25,26,27,28. Low temporal coherence illumination methods significantly improve image resolution when suppressing speckles29,30,31,32. An interesting direction of study is using QPI to extract scattering information from extremely weakly scattering objects33. This approach is referred to as Fourier transform light scattering (FTLS), a spatial analog to Fourier transform (infrared) spectroscopy34. The idea is that knowledge of the amplitude and phase of an image field allows us to numerically propagate that field to any plane, including the far field, where angular scattering measurements are typically performed. For weakly scattering objects such as live cells, it is much more signal-effective to perform the measurement at the image plane, where all scattering angles overlap at each point, rather than measuring angle-by-angle in the far field. As a result, QPI can be used to solve inverse scattering problems and extract the 3D structure of inhomogeneous objects35. Three-dimensional information of the specimen is accessible by measuring the phase across multiple angles of the illumination or axial specimen positions36,37,38,39. An equivalent approach is to fix the illumination direction while rotating the sample to obtain phase maps from different viewing angles; see, for example, Merola et al.40 Lens-free holography has also been used to obtain tomographic information on a chip by combining it with multiple illumination angles41.

However, imaging optically thick, multiple scattering specimens is still challenging for any optical method, including QPI. The fundamental obstacle is that multiple scattering generates an incoherent background, which ultimately degrades the image contrast. An imaging method dedicated to these thick specimens must include a mechanism to subdue the multiple scattering background and exhibit strong sectioning to suppress the out-of-focus light. To overcome these challenges, here we introduce a new QPI method, referred to as gradient light interference microscopy (GLIM). GLIM combines DIC microscopy with low-coherence interferometry and holography. In GLIM, the two interfering fields are identical except for a small transverse spatial shift. This geometry ensures that the two fields suffer equal degradation due to multiple scattering. By accurately controlling the phase shift between the two waves, we acquire multiple intensity images, which have the same incoherent background, but different coherent contributions. As a result, GLIM rejects much of the multiple scattering contribution and yields high contrast for thick objects. Furthermore, the illumination condenser aperture is fully open, which lends GLIM very strong optical sectioning. GLIM can provide tomographic imaging of both thin samples, for example, single cells, and thick specimens, such as multicellular systems. Below, we present the principle of GLIM operation, validation results on test samples, and time-resolved tomography of cells in culture, as well as embryo development.

## Results

### GLIM principle

GLIM is an add-on module to a commercial DIC microscope as shown in Fig. 1a. Via a Wollaston prism, a typical DIC microscope generates two replicas of the image field, cross-polarized, shifted transversely by a distance smaller than the diffraction spot. We removed the analyzer that normally renders the two polarizations parallel in DIC and, instead, let the fields enter the GLIM module. These fields are spatially Fourier transformed by the lens L1 at its back focal plane. A spatial light modulator (SLM), placed at this plane with its active axis aligned to the polarization direction of one field, retards its phase by $$\phi_n = n\pi/2$$ with n = 0, 1, 2, 3, and leaves the other field unmodified. Both fields are Fourier-transformed again by lens L2 to generate the image at the camera plane. A linear polarizer, P1, is aligned at 45° with respect to both polarizations to render them parallel. The resulting field at the detector is a coherent superposition of these two fields, namely,

$${U_n}\left( {\bf r} \right) = U\left( {\bf r} \right) + U\left( {{\bf r} + \delta {\bf r}} \right) {{\rm e}^{i{\phi _n}}},$$
(1)

where $$\delta {\bf{r}} = \delta x{{\hat {\bf x}}}$$ is the spatial offset between the two fields and U is the image field. The intensity at each phase shift, $$I_n({\bf r}) = |U_n({\bf r})|^2$$, can be written as

$${I_n}\left( {\bf{r}} \right) = I\left( {\bf{r}} \right) + I\left( {{\bf{r}} + \delta {\bf{r}}} \right) + 2\left| {\gamma \left( {{\bf{r}},\delta {\bf{r}}} \right)} \right|\cos \left[ {\phi \left( {{\bf{r}} + \delta {\bf{r}}} \right) - \phi \left( {\bf{r}} \right) + {\phi _n}} \right],$$
(2)

where I(r) and ϕ(r) are, respectively, the intensity and phase of the image field, and γ is the mutual intensity, that is, the temporal cross-correlation function at zero delay between these two fields, $$\gamma({\bf r},\delta{\bf r}) = \langle U^*({\bf r})\,U({\bf r}+\delta{\bf r})\rangle_t$$. The quantity $$\phi_n = n\pi/2$$ is the modulated phase offset between the two fields, externally controlled by the SLM. From the four intensity images, $$I_n$$, with n = 0, 1, 2, 3 (Fig. 1b), we are able to solve for I(r), $$|\gamma({\bf r},\delta{\bf r})|$$, and $$\Delta\phi({\bf r}) = \phi({\bf r}+\delta{\bf r}) - \phi({\bf r})$$. These data render quantitatively the gradient of the phase along the direction of the shift (Fig. 1c), $$\nabla_x\phi({\bf r}) \approx \Delta\phi({\bf r})/\delta x$$. Details on the optical setup, the procedure for extracting the phase gradient, and the estimation of δx can be found in the “Methods” section. Before running the experiments, the SLM needs a one-time calibration to ensure proper phase modulation. The calibration procedure and pixel-to-pixel variation of the SLM are described in detail in Supplementary Notes 1, 2, and Supplementary Fig. 1.

### QPI using GLIM

To demonstrate the capability of GLIM to extract the phase gradient quantitatively, we imaged 4.5 µm ± 5% polystyrene micro-beads (Polysciences Inc.), with a refractive index value of 1.59 at the central wavelength. The beads are immersed in immersion oil (Zeiss Inc.) with a refractive index value of 1.518 to generate a total phase shift of 3.87 radians. Figure 1c shows the measured phase gradient at NAcon = 0.09, where the subscript con stands for condenser. Given the phase gradient, $$\nabla_x\phi$$, one can integrate along the gradient direction to get the phase value, ϕ(r), using

$$\phi \left( {x,y} \right) = \mathop {\int}\limits_0^x {\left[ {{\nabla _x}\phi \left( {x',y} \right)} \right]} \,{\rm{d}}x' + \phi \left( {0,y} \right),$$
(3)

where ϕ(0,y) is the initial value, which can be obtained with some prior knowledge of the specimen. For example, if (0,y) is a background location, the phase ϕ(0,y) should be set to 0 radians. Figure 1d shows the quantitative phase map, ϕ(r), and Fig. 1e displays a line profile through the center of the bead. Note that our integration result matches well with the expected ground truth, where optical diffraction is taken into account.
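The line integration in Eq. (3) can be sketched numerically as a cumulative sum along the shear direction. The following is a minimal illustration, not the authors' implementation: the function name, the left Riemann sum, and the assumption that x runs along the second array axis are all choices made here for clarity.

```python
import numpy as np

def integrate_phase_gradient(grad_x, pixel_size, phi_at_x0=0.0):
    """Discrete form of Eq. (3).

    grad_x     : (H, W) array, phase gradient along x (rad per unit length),
                 with x assumed to run along axis 1.
    pixel_size : sample spacing along x, in the same length unit.
    phi_at_x0  : scalar or length-H array giving phi(0, y); 0 for background.
    """
    grad_x = np.asarray(grad_x, dtype=float)
    phi = np.zeros_like(grad_x)
    # Left Riemann sum: phi[:, j] = sum over k < j of grad_x[:, k] * pixel_size
    phi[:, 1:] = np.cumsum(grad_x[:, :-1] * pixel_size, axis=1)
    offset = np.asarray(phi_at_x0, dtype=float).reshape(-1, 1)
    return phi + offset

# Synthetic check: a constant gradient integrates to a linear ramp.
grad = np.full((2, 5), 0.5)            # 0.5 rad per unit length everywhere
phi = integrate_phase_gradient(grad, pixel_size=0.2)
```

Because noise in each row accumulates along the integration path, a practical pipeline would likely need the background reference ϕ(0,y) to be chosen carefully for every row.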

### GLIM imaging of cell cultures

Due to the low phototoxicity, absence of photobleaching, and easy sample preparation, transmitted light modalities appear to be ideal for studying cell growth and proliferation42. Yet, such assays are most frequently conducted with the aid of labels. Although specificity granted by external markers is crucial for certain applications, quantifying cell growth over longer timescales remains a challenge43. It has been known for some time that indicators of cell proliferation do not have equal growth44. More recently, new approaches have been demonstrated using vibrating hollow cantilevers to weigh cells passing through them45. This method is limited to non-adherent cells. To expand the principle of this measurement to adherent cells, a method based on vibrating pedestals was demonstrated, at the expense of mass sensitivity46.

We show that GLIM is able to quantify the growth and proliferation of large populations of adherent cells over extended periods of time. Specifically, we can characterize the culture by extracting parameters such as single-cell mass, volume, and surface area, while simultaneously measuring the intracellular transport on timescales associated with the cell cycle. Supplementary Note 3 provides details on 2D image formation in GLIM. Figure 2a shows scanning GLIM data of HeLa cells in culture over a 4.48 × 5.54 mm² field of view. This image is a mosaic consisting of 16 × 20 individual frames, imaged by a 40×/0.75 NA objective and a condenser aperture adjusted to NAcon = 0.32. Figure 2b and c shows magnified views at different scales for a region denoted by the white box in Fig. 2a. The acquisition took ~3 min for each 16 × 20 mosaic, and the mosaics were assembled into a time-lapse sequence following the procedure outlined in the “Methods” section, Supplementary Note 4, and Supplementary Figs. 3 and 4. The GLIM image (Fig. 2c) provides a quantitative phase-gradient map at the spatial resolution of the objective, clearly showing fine structures such as nucleoli, indicated by white arrows. We acquired 38 such large fields of view over a 10 h time interval (see Supplementary Movie 1 for a time-lapse sequence).

To measure growth rates, the phase values are obtained by integrating the phase gradient at each frame in the time-lapse sequence using Eq. (3). These phase values are used to calculate the cell dry mass, using the linear relationship between the optical path-length map of a cell and its dry mass density, ρ (see Barer47). The black dotted profile in Fig. 2d shows the dry mass normalized to its value at time t = 0. Fitting this profile to the function $$2^{t/{\rm DT}}$$, where DT is the mass doubling time constant, we found that DT ≈ 36 h. Interestingly, this time is approximately 50% longer than the typical cell count-doubling time, as cells can divide without doubling in mass48. The fitted profile is shown in red in Fig. 2d.
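As an illustration of the doubling-time fit, the model $$2^{t/{\rm DT}}$$ becomes a straight line through the origin after taking the base-2 logarithm, so DT can be estimated by linear least squares. This is a sketch under that assumption; the fitting procedure actually used for Fig. 2d is not specified in the text, and the function name and synthetic data below are hypothetical.

```python
import numpy as np

def fit_doubling_time(t_hours, normalized_mass):
    """Estimate DT from m(t)/m(0) = 2**(t/DT).

    Taking log2 gives log2(m/m0) = t/DT, a line through the origin whose
    least-squares slope is sum(t*y)/sum(t*t).
    """
    t = np.asarray(t_hours, dtype=float)
    y = np.log2(np.asarray(normalized_mass, dtype=float))
    slope = np.dot(t, y) / np.dot(t, t)
    return 1.0 / slope

# Synthetic, noise-free example: 38 time points over 10 h, DT = 36 h.
t = np.linspace(0.0, 10.0, 38)
mass = 2.0 ** (t / 36.0)
dt_est = fit_doubling_time(t, mass)
```

With noisy data, fitting in log space weights relative rather than absolute errors, which may or may not match the weighting used for the published fit.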

To investigate the physics of cellular mass transport, we use the dispersion phase spectroscopy (DPS) method49. This approach is powerful in extracting spatiotemporal fluctuation information from time-lapse sequences of phase maps, as it requires no manual tracing, making it well suited for fully automated applications50. In DPS, one computes the dispersion relation of the dynamic system, connecting the spatial and temporal frequencies. The behavior of this dispersion curve informs on the nature of the transport (for example, diffusion-dominated vs. deterministic transport), and numerical fits yield the diffusion coefficient and the width of the velocity distribution. Let us start by describing the dry mass-density fluctuation, $$\rho({\bf r}_\bot, t)$$, via an advection-diffusion equation, namely,

$$D{\nabla ^2}\rho \left( {{{\bf{r}}_ \bot },t} \right) - {\bf{v}} \cdot \nabla \rho \left( {{{\bf{r}}_ \bot },t} \right) - \partial \rho \left( {{{\bf{r}}_ \bot },t} \right)/\partial t = 0,$$
(4)

where $${\bf r}_\bot = (x,y)$$ is the 2D coordinate vector, v is the advection velocity vector, and D is the average diffusion coefficient. Using this equation, we obtain the temporal autocorrelation function evaluated at each spatial frequency, $${\bf k}_\bot$$, and temporal delay, τ, defined as $$g\left( {{{\bf{k}}_ \bot },\tau } \right) = {\left\langle {\tilde \rho \left( {{{\bf{k}}_ \bot },t} \right){{\tilde \rho }^*}\left( {{{\bf{k}}_ \bot },t + \tau } \right)} \right\rangle _t}/{\left\langle {\tilde \rho \left( {{{\bf{k}}_ \bot },t} \right){{\tilde \rho }^*}\left( {{{\bf{k}}_ \bot },t} \right)} \right\rangle _t}$$. Here, $$\tilde \rho \left( {{{\bf{k}}_ \bot },t} \right) = {{\rm F}_{{{\bf{k}}_ \bot }}}\left[ {\rho \left( {{{\bf{r}}_ \bot },t} \right)} \right]$$ is the 2D spatial Fourier transform of the dry mass density. We calculated $$g({\bf k}_\bot,\tau)$$ directly from the phase gradient $$\nabla_x\phi$$ instead of the integrated phase ϕ (Supplementary Note 5), namely

$$g\left( {{{\bf{k}}_ \bot },\tau } \right) = \exp \left( {i{{\bf{v}}_{\rm{o}}} \cdot {{\bf{k}}_ \bot }\tau } \right)\exp \left[ { - \left( {\Delta v{k_ \bot } + Dk_ \bot ^2} \right)\tau } \right],$$
(5)

where $${\bf v}_{\rm o}$$ is the mean and Δv the standard deviation of the velocity distribution. At each transverse spatial frequency, $${\bf k}_\bot$$, one can fit the measurement of $$g({\bf k}_\bot,\tau)$$ using Eq. (5) to estimate Δv and the diffusion coefficient, D. We found that $${\bf v}_{\rm o}$$ is negligible for the duration of the experiment, that is, no dominant velocity vector was detected in our data. The decay rate of $$g({\bf k}_\bot,\tau)$$ at each spatial mode $${\bf k}_\bot$$ satisfies

$$\Gamma \left( {{{\bf{k}}_ \bot }} \right) = \Delta v{k_ \bot } + Dk_ \bot ^2.$$
(6)

The first term in Eq. (6) describes the active transport, whereas the remaining term characterizes diffusion. The black dotted profile in Fig. 2e is our measured $$\Gamma(k_\bot)$$. Fitting this profile to Eq. (6), we found that active transport dominated on cellular scales (10–50 μm), with a spread in velocities of Δv ≈ 58 nm min⁻¹. Fitted profiles are shown in red.
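Because Eq. (6) is linear in the two unknowns Δv and D, the fit can be sketched as an ordinary linear least-squares problem with the basis functions $$k_\bot$$ and $$k_\bot^2$$. The snippet below is an illustrative sketch, not the authors' fitting code; the function name, the range of spatial frequencies, and the value of D used to generate the synthetic data are assumptions.

```python
import numpy as np

def fit_dispersion(k, decay_rate):
    """Fit Gamma(k) = dv * k + D * k**2 (Eq. 6) by linear least squares.

    k          : 1D array of spatial frequencies (rad/um).
    decay_rate : 1D array of measured decay rates Gamma(k) (1/min).
    Returns (dv, D) in um/min and um^2/min.
    """
    k = np.asarray(k, dtype=float)
    design = np.column_stack([k, k ** 2])   # columns: advection, diffusion
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(decay_rate, dtype=float),
                                 rcond=None)
    return coeffs[0], coeffs[1]

# Synthetic, noise-free example (values chosen only for illustration).
k = np.linspace(0.1, 1.0, 20)
true_dv, true_D = 0.058, 0.01               # um/min, um^2/min (D is hypothetical)
gamma_k = true_dv * k + true_D * k ** 2
dv_est, D_est = fit_dispersion(k, gamma_k)
```

On real data, the relative size of the two fitted terms over the k range of interest indicates whether advection or diffusion dominates at that spatial scale.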

### Tomography of single cells using GLIM

As a result of the high numerical aperture of the illumination, GLIM has strong sectioning capabilities, which yield tomographic imaging of both thin and thick samples. We apply GLIM tomography to a 30% confluence HeLa cell culture over 7.7 h. Seven fields of view (FOVs) were imaged using a 63×/1.4 NA objective with a spatial sampling rate of 10.8 pixels µm⁻¹. Each FOV was scanned every 22 min. For each time point, the sample is scanned over a total depth of 28 µm with a step size of Δz = 0.07 µm. Figures 3a and b show the xy and xz cross-sections of the GLIM measurement, namely the quantitative phase gradient $$\nabla_x\phi$$. To remove the background due to weak sectioning at small scattering angles, we perform a spatial high-pass filtering operation, as described in the “Methods” section. Figures 3c and d show the corresponding xy and xz cross-sections after filtering, with the yellow arrows pointing to the locations of the nucleus. Clearly, the xz cross-section of the tomograms shows significant improvements in depth sectioning. Compared to the phase-gradient image, $$\nabla_x\phi$$, this cross-section has no diffraction streaks or shadow artifacts, while preserving clear cell boundaries. Figures 3e–k show the GLIM tomograms obtained via filtering, at seven different time points. The cell nuclei were segmented and are shown in orange, whereas the cell membranes are displayed in green using isosurface rendering. The rendered images clearly show how the 3D shape of the cell changes over time. It can be further seen that during mitosis (the 110 and 264 min frames), the cells assumed a spherical shape (indicated by yellow arrows in Fig. 3g). Also, at the 110-min point, while forming a mitotic sphere, the cells appear to leave behind biomass (white arrow in Fig. 3g) that is adherent to the substrate, consistent with previous observations51. Rendered images for the whole time series can be found in Supplementary Movie 2.

Thanks to the depth sectioning of GLIM, we can automatically segment different cells in the volume of interest (Supplementary Note 6; Supplementary Fig. 5) to compute several parameters of each cell and study their temporal evolution. An example of GLIM tomograms over a full cell cycle can be found in Supplementary Fig. 6. Figure 3l shows the dry mass (Supplementary Note 7; Supplementary Fig. 7) vs. volume for several different cells during a 21-h window. Each point in these plots corresponds to one cell at one time point. These results (also those from Supplementary Note 8) show that, for the most part, the points align along a straight line; we found that the points deviating from this line correspond to cells going through mitosis. Meanwhile, the surface area vs. volume relation shown in Fig. 3m is essentially linear, with slightly different slopes for different cells over the whole cell cycle.

### GLIM investigation of embryo viability

The Centers for Disease Control and Prevention (CDC) report from 2014 shows that 208,768 Assisted Reproduction Technology (ART) cycles were performed with 57,332 live births52. As the numbers indicate, the percentage of live births from these procedures is still rather low. One reason is the lack of objective and accurate evaluation of embryo quality and viability before transfer. Morphological assessment is currently the main method used to determine embryo viability during in vitro fertilization (IVF) cycles. However, studies have shown that the predictive power of the typical day 2 and 3 assessment of morphological parameters has remained low53,54,55. Various noninvasive analytical tools have recently been used to predict embryonic potential56,57,58,59,60. One such approach is the quantitative, noninvasive assessment of embryo metabolism, whose value as a predictor of embryo viability is the subject of ongoing investigations61. Currently, however, visual observation remains the most used and reliable method. With the improvement of microscopy, it is possible to follow embryo development in real time, and it has been established that morphokinetic parameters can be used to select embryos for higher viability62. One of the most important microscopy techniques is transmission electron microscopy (TEM), which is considered by many the main tool for intracellular evaluation. The main problem with using TEM for embryo evaluation is that the sample preparation kills the embryo63. Therefore, although this type of microscopy can be considered an important tool for research, it has little value for routine IVF procedures. Other techniques used to evaluate embryo quality include confocal microscopy64 and two-photon imaging65. With these fluorescence methods, the sample must be tagged to be evaluated, which can be detrimental to embryo survival.

Due to its sectioning capabilities, GLIM can be used to perform tomography on optically thick specimens such as embryos. For this demonstration, we used bovine embryos, prepared as described in the Supplementary Methods. In a single experiment, we imaged 60 bovine embryos, starting at 12 h after fertilization, sampling every 30 min, over a 7-day period, using a 40×/0.75 NA objective. The embryo thicknesses are within 250–300 µm. Supplementary Movies 3 and 4 illustrate the high contrast that GLIM yields even in these challenging, multiple scattering samples. For example, the lipid droplets, prominent in bovine embryos, can be clearly identified. Their contrast switches from dark to bright as they pass through the focus. Our results show that the embryo internal dynamics changes completely when the embryo dies. Specifically, the internal mass transport halts almost entirely, which suggests either a large increase in viscosity of the material or that the dynamic transport is mostly due to molecular motors, which stop in dead cells. GLIM can be a valuable tool for IVF because it provides an intrinsic marker to predict viability in advance. Toward this goal, we developed a dynamic index marker (DIM), based on the GLIM data. This metric is computed from the phase difference $$\Delta\phi({\bf r}_\bot, t)$$ and the mutual intensity $$\gamma({\bf r}_\bot, \delta{\bf r}, t)$$ at each time point t. To measure morphological changes, we compute the time derivative of γ, that is, $$\gamma'_t({\bf r}_\bot) = {\rm d}\gamma({\bf r}_\bot, \delta{\bf r}, t)/{\rm d}t$$. On the basis of $$\gamma'_t$$, we calculate the spatial cumulative distribution function (CDF) of the time derivative images, $${F_t}\left( x \right) = P\left\{ {{{\left| {{{\gamma '}_t}\left( {{{\bf{r}}_ \bot }} \right)} \right|}_{{r_ \bot } \in {\rm{FOV}}}} < \,x} \right\},$$ which is the probability that the amplitude of the time derivative is less than a value x. The cut-off difference (distance in x) between 10% and 90% of the CDF for each time point t is defined as $${D_t} = \arg {\min _{{x_1}}}\left[ {F_t\left( {{x_1}} \right) >0.9} \right] - \arg {\min _{{x_2}}}\left[ {F_t\left( {{x_2}} \right) >0.1} \right].$$ Finally, we define the DIM as the ratio between $$D_t$$ and its maximum over the imaging duration, $$\max_t(D_t)$$, namely

$${\rm{DIM}}\left( t \right) = \left[ {{D_t}/{{\max }_{{t_1}}}\,\left( {{D_{{t_1}}}} \right)} \right].$$
(7)

Intuitively, during periods of inactivity, the spatial distribution of $$\gamma'_t$$ across the embryo is uniform and the histogram is narrow compared to periods of higher activity. We found that between the point of apparent normal, dynamic behavior and the one with a complete lack of dynamic behavior, there exists a continuous process that lasts many hours (Fig. 4a). This process is well captured by the intrinsic DIM quantity, as shown in Fig. 4b. Here, the black dotted profiles are the raw measurement of DIM as a function of time t for different embryos. We found that DIM decreases continuously over several hours until it reaches the point where the embryo motion is suppressed. Further, the DIM(t) profiles are well described by exponential decay functions (red profiles in Fig. 4b) with time constants of 6.2 h and 12.4 h, respectively. Therefore, we anticipate that this intrinsic dynamic marker can potentially hold valuable viability prediction capability, beyond the current, morphology-based assays.
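The DIM definition in Eq. (7) can be sketched in a few lines, assuming the mutual-intensity maps are available as a (time, y, x) array. This is an illustration only: the function name is hypothetical, the time derivative is approximated with finite differences, and the 10% and 90% CDF cut-offs are approximated with empirical percentiles over the field of view.

```python
import numpy as np

def dynamic_index_marker(gamma_stack, dt):
    """Approximate DIM(t) from a (T, H, W) stack of |gamma| maps.

    dt is the sampling interval; T must be at least 2.
    """
    # Finite-difference estimate of |d(gamma)/dt| at every pixel and time.
    rate = np.abs(np.gradient(gamma_stack, dt, axis=0))
    flat = rate.reshape(rate.shape[0], -1)
    # Empirical 10th and 90th percentiles stand in for the CDF cut-offs.
    p10, p90 = np.percentile(flat, [10, 90], axis=1)
    d_t = p90 - p10
    return d_t / d_t.max()

# Synthetic example: random fluctuations whose amplitude decays with a
# 6.2 h time constant, sampled every 0.5 h (values for illustration only).
rng = np.random.default_rng(1)
t_hours = np.arange(48) * 0.5
amplitude = np.exp(-t_hours / 6.2)
steps = rng.normal(size=(48, 32, 32)) * amplitude[:, None, None]
gamma_stack = np.cumsum(steps, axis=0)
dim = dynamic_index_marker(gamma_stack, dt=0.5)
```

In this synthetic case the returned profile decays toward zero as the fluctuations die out, mirroring the qualitative behavior described for Fig. 4b.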

### Embryo tomography with GLIM

We obtained 3D GLIM stacks of bovine embryos at different development stages. We used a 63×/1.4 NA oil-immersion objective at a transverse sampling rate of 10.8 pixels µm⁻¹. The condenser aperture was fully opened to NAcon = 0.55 to maximize the depth sectioning and spatial resolution. The embryos were scanned in the axial dimension over an interval of (−120 µm, 120 µm) with a step of Δz = 0.05 µm. Figure 5a and b shows the xy and xz cross-sections of the raw phase gradient, $$\nabla_x\phi$$. As a side note, in comparison with other QPI methods, such as spatial light interference microscopy (SLIM)29, GLIM is superior when imaging optically thick samples like embryos (Supplementary Note 9; Supplementary Fig. 8). The corresponding cross-sections of the GLIM tomogram after filtering are shown in Fig. 5c and d. More details on this filtering step can be found in the “Methods” section. The GLIM tomography reveals various structures of the embryos, including their membranes, internal cells, gaps between the membrane and the cells, and their internal content, such as lipid droplets in each cell, as indicated in Fig. 5e. The xz cross-sections further show the contact between the embryo and the underlying glass substrate (Fig. 5d), along with debris on the substrate.

Figure 5f–h shows the rendering results of three different embryos, consisting of two cells, four cells, and five cells, as indicated. One can see clearly how different cells of the embryo stack with respect to each other in 3D. The membranes of the embryos are manually segmented and displayed as transparent surfaces. Rendering videos of these embryos can be found in Supplementary Movies 5–8.

## Discussion

In summary, we introduced GLIM as a new QPI method for 3D imaging of unlabeled specimens. GLIM has all the benefits of common-path white-light methods, including nanometer path-length stability, speckle-free imaging, and diffraction-limited resolution. At the smallest condenser aperture, GLIM gives exact values of the quantitative phase for thin samples. At the largest condenser aperture, GLIM can be used as a tomography method, allowing us to obtain time-lapse 3D information of thick samples. We demonstrated the success of GLIM on various samples, from beads and HeLa cells to bovine embryos. We believe that this method will set an excellent foundation for other research projects and applications.

As a label-free method, GLIM can be applied to imaging live cells and thick samples nondestructively over broad temporal and spatial scales. This technique is not limited by the photobleaching and phototoxicity commonly associated with fluorescence microscopy. Also, it provides excellent optical sectioning and obtains 3D information from unlabeled specimens. However, similar to other label-free imaging methods, GLIM lacks specificity. Therefore, we envision that GLIM and fluorescence techniques will co-exist and combine the advantages of specificity and noninvasiveness. This is completely feasible since GLIM operates on the same optical path as the fluorescence channels, allowing a seamless transition between the two modalities.

## Methods

### GLIM optical setup

The GLIM add-on module is mounted to the output camera port of a conventional DIC microscope. The measurements used in Fig. 1b were acquired with an Olympus IX70 microscope equipped with a 20×/0.65 NA objective. Subsequent measurements were conducted using an Axio Observer Z1 microscope with incubation system (Zeiss) under 40×/0.75 NA (420361-9910-000) and 63×/1.4 NA (420781-9910-0000) objectives. The GLIM module contains a polarizer before the camera, and the SLM (Meadowlark) is positioned to match the DIC shear angle (45°). To ensure optimal modulation, the SLM was illuminated with filtered white light using a fluorescence emission filter with a central wavelength of $$\lambda_{\rm o}$$ = 624 nm and a bandwidth of Δλ = 43 nm.

### Phase-gradient extraction from intensity images

The intensity image at modulation $$\phi_n$$ is given by

$${I_n}\left( {\bf{r}} \right) = I\left( {\bf{r}} \right) + I\left( {{\bf{r}} + \delta {\bf{r}}} \right) + 2\left| {\gamma \left( {{\bf{r}},\delta {\bf{r}}} \right)} \right|\cos \left[ {\Delta \phi \left( {\bf{r}} \right) + {\phi _n}} \right],$$
(8)

where $$\Delta\phi({\bf r}) = \phi({\bf r}+\delta{\bf r}) - \phi({\bf r}) \approx [\nabla_x\phi({\bf r})]\,\delta r$$ is the phase difference of interest and $$\nabla_x\phi$$ is the gradient of the phase in the direction of the shift. The spatial shift δr is the transverse displacement introduced by the DIC prism, estimated experimentally from measurements of the test samples. The quantity $$\gamma({\bf r},\delta{\bf r})$$ is the mutual intensity, in other words, the temporal cross-correlation function between these two fields, evaluated at zero delay, $$\gamma({\bf r},\delta{\bf r}) = \langle U^*({\bf r})\,U({\bf r}+\delta{\bf r})\rangle_t$$. Combining the four intensity frames, we obtain the phase gradient as

$$\nabla_x \phi \left( {\bf{r}} \right) = \arg \left\{ {\left[ {{I_0}\left( {\bf{r}} \right) - {I_2}\left( {\bf{r}} \right)} \right] + i\left[ {{I_3}\left( {\bf{r}} \right) - {I_1}\left( {\bf{r}} \right)} \right]} \right\}/\delta r.$$
(9)
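The four-frame combination can be checked numerically. The sketch below assumes the frames are indexed n = 0, 1, 2, 3 with $$\phi_n = n\pi/2$$, as in Eq. (8), so that $$I_0 - I_2 = 4|\gamma|\cos\Delta\phi$$ and $$I_3 - I_1 = 4|\gamma|\sin\Delta\phi$$. The function name and the synthetic data are illustrative and are not taken from the authors' software.

```python
import numpy as np

def glim_phase_difference(frames):
    """Recover the phase difference and |gamma| from four phase-shifted frames.

    frames : array of shape (4, H, W); frames[n] is recorded with the SLM
             phase offset phi_n = n * pi / 2.
    """
    i0, i1, i2, i3 = np.asarray(frames, dtype=float)
    cos_term = i0 - i2            # 4 |gamma| cos(delta_phi)
    sin_term = i3 - i1            # 4 |gamma| sin(delta_phi)
    delta_phi = np.arctan2(sin_term, cos_term)
    gamma_mag = 0.25 * np.hypot(sin_term, cos_term)
    return delta_phi, gamma_mag

# Synthetic frames generated from Eq. (8) with a known phase difference.
rng = np.random.default_rng(0)
true_dphi = rng.uniform(-1.0, 1.0, size=(8, 8))
background = 3.0                  # I(r) + I(r + dr), taken constant here
true_gamma = 1.2
frames = np.stack([background + 2 * true_gamma * np.cos(true_dphi + n * np.pi / 2)
                   for n in range(4)])
dphi, gamma_mag = glim_phase_difference(frames)
```

Dividing `dphi` by the shear δr then gives the phase gradient along the shear direction. The incoherent background cancels in both differences, which is the property the method relies on for thick specimens.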

### DIC shear estimation

The lateral offset between the two DIC beams, δr, relates the phase difference image, Δϕ, linearly to a quantitative phase gradient, $$\nabla_x\phi$$, via the relation $$\Delta\phi = (\nabla_x\phi)\,\delta r$$. Although this parameter is known to the microscope manufacturer (Zeiss, Olympus, etc.), to the best of our knowledge, it is not publicly listed. To estimate the spacing between the two beams, we matched the associated phase shift to a known calibration sample. In our procedure, we acquired a fine tomographic stack of a small object (300 nm bead), and performed line integration in the direction of the DIC gradient. As expected, the phase was always maximized at the plane of best focus. The peak of the integrated phase corresponds to the theoretical maximum phase shift due to the control structure, namely $$\varphi^* = 2\pi d\,\Delta n/\lambda_{\rm o}$$, where Δn is the refractive index mismatch between the sample and the background, $$\lambda_{\rm o}$$ is the central wavelength, and d is the diameter of the bead. However, given a pixel dimension of p (µm), we also have another relationship $${\varphi ^*} = {\int}_{{x_{\rm{o}}}}^{{x^*}} {{\nabla _x}\varphi {\rm{d}}x} \approx {\int}_{{x_{\rm{o}}}}^{{x^*}} {\Delta \varphi {\rm{d}}x/\delta r} \approx p\mathop {\sum}\nolimits_{k = {k_{\rm{o}}}}^{{k^*}} {\Delta \varphi \left[ k \right]/\delta r} ,$$ where $$x_{\rm o}$$ and $$x^*$$ are the locations of the background and of a point inside the control structure, respectively. These locations correspond to pixel indices $$k_{\rm o}$$ and $$k^*$$, respectively. Combining these two relations yields

$$\delta r = \frac{{p{\lambda _{\rm{o}}}\mathop {\sum}\limits_{k = {k_{\rm{o}}}}^{{k^*}} {\Delta \varphi \left[ k \right]} }}{{2\pi d\Delta n}}.$$
(10)

Using this formula, we estimated that the DIC prism shifts are 175 nm for the 63×/1.4 NA slider and 345 nm for the 40×/0.75 NA slider. These values are in good agreement with values reported in the literature66.
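Equation (10) can be exercised on a synthetic line profile to confirm that it returns the shear used to generate the data. In this sketch the function name, the Gaussian-shaped phase profile, and the refractive index mismatch of 0.072 are assumptions made for illustration; only the 300 nm bead diameter, the 624 nm wavelength, the 10.8 pixels µm⁻¹ sampling, and the 175 nm shear come from the text.

```python
import numpy as np

def estimate_shear(delta_phi_line, pixel_size_um, wavelength_um,
                   bead_diameter_um, delta_n):
    """Eq. (10): shear from phase differences summed between a background
    pixel k_o and a pixel k* inside the calibration bead."""
    total = np.sum(delta_phi_line)
    return (pixel_size_um * wavelength_um * total
            / (2 * np.pi * bead_diameter_um * delta_n))

# Synthetic calibration line (illustrative values).
p = 1 / 10.8                      # pixel size in um
wavelength = 0.624                # um
d, dn = 0.3, 0.072                # bead diameter (um); dn is hypothetical
true_shear = 0.175                # um

phi_peak = 2 * np.pi * d * dn / wavelength
x = np.arange(60) * p
# Phase rising from ~0 at the background to phi_peak at the last sample.
profile = phi_peak * np.exp(-((x - x[-1]) ** 2) / (2 * 0.15 ** 2))
# Measured phase difference = gradient times shear.
delta_phi = np.diff(profile) / p * true_shear
shear_est = estimate_shear(delta_phi, p, wavelength, d, dn)
```

The estimate depends only on the total phase step between the two end points, so it is insensitive to the detailed shape of the profile in between.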

### 2D real-time interferometric reconstruction

The 2D image formation model in GLIM is shown in Supplementary Note 3, where we relate the measured phase difference Δϕ with the sample transmission, T. To fully automate the data acquisition for Δϕ, we developed a software platform capable of mechanical automation and real-time phase retrieval. Our image acquisition platform is designed to overlap the GLIM computation with the operation of the camera, SLM, and microscope. The software is developed in C++ using the Qt framework. The real-time reconstruction runs on three threads with the first thread responsible for triggering new camera frames and modulating the SLM. The second thread receives incoming images and transfers them to the graphics card. The third thread is used to display the GUI and render the resulting phase maps (Supplementary Fig. 2a). As we decouple the triggering thread from the data transfer and computation threads, variability in camera transfer rates does not slow down the acquisition.
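The decoupling of acquisition from computation described above can be illustrated with a small producer-consumer pipeline. The authors' software is written in C++ with Qt and uses the graphics card; the Python sketch below is only a stand-in for the threading pattern, with synthetic frames, hypothetical function names, and the main thread playing the role of the display thread.

```python
import queue
import threading
import numpy as np

def acquire(n_groups, out_q):
    """Stand-in for the trigger thread: emits groups of four phase-shifted
    frames (synthetic here) and then a sentinel."""
    rng = np.random.default_rng(0)
    for _ in range(n_groups):
        dphi = rng.uniform(-1.0, 1.0, size=(4, 4))
        frames = np.stack([2.0 + np.cos(dphi + n * np.pi / 2) for n in range(4)])
        out_q.put((dphi, frames))
    out_q.put(None)

def reconstruct(in_q, out_q):
    """Stand-in for the transfer/compute thread: four frames -> phase map."""
    while True:
        item = in_q.get()
        if item is None:
            out_q.put(None)
            break
        truth, f = item
        out_q.put((truth, np.arctan2(f[3] - f[1], f[0] - f[2])))

raw_q = queue.Queue(maxsize=8)    # bounded buffer between the two workers
result_q = queue.Queue()
workers = [threading.Thread(target=acquire, args=(5, raw_q)),
           threading.Thread(target=reconstruct, args=(raw_q, result_q))]
for w in workers:
    w.start()

# The main thread plays the role of the display thread.
results = []
while (item := result_q.get()) is not None:
    results.append(item)
for w in workers:
    w.join()
```

The bounded queue lets the acquisition side run ahead of the computation by a few frame groups, so short stalls downstream do not immediately block triggering.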

To remove the phase slant found in commercial DIC microscopes (see ref. 42), we perform Fourier bandpass filtering with the filter function

$$F\left( {{{\bf{k}}_ \bot }} \right) = \left\{ {1 - \exp \left[ { - k_ \bot ^2/\left( {2\rho _{hi}^2} \right)} \right]} \right\}\exp \left[ { - k_ \bot ^2/\left( {2\rho _{lo}^2} \right)} \right],$$
(11)

where ρ hi and ρ lo define the bandwidths of the high-pass and low-pass filtering operations, respectively. Their values depend on the system magnification. This operation eliminates the slowly varying oscillation in the GLIM images; the results are shown in Supplementary Fig. 2b. To improve the signal-to-noise ratio (SNR) of the reconstruction, we match the mean intensities of the four frames by proportionally adjusting the exposure time. For example, the exposure time is eight times longer for the extinction frame (π modulation) than for the maximum-brightness frame (0π modulation). The longer exposure is later compensated numerically when extracting the phase. This procedure increases the SNR by reducing the phase noise deviation (Supplementary Fig. 2c).
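A minimal NumPy sketch of the filter in Eq. (11) is given below. The function name `glim_bandpass`, the pixel-based frequency units, and the test values of the two bandwidths are assumptions for illustration; the values used in practice depend on the magnification, as noted above.

```python
import numpy as np

def glim_bandpass(image, rho_hi, rho_lo, pixel_size=1.0):
    """Apply F(k) = {1 - exp[-k^2/(2 rho_hi^2)]} * exp[-k^2/(2 rho_lo^2)] to a 2D image."""
    ny, nx = image.shape
    # Angular spatial frequencies (rad per unit length) on the FFT grid.
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=pixel_size)
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=pixel_size)
    k_sq = kx[None, :] ** 2 + ky[:, None] ** 2
    high_pass = 1.0 - np.exp(-k_sq / (2.0 * rho_hi**2))
    low_pass = np.exp(-k_sq / (2.0 * rho_lo**2))
    filtered = np.fft.ifft2(np.fft.fft2(image) * high_pass * low_pass)
    return filtered.real  # the filter is real and even, so the output is real

# Illustrative input: constant offset plus a mid-frequency cosine along x.
x = np.arange(64)
test_image = 3.0 + np.tile(np.cos(2.0 * np.pi * 8 * x / 64), (64, 1))
filtered_image = glim_bandpass(test_image, rho_hi=0.1, rho_lo=3.0)
print(filtered_image.mean(), filtered_image.max())
```

Since F vanishes at zero frequency, the constant offset is removed entirely, while the cosine lies inside the passband and is only slightly attenuated.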

Our GLIM system operates at 10 phase images per second with a rendering rate of 40 frames per second. As the computational portion is overlapped with acquisition, the rate-limiting factor in our system is the exposure time. Thus, a longer exposure can be replaced by illumination with a brighter source. After acquiring individual images from different FOVs, we combine them into a large mosaic to study large-scale dynamics. See Supplementary Note 4 and Supplementary Fig. 3 for implementation details on the image alignment and registration algorithm.

### Filtering method to improve depth sectioning of GLIM

To improve the optical sectioning and extend GLIM to 3D imaging, we removed the low-frequency (out-of-focus) components from our data using a high-pass filter. The steps of our method are summarized in Supplementary Fig. 9. First, we removed the DIC shading artifact using Wiener deconvolution67. The 3D point spread function of the system is given as $$h\left( {\bf{r}} \right) = {{\rm{Im}}} \left[ {\left( {{\mu _i}{g^*}} \right)\left( {\bf{r}} \right) - \left( {{\mu _i}{g^*}} \right)\left( {{\bf{r}} - {{\hat {\bf x}}}\delta x} \right)} \right],$$ (Supplementary Note 10; Supplementary Fig. 10), and the corresponding transfer function is $$\tilde h\left( {\bf{k}} \right) = 2i\sin \left( {{k_x}\delta x/2} \right)F\,\left\{ {{{\rm{Im}}} \left( {{\mu _i}{g^*}} \right)} \right\}\left( {\bf{k}} \right)$$. As a side note, a measurement of the point spread function using a microbead can be found in Supplementary Fig. 11. The Wiener-deconvolved susceptibility is obtained in the frequency domain as

$${\tilde \chi _{{\rm{weiner}}}}\left( {\bf{k}} \right) = \frac{{ - 2i\sin \left( {{k_x}\delta x/2} \right)\delta x}}{{\beta _{\rm{o}}^2\left[ {4{{\sin }^2}\left( {{k_x}\delta x/2} \right) + \varepsilon } \right]}}{\rm F}\,\left[ {{\nabla _x}\phi } \right]\left( {\bf{k}} \right),$$
(12)

where ε is a small number, set to 10⁻⁴ to avoid amplifying frequency components with low SNR. To further improve the axial resolution, the low-frequency components in χ weiner(r) must be strongly suppressed. We achieve this by applying high-pass filtering in the xy plane to each recorded z-image. In each dimension (x and y), we apply a convolution with a finite impulse response (FIR) filter, chosen as h hp (x) = (0.25, −0.25, 0, −0.25, 0.25). The result of this high-pass filtering, χ weiner, hp(r) (Supplementary Fig. 9b), has most of the low transverse frequencies suppressed and, as a result, yields very good depth sectioning. Note that this high-pass filtering step can be combined with the Wiener deconvolution step since both are linear operators. Moreover, our method requires no processing along z, which allows the processing to be interleaved with image acquisition. After filtering, we applied a log-compression transform to increase the contrast of the retained high-frequency components in the output image, and normalized the result so that the largest signal is 0 dB. To reject the background noise, we keep only signals with amplitude larger than −100 dB. Finally, to smooth the image and remove high-frequency oscillations, we apply bilateral filtering68 to the transformed results to obtain χ weiner, bf(r ,z) (Supplementary Fig. 9c). The bilateral filtering is performed in two 2D passes. In the first pass, it is applied to each 2D Wiener-deconvolved image χ weiner (r ,z) at each value of z. In the second pass, it is applied to the stacked result of the first pass, on each 2D image at a fixed lateral coordinate x. Because the two passes are similar, we describe only the first. For each value of z, the bilateral result χ weiner,bf (r ,z) is obtained from χ weiner (r ,z) using

$${\chi _{{\rm{weiner}},{\rm{bf}}}}\left( {{{\bf{r}}_ \bot },z} \right) = \frac{{{\int\!\!\!\int} {{\rm d}^2}{{\bf{r}}'_ \bot }\,{c_{\rm{r}}}\left( {{{\bf{r}}_ \bot },{{\bf{r}}'_ \bot }} \right){c_{\rm{s}}}\left[ {{\chi _{{\rm{weiner}}}}\left( {{{\bf{r}}_ \bot },z} \right),{\chi _{{\rm{weiner}}}}\left( {{{\bf{r}}'_ \bot },z} \right)} \right]{\chi _{{\rm{weiner}}}}\left( {{{\bf{r}}'_ \bot },z} \right)}}{{{\int\!\!\!\int} {{\rm d}^2}{{\bf{r}}''_ \bot }\,{c_{\rm{r}}}\left( {{{\bf{r}}_ \bot },{{\bf{r}}''_ \bot }} \right){c_{\rm{s}}}\left[ {{\chi _{{\rm{weiner}}}}\left( {{{\bf{r}}_ \bot },z} \right),{\chi _{{\rm{weiner}}}}\left( {{{\bf{r}}''_ \bot },z} \right)} \right]}},$$
(13)

where radially symmetric Gaussian functions were used for the closeness function $${c_{\rm{r}}}\left( {{{\bf{r}}_ \bot },{{{\bf{r}}'}_ \bot }} \right) = \exp \left[ { - 0.5\left\| {{{\bf{r}}_ \bot } - {{{\bf{r}}'}_ \bot }} \right\|_2^2/\sigma _{\rm{r}}^2} \right]$$ and the similarity function $${c_{\rm{s}}}\left( {\chi ,\chi '} \right) = \exp \left[ { - 0.5\left\| {\chi - \chi '} \right\|_2^2/\sigma _{\rm{s}}^2} \right].$$ Here, $${\left\| . \right\|_2}$$ denotes the ℓ2-norm. The coefficients σ r and σ s determine the amount of filtering. In our case, they are set to σ r = 2.2 µm and σ s = 3% of the maximum value of the input data χ weiner (r ,z), respectively. The output of the post-processing has better depth sectioning than the input image, and structures and materials that are not visible in the raw input become apparent in the output.
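The post-processing steps of this section can be sketched in NumPy as follows. This is an illustrative sketch under several assumptions, not the implementation used here: the function names are hypothetical; β o is passed in as a plain parameter; the FIR filter is applied along x and then along y; the dB scale is taken as 20 log₁₀ of the amplitude relative to the peak; and the bilateral filter is evaluated on a finite window with edge padding, covering only the first (xy) pass, with the filtered image itself used as the input on the right-hand side of Eq. (13).

```python
import numpy as np

H_HP = np.array([0.25, -0.25, 0.0, -0.25, 0.25])  # FIR taps quoted in the text

def wiener_deconvolve_x(grad_phi, shear, beta_o, pixel_size, eps=1e-4):
    """Eq. (12). The gain depends only on k_x, so a 1D FFT along x (last axis) suffices."""
    kx = 2.0 * np.pi * np.fft.fftfreq(grad_phi.shape[-1], d=pixel_size)
    s = np.sin(kx * shear / 2.0)
    gain = -2j * s * shear / (beta_o**2 * (4.0 * s**2 + eps))
    chi = np.fft.ifft(gain * np.fft.fft(grad_phi, axis=-1), axis=-1)
    return chi.real  # input is real; any imaginary residue (e.g. at Nyquist) is dropped

def highpass_xy(img):
    """Convolve with H_HP along x, then along y (the sequential order is an assumption)."""
    conv = lambda v: np.convolve(v, H_HP, mode="same")
    return np.apply_along_axis(conv, 0, np.apply_along_axis(conv, 1, img))

def log_compress(img, floor_db=-100.0):
    """Amplitude in dB relative to the peak, clipped at floor_db (assumes a non-zero image)."""
    amp = np.abs(img)
    ratio = amp / amp.max()
    return np.maximum(20.0 * np.log10(np.maximum(ratio, 1e-12)), floor_db)

def bilateral_2d(img, sigma_r_px, sigma_s, radius):
    """One xy pass of Eq. (13) on a (2*radius+1)^2 window with edge padding."""
    img = np.asarray(img, dtype=float)
    ny, nx = img.shape
    padded = np.pad(img, radius, mode="edge")
    num = np.zeros_like(img)
    den = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy:radius + dy + ny, radius + dx:radius + dx + nx]
            closeness = np.exp(-0.5 * (dx * dx + dy * dy) / sigma_r_px**2)
            similarity = np.exp(-0.5 * (img - shifted) ** 2 / sigma_s**2)
            num += closeness * similarity * shifted
            den += closeness * similarity
    return num / den  # den >= 1 because the central tap always has unit weight
```

In such a sketch, `sigma_r_px` would be 2.2 µm divided by the pixel size, and `sigma_s` would be 3% of the maximum of the image passed to `bilateral_2d`. The second pass would apply the same function to each slice at a fixed lateral coordinate x of the stacked first-pass output.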

### Data availability

The data that support the findings in this paper are available upon request.