Introduction

Improving the imaging depth of high-resolution optical microscopy has been a long-standing goal in the field of bioimaging due to its potential impact on biological studies and optical diagnostics1. For ideal diffraction-limited imaging, the main strategy is to detect the so-called ballistic wave that propagates straight through a scattering medium and carries intact object information. However, this ballistic wave is quickly obscured by multiply scattered waves even at a shallow depth as its intensity decays exponentially with distance traveled in a scattering medium due to multiple light scattering. To extend the imaging depth, the prevailing approach so far has been to filter out the multiply scattered waves by applying various gating operations, such as confocal2,3, time (or coherence)4,5,6,7, and polarization gating8,9. For example, optical coherence tomography, one of the most successful biomedical imaging modalities, greatly extends imaging depth by combining all these gating operations7,10,11. Similarly, spatial correlation within a time-gated transmission or reflection matrix has recently been used to selectively extract image information12,13. Furthermore, various adaptive optics approaches have been proposed to maintain the effectiveness of gating operations in spite of sample-induced aberration14,15,16,17.

Even with these substantial advances, the imaging depth of high-resolution optical microscopy has not yet reached the detection limit set by the dynamic range of state-of-the-art sensor technology. The ballistic wave is, in principle, detectable even at depths >15 ls in an epi-detection geometry (where ls is the scattering mean free path of the scattering medium) if an image sensor of high dynamic range (e.g., 1:10^4) is used in conjunction with interferometric detection converting an intensity recording into an electric field measurement10,12,13,18,19. Currently, the imaging depth limit is instead set by the competition between the ballistic wave and the multiply scattered wave that bypasses the existing gating operations. The residual multiply scattered wave can be significantly stronger than the ballistic wave well before reaching the detection limit10,19,20,21. For instance, the chance that a multiply scattered wave has a similar flight time to a ballistic wave and passes through a time gating of finite width increases with imaging depth. Likewise, a large fraction of a multiply scattered wave can pass through a confocal pinhole under conditions of extreme turbidity, and thereby be mistakenly considered as a ballistic wave. In fact, these imperfections of the existing gating methods are partly due to their action being at a detection plane, which is located outside the scattering medium. To reach the detection limit, it is critical to develop an additional gating method whose mechanism is independent of the existing methods and yet effective enough to complement them.

Here, we propose a new gating scheme called space gating. Based on the interferometric detection scheme of previous acousto-optic imaging techniques22,23,24,25,26,27,28, we implement the space gating by selectively measuring the ballistic wave that is modulated by a high-frequency ultrasound focus as small as ~30 µm × 70 µm in size. Unlike confocal or time gating, space gating is directly applied at the object plane inside the scattering medium to reject the multiply scattered wave whose optical path spreads beyond the extent of the ultrasound focus. Therefore, it can remove the multiply scattered wave, which cannot be filtered out by the existing gating operations. Integrating the space gating into the coherent confocal microscopy, we demonstrate imaging of amplitude objects through scattering layers thicker than 23ls with the optical diffraction-limited resolution of 1.5 µm. Furthermore, by combining the noise rejection capability of space gating with the advantage of coherent treatment of the ballistic wave, we demonstrate the quantitative phase imaging of biological cells fully embedded within a scattering medium. Lastly, we examine the effectiveness of space gating in imaging skeletal muscle structures of an unstained zebrafish across its entire body. The proposed concept of space gating is an independent and complementary addition to the existing gating operations. It represents an important step toward reaching the fundamental depth limit of diffraction-limited imaging relying on ballistic waves, and opens new possibilities for label-free imaging of biological cells through scattering tissues.

Results

Principles

The concept of space gating combined with confocal gating is illustrated in Fig. 1a. To implement the confocal gating, we illuminated an object plane with a focused laser beam and detected the transmitted field at the position rd conjugate to the illumination point ri. As shown in Fig. 1a, ri and rd are the illumination and detection points defined on the planes conjugate to the object plane. We measured the signal only at the sensor pixels (marked with a blue square) conjugate to the focused illumination. This ensures that only the ballistic wave (indicated as green lines in Fig. 1a), which carries the optical diffraction-limited image, contributes to the measured field in the absence of scattering. This scheme is equivalent to the conventional confocal gating, where a physical pinhole is used.

Fig. 1: Principle of space gating.

a Schematic of the imaging principle. Conventional confocal imaging method relies on the ballistic waves shown as green lines. When optical inhomogeneity is introduced, the intensity of the ballistic wave exponentially decreases with depth, and the multiply scattered wave (shown as solid blue and dotted blue lines) may obscure the ballistic wave. By implementing space gating at the object plane using an acousto-optic effect (indicated as a red spot), the multiply scattered wave that travels outside the acoustic focus (dotted blue lines) can be rejected, which in turn improves the ratio of the ballistic wave to the multiply scattered wave at the sensor element (marked as a blue pixel), whose position is conjugate to the illumination point rd ~ ri. b Intensity maps of illumination and detection transfer functions in a confocal gating scheme (where rd ~ ri), with respect to ro on the object plane for a transparent medium. c Same as b, but in the presence of scattering. The optical thicknesses on the illumination and detection sides were Li/ls = 10.6 and Ld/ls = 12.8, respectively. d Contribution map, |Ti(ro; ri)Td(ro; rd)|2, with respect to ro calculated from the transfer functions in b. e Same as d, but in the presence of scattering, calculated from c. Scale bar: 100 μm.

In the presence of scattering, the transmitted field E(rd; ri) measured at the detection plane is composed of two components: ballistic signal field ES(rd; ri) and multiply scattered noise field EM(rd; ri) (i.e., E(rd; ri) = ES(rd; ri) + EM(rd; ri)). In deep tissue imaging, the multiply scattered wave often obscures the ballistic wave and limits the imaging depth of diffraction-limited imaging. The space gating aims to selectively suppress the multiply scattered wave based on the fact that it is spatially spread over a wide extent on the object plane (as indicated by the blue lines in Fig. 1a), in contrast to the ballistic wave, which is tightly confined at the confocal point. The space gating is implemented by setting a spatial window RSG (indicated by the red spot in Fig. 1a) around the confocal point on the object plane in such a way that only the wave transmitted through the gating window contributes to the detected field. This operation selectively rejects the multiply scattered wave traveling outside the spatial window (indicated by the blue dotted lines in Fig. 1a), while leaving the ballistic wave unaffected.

The effect of the space gating can be quantitatively understood by the transfer functions Ti(ro; ri) and Td(ro; rd) describing the optical propagation through the illumination and detection parts of the scattering medium, respectively:

$$T_{\mathrm{i}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{i}}} \right) = S\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{i}}} \right)e^{ - \frac{{L_{\mathrm{i}}}}{{2l_{\mathrm{s}}}}} + M_{\mathrm{i}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{i}}} \right)$$
(1)
$$T_{\mathrm{d}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{d}}} \right) = S\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{d}}} \right)e^{ - \frac{{L_{\mathrm{d}}}}{{2l_{\mathrm{s}}}}} + M_{\mathrm{d}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{d}}} \right).$$
(2)

The subscripts i and d indicate the illumination and detection parts of the sample, as indicated in Fig. 1a. Ti(ro; ri) is the complex field amplitude at ro on the object plane for the illumination of a unity amplitude field originated from ri, and Td(ro; rd) is defined likewise for the detection part. L and ls are the thickness and scattering mean free path of the sample, respectively, and S(ro; ri) and S(ro; rd) denote the transfer functions of ballistic waves, which are the intrinsic point spread functions (PSFs) of the optical system (Fig. 1b). For simplicity, we assume unity magnification from the planes of ri and rd to the object plane. Mi(ro; ri) and Md(ro; rd) denote the transfer functions of multiply scattered waves, which extend over a wide area on the object plane (Fig. 1c).

The transmitted field E(rd; ri) without the space gating can be described with the two transfer functions:

$$E\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right) = \mathop {\int}\nolimits_{\!\!\!\!R} {T_{\mathrm{i}}} \left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{i}}} \right)T_{\mathrm{d}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{d}}} \right)d{\mathbf{r}}_{\mathrm{o}},$$
(3)

where R covers the entire object plane. This equation is subject to the assumption that the multiple scattering between the illumination and detection parts of the scattering medium is negligible, which is largely the case for the highly anisotropic scattering medium. Note that the multiple scattering within the illumination and detection parts of the scattering medium is already accounted for in Ti(ro; ri) and Td(ro; rd). By inserting Eqs. (1) and (2) into Eq. (3), we obtain the signal field and the noise field as follows:

$$E_{\mathrm{S}}\left({{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}}\right) = \mathop{\int}\nolimits_{\!\!\!\!R} {S\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{i}}}\right)S\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{d}}} \right)e^{ - \frac{{\left( {L_{\mathrm{i}} + L_{\mathrm{d}}} \right)}}{{2l_{\mathrm{s}}}}}d{\mathbf{r}}_{\mathrm{o}}}$$
(4)
$$E_{\mathrm{M}}\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right) = \,\mathop {\int }\nolimits_{\!\!\!\!R}\left[ S\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{i}}} \right)e^{ - \frac{{L_{\mathrm{i}}}}{{2l_{\mathrm{s}}}}}M_{\mathrm{d}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{d}}} \right) + S\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{d}}} \right)e^{ - \frac{{L_{\mathrm{d}}}}{{2l_{\mathrm{s}}}}}M_{\mathrm{i}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{i}}} \right) \right. \\ \qquad \left. + \, M_{\mathrm{i}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{i}}} \right)M_{\mathrm{d}}\left( {{\mathbf{r}}_{\mathrm{o}};{\mathbf{r}}_{\mathrm{d}}} \right)\vphantom{{\sum}_{x}} \right] d{\mathbf{r}}_{\mathrm{o}}.$$
(5)

See Supplementary Note 1 for the discussion on the relative magnitude among the signal field and the three terms in the noise field. The multiplication of S(ro; ri) and S(ro; rd) in the signal field of Eq. (4) describes the confocal action. The multiplication of the two transfer functions Ti(ro; ri)Td(ro; rd), shown in Fig. 1d, e, describes how much each point ro on the object plane contributes to the light propagation from the illumination point ri to the detection point rd. Mathematically, the space gating sets the spatial window RSG around the confocal point, as indicated by the white dotted lines in Fig. 1d, e. Therefore, the measured field with the space gating \(E^{{\mathrm{SG}}}\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right) = E_{\mathrm{S}}^{{\mathrm{SG}}}\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right) + E_{\mathrm{M}}^{{\mathrm{SG}}}\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right)\) can be derived by Eqs. (3), (4), and (5) after reducing the integration range from R to RSG. Because the signal field of Eq. (4) and the first two terms in the noise field of Eq. (5) involve the ballistic propagation confined to the confocal point, only the Mi(ro; ri)Md(ro; rd) term in the noise field is reduced by the space gating. Considering that this term is dominant in determining the noise field (see Supplementary Note 1 for further explanation), the noise suppression factor η can be estimated as:

$$\eta = \left| {E_{\mathrm{M}}\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right)} \right|^2/\left| {E_{\mathrm{M}}^{{\mathrm{SG}}}\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right)} \right|^2\sim {\mathrm{min}}\left( {w_{{\mathrm{M}}_{\mathrm{i}}},w_{{\mathrm{M}}_{\mathrm{d}}}} \right)^2/w_{{\mathrm{SG}}}^2.$$
(6)

Here, wMi and wMd are the effective widths of |Mi(ro; ri)|2 and |Md(ro; rd)|2, respectively. wSG is the width of RSG set by the acoustic focus size in our experiment. For biological tissues, wMi and wMd typically range from hundreds of microns to millimeters when L/ls ~ 10 (see Supplementary Fig. 1 for the detailed analysis on wMi and wMd). Therefore, we can expect η > 100 if the size of the space gating wSG is as small as tens of microns, as is the case with a high-frequency acoustic focus.
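The scaling in Eq. (6) can be checked with a toy Monte Carlo model. The sketch below is not the authors' analysis: it assumes fully developed speckle with one independent complex Gaussian sample per speckle grain, a flat envelope over a square grid standing in for the spread of the multiply scattered wave, and a square gate at the center. The function names and grid sizes are hypothetical. It sums only the Mi·Md term of Eq. (5), once over the full region and once over the gate, and compares the mean intensities.

```python
import random


def complex_gaussian(rng):
    """One circular complex Gaussian sample (a single speckle grain)."""
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def simulate_eta(grid=30, gate=3, realizations=200, seed=0):
    """Estimate <|E_M|^2> / <|E_M^SG|^2> for the Mi*Md term only.

    grid: side of the square region R, in speckle grains (stands in for w_M).
    gate: side of the centered square window R_SG, in grains (stands in for w_SG).
    """
    rng = random.Random(seed)
    lo = (grid - gate) // 2
    hi = lo + gate
    sum_full = 0.0
    sum_gate = 0.0
    for _ in range(realizations):
        e_full = 0j
        e_gate = 0j
        for ix in range(grid):
            for iy in range(grid):
                # Product of independent illumination- and detection-side speckle
                term = complex_gaussian(rng) * complex_gaussian(rng)
                e_full += term
                if lo <= ix < hi and lo <= iy < hi:
                    e_gate += term
        sum_full += abs(e_full) ** 2
        sum_gate += abs(e_gate) ** 2
    return sum_full / sum_gate


eta_toy = simulate_eta()
eta_eq6 = (30 / 3) ** 2  # (w_M / w_SG)^2 from Eq. (6)
print(f"simulated eta ~ {eta_toy:.0f}, Eq. (6) scaling = {eta_eq6:.0f}")
```

Because the summed terms are uncorrelated, the mean intensity grows with the number of grains inside the integration region, which is why the ratio follows the area ratio (w_M/w_SG)^2 in two dimensions. The simulated value fluctuates around that scaling from run to run.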

To see the effect of space gating on the imaging depth of the optical diffraction-limited imaging, we define the imaging fidelity by the contrast of ballistic wave: τ = |ES(rd; ri)|2/|EM(rd; ri)|2 without space gating, and \(\tau ^{{\mathrm{SG}}} = \left| {E_{\mathrm{S}}^{{\mathrm{SG}}}\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right)} \right|^2/\left| {E_{\mathrm{M}}^{\mathrm{SG}}\left( {{\mathbf{r}}_{\mathrm{d}};{\mathbf{r}}_{\mathrm{i}}} \right)} \right|^2\) with space gating. When the imaging fidelity is sufficiently greater than 1, the ideal optical diffraction-limited imaging is achieved as the detected field is mostly comprised of the ballistic wave. When increasing the imaging depth, the spatial resolution remains close to the optical diffraction limit of the confocal imaging system, while the contrast of the ballistic wave is reduced due to the exponential decay of the ballistic wave. The space gating improves the imaging fidelity by a factor of η, i.e., τSG = η × τ. Considering the exponential decay of the intensity of ballistic wave, the imaging depth increases logarithmically with η. More specifically, the noise suppression effect can compensate for the additional decay of the ballistic wave caused by the increased imaging depth, i.e., \(\eta \times e^{ - \Delta L/l_s} = 1\), where ΔL is the gain in the imaging depth by the space gating. Therefore, η is translated into \({\mathrm{\Delta }}L = l_{\mathrm{s}} \times \ln \eta\), where the logarithm is the natural logarithm. For η > 100, we can expect a gain in imaging depth ΔL of >4.6ls. We provide further analysis on the relation of the imaging depth to the size of the acoustic focus and the optical wavelength in the Supplementary Note 2.
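The relation between η and the depth gain can be evaluated directly. The short sketch below solves η × exp(−ΔL/ls) = 1 for ΔL/ls with the natural logarithm. The function name is hypothetical, and the η values are illustrative points taken from the range discussed in this paper.

```python
import math


def depth_gain_in_mean_free_paths(eta):
    """Return Delta L / l_s = ln(eta), from eta * exp(-Delta L / l_s) = 1."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    return math.log(eta)


# Illustrative noise suppression factors
gains = {eta: depth_gain_in_mean_free_paths(eta) for eta in (10, 100, 150, 240)}
for eta, gain in gains.items():
    print(f"eta = {eta:>3d}: depth gain = {gain:.2f} l_s")
```

Because the dependence is logarithmic, each tenfold increase in η adds only about 2.3 scattering mean free paths of imaging depth. A noise suppression factor of 100 corresponds to roughly 4.6 ls.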

Confocal imaging setup with acousto-optic space gating

Figure 2a shows the experimental configuration of the confocal imaging system integrated with a high-frequency acousto-optic space gating (see Methods for details of the setup). Our scheme of space gating is based on an interferometric detection method similar to the previously demonstrated ultrasound-modulated optical tomography24,25,27,28,29,30. When light propagates through the oscillating pressure field at the acoustic focus, a fraction of the light is frequency-shifted by the ultrasound frequency fUS. The reference beam, at the frequency f0 + fUS, then forms a static interference pattern with the modulated wave. The complex field of the modulated wave is then selectively measured using four-step phase-shifting interferometry31. See Supplementary Figs. 2 and 3 for the detailed experimental setup and the electrical signal flow for the acousto-optic measurements, respectively.
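The field recovery step can be illustrated for a single sensor pixel. The sketch below assumes the textbook form of four-step phase shifting, with reference phase steps of 0, π/2, π, and 3π/2 and a known reference amplitude. The step values, function names, and numbers are assumptions for illustration, not details taken from the Methods. Unmodulated light does not interfere statically with the frequency-shifted reference, so it is modeled here as a constant intensity offset that cancels in the frame differences.

```python
import cmath
import math

# Assumed reference phase steps for the four frames
PHASE_STEPS = (0.0, math.pi / 2, math.pi, 3 * math.pi / 2)


def simulate_frames(field, ref_amplitude, background=0.0):
    """Intensity at one pixel for each reference phase step.

    background models unmodulated light as a phase-independent offset.
    """
    return [
        abs(field + ref_amplitude * cmath.exp(1j * phi)) ** 2 + background
        for phi in PHASE_STEPS
    ]


def recover_field(frames, ref_amplitude):
    """Recover the complex modulated field from four phase-shifted frames.

    With I_k = |E|^2 + R^2 + 2 R |E| cos(theta - phi_k) + background:
    I_0 - I_2 = 4 R Re(E) and I_1 - I_3 = 4 R Im(E).
    """
    i0, i1, i2, i3 = frames
    return complex(i0 - i2, i1 - i3) / (4.0 * ref_amplitude)


true_field = 0.3 * cmath.exp(1j * 1.2)  # hypothetical modulated field
frames = simulate_frames(true_field, ref_amplitude=2.0, background=5.0)
recovered = recover_field(frames, ref_amplitude=2.0)
print(abs(recovered), cmath.phase(recovered))
```

The subtraction of opposite frames removes every term that does not depend on the reference phase, which is why only the wave tagged at the acoustic focus survives in the recovered field.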

Fig. 2: Confocal imaging setup with acousto-optic space gating.

a Focused acoustic beam modulates the frequency of the incident focused illumination beam, whose optical frequency is f0. Only the frequency-modulated wave through the region of the space gating (i.e., acoustic focus) is measured at the sensor plane by using a phase-shifting interferometry, where the frequency of the reference beam is set to that of the acoustically modulated optical wave f0 + fUS. b Average intensity map for 900 planar illuminations at different incidence angles through a transparent medium without space gating. The entire object plane contributes to the detected signal. The intensity map is normalized to the mean intensity. c Same as b, but with space gating. With the space gating, only the region inside the gating window (i.e., the acoustic focus) contributes to the detected signal. The intensity map was normalized such that it represents the acoustic modulation efficiency. Scale bar: 30 μm. d–g Point spread functions (PSFs) |E(rd; ri)|2 measured on the detection plane without space gating, when the optical thicknesses of the input and output layers were (0, 0), (6.9, 10.6), (6.9, 12.8), and (10.6, 12.8), respectively. h–k PSFs |ESG(rd; ri)|2 with space gating for the corresponding scattering layers to d–g. PSFs were normalized to their maximum intensities. Scale bar: 5 µm.

We first confirmed the spatial extent of the acousto-optic space gating (see Methods for measurement details). Without the space gating, the intensity map was uniform across the field of view (Fig. 2b). When the space gating was applied, only the wave traveling through the acousto-optic gating window, RSG, was visible (Fig. 2c). The full widths at half maximum (FWHM) of RSG were 29 and 72 µm along the x- and y-axes, respectively. Because the acoustic impedances of fat, water, blood, and muscle do not differ by more than 10%, soft tissues may be considered homogeneous for the acoustic wave. This guarantees the diffraction-limited confinement of the acoustic focus. Hard tissues, such as bone and teeth, block the propagation of the acoustic wave because their acoustic impedances are five times as large as those of soft tissues. This is a common limitation for all ultrasound-based applications in biology and medicine. The gating contrast, measured by the ratio between the average intensity inside the blue box and outside the orange box in Fig. 2c, was ~100, and the modulation efficiency around the focal area (see Methods for calculation details) was 5%, which was optimized through precise synchronization between the laser pulses and the acoustic pulses. Note that this acousto-optic modulation efficiency does not affect the signal-to-noise ratio τSG or the noise suppression factor η because both the signal and noise are subject to the same modulation efficiency.

Figure 2d–k presents the PSFs |E(rd; ri)|2 and |ESG(rd; ri)|2 without and with space gating, respectively, measured at the detector plane. Figure 2d, h shows the intrinsic system PSFs through a transparent medium composed of polyacrylamide (PAA) gel. The FWHM of the foci, which dictates the imaging resolution, was measured to be 1.5 µm with or without space gating. In Fig. 2e–g, i–k, we introduced an optical inhomogeneity using scattering poly(dimethylsiloxane) (PDMS) layers on the input and output surfaces of a sample cuvette (see Methods section for details of sample preparation). The distance from each of the input and output surfaces to the object plane was ~4 mm. The optical thicknesses of the input and output layers (Li/ls, Ld/ls) were (6.9, 10.6), (6.9, 12.8), and (10.6, 12.8) for Fig. 2e, i, Fig. 2f, j, and Fig. 2g, k, respectively. The ballistic wave appeared as a peak at the detection point rd conjugate to the illumination point ri, and the fluctuating background of the multiply scattered wave spread across the detector plane. The imaging fidelity τ and τSG were experimentally determined as the averaged intensity ratio of the peak to the fluctuating background (see Methods section for details). For instance, for the case of (Li/ls, Ld/ls) ~ (10.6, 12.8), τ was ~0.1 while τSG was ~30, from which we can expect only space-gated imaging to properly provide a diffraction-limited resolution in this scattering regime.
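A peak-to-background estimate of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure in the Methods: the PSF is a small 2D intensity array, the ballistic peak position is known, the background is the mean of all pixels outside a small exclusion zone around the peak, and the function name and synthetic values are hypothetical.

```python
def peak_to_background(psf, peak_rc, exclude_radius=1):
    """Ratio of the peak intensity to the mean background intensity.

    psf: 2D list of intensities on the detection plane.
    peak_rc: (row, col) of the pixel conjugate to the illumination point.
    exclude_radius: pixels within this distance of the peak are not
    counted as background (an assumed guard zone).
    """
    pr, pc = peak_rc
    peak = psf[pr][pc]
    background = [
        value
        for r, row in enumerate(psf)
        for c, value in enumerate(row)
        if max(abs(r - pr), abs(c - pc)) > exclude_radius
    ]
    return peak / (sum(background) / len(background))


# Synthetic 5 x 5 PSF: flat unit background with one bright pixel
psf = [[1.0] * 5 for _ in range(5)]
psf[2][2] = 31.0
tau_est = peak_to_background(psf, (2, 2))
print(tau_est)
```

In a measured PSF the background is speckled rather than flat, so the estimate would normally be averaged over many background pixels and several illumination points.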

Amplitude imaging through a scattering medium

We first performed space-gated imaging of amplitude objects through scattering layers of various thicknesses (Fig. 3; see Methods section for details of the objects and the imaging procedure). The reconstructed images without and with space gating are shown in Fig. 3a–d and Fig. 3e–h, respectively. The optical thicknesses of the scattering layers (Li/ls, Ld/ls) were (0, 0), (6.9, 10.6), (6.9, 12.8), and (10.6, 12.8) (the same configuration as for the PSF measurements in Fig. 2d–k). In a relatively weak scattering regime (Fig. 3a, b, e, f), both methods yielded images of the amplitude objects, although there was considerable background fluctuation in the conventional confocal imaging. When scattering became stronger (Fig. 3c, d, g, h), only the space-gated confocal imaging provided resolved images of the amplitude objects. It is remarkable that, with the aid of space gating, the objects could be clearly resolved even in the highly scattering regime of (Li + Ld)/ls > 23, while the conventional method presented only randomly fluctuating noise dominated by the multiply scattered wave. The imaging results are in good agreement with the PSFs measured in Fig. 2d–k, in the sense that well-resolved images were reconstructed only when the spot contrast (τ and τSG) was sufficiently larger than unity. Interestingly, for the case of (Li/ls, Ld/ls) ~ (6.9, 12.8), the reconstructed image (Fig. 3c) was significantly degraded even though the ballistic spot was distinctively visible (i.e., τ = 9.1 > 1 as shown in Fig. 2f). This is because EM(rd; ri) of relatively small amplitude can cause a large fluctuation in |ES(rd; ri) + EM(rd; ri)|2 depending on the relative phase of EM(rd; ri) to ES(rd; ri) (see Supplementary Fig. 4 for the quantitative analysis about this effect).
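The size of this interference effect follows from the coherent sum of the two fields. As a simple sketch, assume |ES|2 is normalized to 1, |EM|2 = 1/τ, and the relative phase is free to take any value. The detected intensity then swings between (1 − 1/√τ)2 and (1 + 1/√τ)2. The function name is hypothetical, and the calculation ignores the statistical distribution of |EM|.

```python
import math


def intensity_range(tau):
    """Min and max of |E_S + E_M|^2 for |E_S|^2 = 1 and |E_M|^2 = 1/tau.

    The extremes correspond to destructive and constructive interference
    between the ballistic field and the multiply scattered field.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    amplitude_ratio = 1.0 / math.sqrt(tau)  # |E_M| / |E_S|
    return (1.0 - amplitude_ratio) ** 2, (1.0 + amplitude_ratio) ** 2


low, high = intensity_range(9.1)  # tau value quoted for Fig. 2f
print(f"intensity between {low:.2f} and {high:.2f} of the ballistic level")
```

For τ = 9.1 the multiply scattered amplitude is about one third of the ballistic amplitude, so the detected intensity can vary by roughly a factor of four between scan points even though the ballistic peak dominates the PSF.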

Fig. 3: Demonstration of space gating in confocal imaging.

Images were reconstructed by scanning 900 points within a 16.1 × 16.1 µm² field of view. a–d Reconstructed intensity images of 2-μm gold-coated microspheres without space gating when the optical thicknesses of the input and output layers were (0, 0), (6.9, 10.6), (6.9, 12.8), and (10.6, 12.8), respectively. e–h Reconstructed images with space gating for the same configurations as a–d. Images were normalized to their maximum intensities. Scale bar: 2 µm.

Noise suppression factor achieved by space gating

To elucidate how the effect of space gating varied depending on the degree of scattering, we estimated τ and τSG over a wide range of total optical thickness, Ltot/ls with Ltot = Li + Ld (see Supplementary Fig. 5 for the measured PSFs of |E(rd; ri)|2 and |ESG(rd; ri)|2 for all combinations of input and output layers). We fixed Li/ls to 6.9, 10.0, and 10.6 and varied Ld/ls for each case of input layer. For convenience of analysis, we set Li/ls < Ld/ls in our experiments, so that wMi < wMd for all cases. In Fig. 4a, three lines with different markers show τ and τSG for the three cases of Li/ls. In every case, τSG lies well above τ, proving the effectiveness of space gating. τ monotonically decreased with Ltot/ls, and its behavior was generally dictated by the exponential decay of the ballistic wave because the decay of the multiply scattered wave was much slower. On the contrary, τSG was highly dependent on Li/ls. For instance, at Ltot/ls = 21, τSG was 36 for Li/ls = 6.9 and 240 for Li/ls = 10.0. This supports our theoretical prediction in Eq. (6) that the effect of space gating is mainly determined by wMi(<wMd), which was set by Li/ls, rather than Ltot/ls.

Fig. 4: Noise suppression efficiency of space gating.

a Ratio of the ballistic wave to the multiply scattered wave with space gating (τSG, blue) and without space gating (τ, red) as a function of the total optical thickness, Ltot/ls. Circular, triangular, and rectangular markers indicate cases of input layer optical thicknesses, Li/ls, fixed to 6.9, 10.0, and 10.6, respectively. The optical thickness of the output layer was also varied for each case. b Noise suppression factor η of space gating. η was obtained from τSG/τ in a.

Figure 4b presents η, obtained from τSG/τ. η ranged from 4.4 to 150 depending on the combination of the input and output scattering layers. As predicted in Eq. (6), η was largely determined by Li/ls, and Ld/ls had a marginal impact on η as Li/ls < Ld/ls. At Li/ls = 6.9, η was in the range of 4–11, while it increased to 47–150 when Li/ls was increased to 10 or 10.6. Similar to τSG, even for the same Ltot/ls (e.g., at Ltot/ls = 21 in our experiments), η can vary significantly, implying that the effect of space gating is highly dependent on the axial position of the object plane within a homogeneously scattering medium. The maximum noise suppression factor we observed was η = 150 for the configuration of (Li/ls, Ld/ls) = (10.6, 13.9). See Supplementary Fig. 6 for the transfer functions, Ti(ro; ri) and Td(ro; rd), of all scattering layers, and Supplementary Fig. 7 for τ, τSG, and η that were estimated from Ti(ro; ri) and Td(ro; rd) based on the model presented in the Principles section.

Coherent imaging of objects embedded inside a turbid medium

Here, instead of having a gap between the scattering layer and the object plane, as in previous studies32,33 and our proof-of-concept experiments in Figs. 2 and 3, we considered amplitude objects fully embedded within a scattering medium (Fig. 5a) and performed space-gated imaging to verify that our imaging scheme is robust against the small speckle grains inside an acoustic focus. We confirmed that, at the object plane, the width of a speckle grain was 280 nm, which is about half the wavelength (see Methods section for details of sample preparation and determination of speckle size). While the image was completely scrambled by multiple scattering without space gating (Fig. 5b), the object was clearly resolved with space gating (Fig. 5c). τSG and τ were estimated to be 260 and 1.1 from the PSFs (Supplementary Fig. 8), leading to a noise suppression factor η of 240.

Fig. 5: Coherent imaging of objects fully embedded within a scattering medium.

a Schematic of the sample configuration. The bottom left inset shows the speckle pattern measured right at the object plane after removing the right hand side of the scattering medium (inset scale bar: 1 μm), and the bottom right inset shows a photograph of the scattering medium. The optical thickness of the scattering slab was 21.0. b, c Reconstructed images of 2-μm gold-coated microspheres embedded within the scattering medium without and with space gating, respectively. With the noise suppression factor η = 240 by the space gating, the gold-coated microspheres were clearly resolved. Images were normalized to their maximum intensities. Scale bar: 2 μm. d, e Reconstructed phase images of human red blood cells embedded within the same scattering medium used in b and c without and with space gating, respectively. With space gating, the size and the morphology of the red blood cells can be obtained from the phase map. Scale bar: 5 μm.

By combining the noise suppression capability of space gating with the coherent treatment of a ballistic wave, we demonstrate a unique capability of space gating: the quantitative phase imaging of human red blood cells completely embedded within a scattering medium (see Methods section for details of sample preparation). As shown in Fig. 5d, only the speckled phase pattern was visible without space gating due to the dominance of the multiply scattered wave over the ballistic wave. In contrast, our method revealed the phase delay associated with the morphology of the red blood cells embedded within the scattering medium (Fig. 5e). To our knowledge, this is the first experimental demonstration of the quantitative phase imaging of biological cells embedded within such a thick scattering medium. This opens a new avenue for interrogating transparent biological cells within small animals or organs, with no use of exogenous contrast agents.

Demonstration of deep imaging within a 30-dpf zebrafish

To prove the effectiveness of space gating in the context of imaging within intact biological tissues, we performed imaging of whole-body zebrafish at 30 days post fertilization (dpf) and intentionally chose the imaging plane behind the spinal cord to demonstrate the capability of space gating in a more realistic situation, where the complex structures, such as skin, bone, muscle, and organs are heterogeneously distributed between the imaging plane and the imaging objective lens. We note that high-resolution fluorescence imaging for whole-body studies is restricted to young zebrafish of a few days after fertilization due to its shallow imaging depth34,35,36. The 30-dpf zebrafish was ~560-µm thick within the transverse section across the head-trunk region, and the imaging plane was placed 180 µm behind the spinal cord as depicted in Fig. 6a. Therefore, the imaging depth was 460 µm from the surface of zebrafish. In this region, three important structural features of skeletal muscle fibers manifest in a conventional hematoxylin and eosin (H&E) stained histological section: the myosepta that separate and support the blocks of muscle fibers (indicated as the dotted yellow lines in Fig. 6b), the obliquely arranged muscle fibers in between the myosepta (indicated with the dotted white arrow in Fig. 6b), and the alternating light and dark bands (i.e., sarcomere), called I-bands and A-bands, along the muscle fibers (i.e., along the dotted white arrow in Fig. 6b).

Fig. 6: Demonstration of space-gated imaging within a 30-dpf zebrafish.

a Schematic of the imaging configuration for a whole-body zebrafish. The skeletal muscle structure of the head-trunk junction was imaged. We chose the imaging plane 180 µm behind the spinal cord to have the complex structures, such as skin, thick muscle layer, spinal cord, and cartilage along the beam path between the imaging plane and the imaging objective lens. The bottom inset shows the top view of the imaging configuration. b A typical high-resolution histological section of the skeletal muscle fibers at the head-trunk junction. The position of myosepta and the muscle-bone junction are indicated with dotted yellow and red lines, respectively. The dotted white arrow indicates the direction of an obliquely arranged muscle fiber. The bottom left inset shows the magnified view of an individual muscle fiber. The alternating light and dark bands of sarcomere are barely visible along the direction of the white arrow. The bottom right inset shows the magnified view of myosepta. Scale bar: 50 μm. The histological image is adapted from PennState Bio-Atlas database (http://bio-atlas.psu.edu/; http://bio-atlas.psu.edu/view.php?s=64&atlas=73). c, d Wide-field imaging without and with space gating, respectively. With space gating, the structural features of the myosepta (dotted yellow lines) and the muscle-bone junction (dotted red line) can be identified. Similar to b, the dotted white arrow indicates the direction of an obliquely arranged muscle fiber. Scale bar: 50 μm. e–j Magnified views of the regions indicated in c and d. Space-gated images reveal the repeating unit of muscle fiber (i.e., alternating light and dark bands) arranged along the individual muscle fibers, whose direction is indicated with the white arrow. From the period of the alternating bands, the sarcomere length can be determined to be ~2 μm. Scale bar: 10 μm. k, l Reconstructed phase images of the regions of g and j, respectively. With space gating, random-phase noise is suppressed, and therefore, the complex winding structures of the muscle fibers can be identified from the phase discontinuity.

We reconstructed the image of skeletal muscle fibers over a large field of view of 780 µm × 200 µm by stitching multiple images (see Methods section for the detailed procedure for the coherent image stitching method). Without space gating, the structural features were not readily visible because multiply scattered waves introduced speckle-like artifacts (Fig. 6c, e–g). The effect of this noise becomes more noticeable toward the anterior side, as the internal structures of the zebrafish become more complicated on the anterior side of the head-trunk junction. In contrast, space-gated imaging reveals the distinctive features of the myosepta, muscle fibers, and sarcomeres (Fig. 6d, h–j). Therefore, with space-gated imaging, one may determine some important structural parameters, such as the position and angle of the myosepta, and the sarcomere length (see Supplementary Fig. 9 for one-dimensional profiles)37. Additionally, the space-gated image (Fig. 6d) clearly shows the attachment point of the muscle fibers and the occipital bone, which also appears in the histological image in Fig. 6b (indicated as the dotted red line).

As space gating reduces the random-phase noise of the multiply scattered wave, the phase information of the muscle fibers becomes clearly visible (Fig. 6k, l). The phase information was particularly useful as complementary information for distinguishing a discrete muscle fiber from a network of interwoven muscle fibers, illustrating the benefit of phase imaging within a scattering tissue. Phase imaging also allowed us to enhance the contrast of the individual muscle fibers by reconstructing a phase-gradient image based on the application of an asymmetric detection scheme38 to a complex field image (see Supplementary Fig. 10 for the phase map and the corresponding phase-gradient images). We also imaged sponge-like cartilage structures near the head of the whole-body zebrafish and consistently observed the effect of noise rejection by space gating and the benefit of phase information (Supplementary Fig. 11).

Discussion

Deep tissue space-gated microscopy, as implemented with the simple addition of an acoustic focus to a conventional microscope, can be used to improve the imaging depth of a wide range of label-free imaging applications that rely on the intrinsic optical absorption and phase-gradient contrast of the specimen. In this work, we demonstrated wide-field-of-view imaging of a whole-body zebrafish and showed that space gating rejects a significant portion of multiple scattering noise and reveals important structural features, such as myosepta and sarcomeres, even through the spinal cord located at the center of the body. This example illustrates the potential use39,40,41,42 of space gating to achieve histology-like imaging within a scattering tissue without any of the incision or staining procedures typically required for histological methods43,44. Additionally, the coherent nature of space-gated microscopy enables the visualization of biological phase objects within deep tissue, which might directly benefit electrophysiology experiments.

The proposed space gating method is the first acousto-optic imaging approach relying on the selective and coherent detection of the ballistic waves. Therefore, its resolution is dictated by the ideal diffraction limit of the optical system, rather than the diffraction limit of the acoustic system. Although our scheme of space gating shares some components with conventional ultrasound-modulated optical tomography24,25,26,27,28, our space-gated microscopy uses the acousto-optic effect in a completely different way. It is used for gating out the multiply scattered wave in ideal diffraction-limited imaging based on confocal detection or coherent aperture synthesis. Similar to deep tissue photoacoustic approaches45,46, the conventional acousto-optic approaches24,25,26,27,28 rely on both ballistic and multiply scattered waves as a whole. Therefore, their imaging resolution is set by the acoustic diffraction limit, which was ~30 µm in our experiments. However, it should be noted that their imaging depth can be larger than that of the proposed method because they are not subject to the problem of competition between the ballistic and multiply scattered waves. There have been a few ingenious wavefront shaping methods that can improve the spatial resolution of acousto-optic or photoacoustic approaches to the optical speckle scale, using iterative optimization47,48,49 and variance-encoding32,33. However, these methods are easily compromised in practical situations where the size of the speckle grain is as small as the optical wavelength or the acoustic focal profile does not have a well-defined peak. Those concepts have only been demonstrated for geometries in which the gap between the scattering layer and the object plane is large enough for the speckle grains to be at least one order of magnitude larger than the wavelength32,33,47,48,49.
In contrast, our method, which relies on a ballistic wave for image formation, allows us to obtain the ideal optical diffraction-limited resolution for objects completely embedded within a scattering medium, where the speckle grains are fully developed and on average close to half the wavelength in size. Furthermore, our method is much less sensitive to speckle decorrelation than acousto-optic wavefront manipulation techniques because the dynamic motion of the scatterers affects the ballistic wave much less than the multiply scattered wave.

Because of the two-dimensional nature of space gating, the noise suppression factor η can be quadratically improved by reducing the size of the gating window wSG. Therefore, the use of higher frequency acoustic waves or second-harmonic acousto-optic interactions would greatly improve the imaging depth, although the reduced acousto-optic modulation efficiency may hinder the proper measurement of the acousto-optically modulated optical wave. The imaging depth can also be greatly improved by choosing a probe beam of longer wavelength at which ls is larger. First, it allows us to detect the ballistic wave at a proportionally larger L because the intensity of the ballistic wave follows the Beer–Lambert law dictated by L/ls. Second, and more interestingly, the effect of space gating would quadratically increase with L due to the associated increase in the spatial extent of the multiply scattered wave. Although our proof-of-concept experiments were performed at the 532 nm wavelength, where ls is relatively small for biological tissues50, the use of a longer-wavelength source would substantially increase the absolute imaging depth for biological applications. This space gating technique could also be adopted for other epi-detection configurations for more diverse applications in biological studies.
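The two scaling arguments in this paragraph can be illustrated with a minimal numerical sketch. The function names, the detection threshold of 10^-4, and the doubled ls value are hypothetical choices made only for illustration; the inverse-square dependence of η on wSG is our reading of the "quadratic" improvement stated above, not a formula taken from the paper.

```python
import math

def ballistic_fraction(depth, ls):
    """Fraction of ballistic intensity remaining after a path `depth` (Beer-Lambert law)."""
    return math.exp(-depth / ls)

def max_ballistic_depth(ls, min_fraction):
    """Largest depth at which the ballistic fraction stays above `min_fraction`."""
    return -ls * math.log(min_fraction)

def suppression_gain(w_old, w_new):
    """Relative gain in eta when the gating window shrinks, ASSUMING eta scales as 1/w^2."""
    return (w_old / w_new) ** 2

# Hypothetical detection threshold; ls values in micrometres.
depth_short = max_ballistic_depth(ls=21.0, min_fraction=1e-4)
depth_long = max_ballistic_depth(ls=42.0, min_fraction=1e-4)
print(depth_long / depth_short)      # reachable depth grows in proportion to ls
print(suppression_gain(30.0, 15.0))  # halving the window gives a fourfold gain
```

Under these assumptions, doubling ls doubles the depth at which the ballistic wave remains above a fixed threshold, while halving the gating window improves the suppression factor fourfold.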

The resolution of the demonstrated imaging method is set by the diffraction limit of the optical system. In the present study, the diffraction-limited resolution of 1.5 µm was set by geometric restrictions of the focused laser beam and acoustic transducer. The use of a physically smaller acoustic transducer would allow a higher numerical aperture (NA) for optical imaging. Novel aberration correction methods reported in previous studies could also be incorporated to retain submicron imaging resolution even for aberrating biological specimens14,15,16,17. In our experiments, the imaging speed is limited to 10 Hz per point by the laser repetition rate, the line scan time of the rolling shutter of the camera, and the scheme for the holographic measurement, but it can be improved up to 1000 Hz per point without major technical hurdles. Because the acoustic propagation time from the transducer to the acoustic spot is ~4 µs, the laser repetition rate can be increased up to 250 kHz while ensuring a single space gating for each optical pulse. Therefore, the camera exposure can be reduced to <1 ms from the current value of 10 ms because typically 100 laser pulses are sufficient for the accurate complex field measurement. The imaging speed can be further increased by twofold and fourfold, respectively, using a global-shutter camera and off-axis holography51. In aggregate, those efforts will lead to a ~100-fold improvement in imaging speed, which would be sufficiently fast for soft-tissue imaging.
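The repetition-rate and exposure figures quoted above follow from simple arithmetic, sketched here. The function names are hypothetical; the inputs (4 µs acoustic transit time, 100 pulses per measurement) are the values stated in the text.

```python
def max_repetition_rate_hz(acoustic_transit_s):
    """Highest pulse rate that still leaves one acoustic transit time per optical pulse."""
    return 1.0 / acoustic_transit_s

def min_exposure_s(pulses_per_frame, repetition_rate_hz):
    """Shortest camera exposure that collects the required number of laser pulses."""
    return pulses_per_frame / repetition_rate_hz

rate = max_repetition_rate_hz(4e-6)   # about 250 kHz
exposure = min_exposure_s(100, rate)  # about 0.4 ms, i.e. below 1 ms
print(rate, exposure)
```

The 0.4 ms result is consistent with the statement that the exposure can be brought below 1 ms.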

To conclude, the imaging depth of microscopy has long been set by the ability of existing gating methods to reject multiply scattered waves. It has been particularly difficult to apply phase imaging to cases where transparent biological cells are fully embedded inside a scattering medium, owing to the susceptibility of phase imaging to multiple scattering. The proposed concept of space gating is a novel and independent gating scheme, which can effectively reject the multiply scattered wave that bypasses conventional gating operations (see Supplementary Note 3 for the conditions under which space gating is particularly beneficial). By taking full advantage of this space gating, we could realize phase imaging of biological cells and fine tissue morphologies embedded within a thick biological tissue. Given that space gating can be combined with all the existing gating methods for the optimal rejection of multiply scattered waves, further development and use of space gating will provide an important step toward reaching the ultimate imaging depth set by the detection limit of ballistic waves. Its capability of phase imaging in a thick scattering medium will also facilitate studies of the native physiology of biological cells within deep tissues.

Methods

Confocal imaging setup with acousto-optic space gating

For confocal imaging, we sampled the modulated signals at the camera pixel conjugate to the focused illumination. This confocal configuration is in effect identical to the conventional confocal scheme based on a physical pinhole. The NAs of the objective lenses on the illumination and detection paths were 0.18, setting the diffraction-limited resolution of 1.5 µm. Three cycles of the focused acoustic wave, with a frequency of fUS = 50 MHz, were temporally synchronized with a 532-nm laser pulse of 7 ns width at a repetition rate of 40 kHz. The frequency bandwidth of the transducer was 40 MHz, which is wide enough to generate a short-pulsed sine wave whose correlation with the ideal three-cycle sine wave is >90%. The NA of the acoustic transducer was 0.47. With an acoustic pressure of a few megapascals, the spatial-peak-temporal-average intensity was ~150 mW cm−2, which is well below the safety limit of 720 mW cm−2 for biological applications.
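As a quick consistency check on these parameters, the sketch below evaluates the optical resolution and the duration of the acoustic burst. The λ/(2NA) criterion is an assumed convention (the text does not state which resolution criterion was used), and the function names are hypothetical.

```python
def diffraction_limit_um(wavelength_um, na):
    """Resolution under the ASSUMED lambda / (2 NA) criterion."""
    return wavelength_um / (2.0 * na)

def burst_duration_s(n_cycles, frequency_hz):
    """Duration of an acoustic burst of `n_cycles` at `frequency_hz`."""
    return n_cycles / frequency_hz

print(diffraction_limit_um(0.532, 0.18))  # about 1.48 um, consistent with the quoted 1.5 um
print(burst_duration_s(3, 50e6))          # 60 ns, much longer than the 7 ns laser pulse
```

With this convention, the 0.18 NA at 532 nm reproduces the quoted 1.5 µm to within rounding.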

Although the interferometric confocal detection provides the phase map of the ballistic wave, the phase drift during the focal scanning deteriorates the phase image of the object. Therefore, to achieve quantitative phase imaging, we switch the illumination beam to a plane wave and then vary the incidence angle for coherent aperture synthesis, where the coherent (i.e., both amplitude and phase) image is synthesized in such a way that the ballistic wave is collectively accumulated12,52. Note that once the ballistic wave has been properly accumulated for every incidence angle constituting the focused beam in confocal detection, the signal-to-noise ratio and the imaging resolution of the coherent aperture synthesis are identical to those of the confocal method. However, in most of our experiments for amplitude objects, we used the confocal scheme shown in Fig. 2a because it provides a higher signal-to-noise ratio for the initial detection of the ballistic wave before reconstructing the image.

Measurement of spatial extent of R and RSG

We illuminated a transparent sample composed of PAA gel with a plane wave. For the measurement of R without space gating, we switched off the reference beam and the focused acoustic beam, and then summed the intensity maps measured over 900 incidence angles. For the measurement of RSG with space gating, we performed the interferometric detection of the acousto-optically modulated waves for 900 incidence angles and summed the intensity of the measured complex fields.

Measurement of transfer functions

To measure the illumination transfer function |Ti(ro; ri)|2, we recorded the intensity map on the object plane using the camera shown in Fig. 2 while removing the scattering sample in the detection path. To measure the detection transfer function |Td(ro; rd)|2, we used the reciprocity of light propagation and the symmetry of our optical system. Based on reciprocity, the detection transfer function |Td(ro; rd)|2 is identical to the intensity map on the object plane for a virtual source placed at the detector point rd. Therefore, we removed the scattering sample from the illumination path and flipped the entire sample with respect to the object plane to take advantage of the symmetry between the input and output sides of our system. Finally, similar to the measurement of |Ti(ro; ri)|2, we recorded the intensity map on the object plane while illuminating the flipped sample with a focused beam.

Calculation of modulation efficiency

The measured interference intensity at the kth phase step (k is an integer from 0 to 3) can be expressed as \(I_k = \left| E^{\mathrm{ref}}\exp\left( i\frac{\pi}{2}k \right) + E^{\mathrm{sam}} \right|^2 = \left| E^{\mathrm{ref}}\exp\left( i\frac{\pi}{2}k \right) + E_{\mathrm{unmod}}^{\mathrm{sam}} + E_{\mathrm{mod}}^{\mathrm{sam}} \right|^2\) (ref. 31), where Eref and Esam are the complex amplitudes of the reference and sample waves, respectively, and \(E_{\mathrm{unmod}}^{\mathrm{sam}}\) and \(E_{\mathrm{mod}}^{\mathrm{sam}}\) are the unmodulated and modulated components of the sample wave, respectively. The modulation efficiency is then defined as \(\left| E_{\mathrm{mod}}^{\mathrm{sam}} \right|^2/\left| E^{\mathrm{sam}} \right|^2\). Because the camera exposure is much longer than the acoustic oscillation period, the two interference terms involving \(E_{\mathrm{unmod}}^{\mathrm{sam}}\) average out to a negligible level due to their oscillation at the acoustic frequency. Therefore, Ik can be written as \(\left| E^{\mathrm{ref}} \right|^2 + \left| E^{\mathrm{sam}} \right|^2 + 2\left| E^{\mathrm{ref}} \right|\left| E_{\mathrm{mod}}^{\mathrm{sam}} \right|\cos\left( \phi + \frac{\pi}{2}k \right)\), where ϕ is the relative phase between Eref and \(E_{\mathrm{mod}}^{\mathrm{sam}}\). Finally, the modulation efficiency is given by \(\left| \frac{\left( I_2 - I_0 \right) + i\left( I_3 - I_1 \right)}{4} \right|^2/\left( \left| E^{\mathrm{ref}} \right|^2\left| E^{\mathrm{sam}} \right|^2 \right)\).
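The four-step estimate can be checked numerically with a minimal sketch. The simulated field amplitudes are arbitrary test values, the function names are hypothetical, and the squared modulus of the complex combination of frames is used so that the result is a real-valued efficiency. The sample intensity is taken as the time-averaged sum of the modulated and unmodulated intensities, which assumes the two components oscillate at different frequencies.

```python
import numpy as np

def simulate_frames(e_ref, e_mod, e_unmod):
    """Time-averaged four-step intensities I_0..I_3.

    Cross terms involving e_unmod are dropped, assuming they average out
    over a camera exposure much longer than the acoustic period.
    """
    k = np.arange(4)
    sam_intensity = abs(e_unmod) ** 2 + abs(e_mod) ** 2
    cross = 2.0 * np.real(e_ref * np.exp(1j * np.pi / 2 * k) * np.conj(e_mod))
    return abs(e_ref) ** 2 + sam_intensity + cross

def modulation_efficiency(frames, ref_intensity, sam_intensity):
    """|E_mod|^2 / |E_sam|^2 from four phase-stepped intensity frames."""
    i0, i1, i2, i3 = frames
    quadrature = ((i2 - i0) + 1j * (i3 - i1)) / 4.0
    return abs(quadrature) ** 2 / (ref_intensity * sam_intensity)

# Arbitrary test fields: |E_mod|^2 = 0.09 and |E_unmod|^2 = 0.81.
e_ref, e_mod, e_unmod = 2.0, 0.3 * np.exp(0.7j), 0.9
frames = simulate_frames(e_ref, e_mod, e_unmod)
eff = modulation_efficiency(frames, abs(e_ref) ** 2, abs(e_mod) ** 2 + abs(e_unmod) ** 2)
print(eff)  # close to 0.09 / 0.90 = 0.1
```

In a measurement, the reference and sample intensities would be recorded separately, for example by blocking one arm at a time; that step is not simulated here.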

Preparation of scattering layers

To fabricate a scattering layer, a PDMS solution was thoroughly mixed with ZnO particles at a fixed concentration. The mixture was then transferred to a Petri dish and coated uniformly on the dish using a spin coater. Finally, the PDMS was cured at 60 °C. The scattering mean free path ls of the layer was 21 μm, as determined from the ballistic transmission measured through two distant diaphragms. The layer thickness was controlled by varying the volume of the PDMS mixture transferred to the dish and was measured with a conventional bench-top microscope. The thickness ranged between 150 and 290 µm.

Imaging procedure of amplitude objects

The focused illumination beam was scanned over 16.1 × 16.1 µm2 with a step size of 0.54 µm using a pair of galvanometer mirrors. This resulted in 900 illumination spots. The confocal image of the object was then reconstructed from the intensity recordings at the detector pixels conjugate to the illumination point. The amplitude objects used were 2-µm gold-coated silica microspheres with transmittance of ~10 % at 532-nm wavelength.
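The scan geometry and the confocal reconstruction can be sketched as follows. The 30 × 30 grid is inferred from the 900 spots and the 0.54 µm step; the array layout of the camera frames, the lookup of the conjugate pixel, and all function names are hypothetical and serve only to illustrate the idea of reading out the pixel conjugate to each illumination point.

```python
import numpy as np

def scan_positions(n_side, step_um):
    """(x, y) coordinates of a square raster scan with n_side points per axis."""
    axis = np.arange(n_side) * step_um
    xx, yy = np.meshgrid(axis, axis, indexing="xy")
    return np.column_stack([xx.ravel(), yy.ravel()])

def confocal_image(frames, conj_rows, conj_cols, n_side):
    """Pick, for each scan point, the intensity at its conjugate detector pixel.

    frames:    array of shape (n_points, height, width), one camera frame per spot
    conj_rows: row index of the conjugate pixel for each scan point
    conj_cols: column index of the conjugate pixel for each scan point
    """
    idx = np.arange(frames.shape[0])
    return frames[idx, conj_rows, conj_cols].reshape(n_side, n_side)

positions = scan_positions(30, 0.54)
print(positions.shape)  # (900, 2): 30 x 30 = 900 illumination spots
```

Selecting a single pixel per frame plays the role of the physical pinhole in a conventional confocal microscope.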

Calculation of ratio τ and τSG using PSFs

The detected confocal intensity 〈|E(rd = ri; ri)|2〉 equals 〈|ES(rd = ri; ri)|2〉 + 〈|EM(rd = ri; ri)|2〉 because the cross term between the ballistic and multiply scattered wave converges to 0, with an ensemble average denoted by 〈〉. 〈|EM(rd = ri; ri)|2〉 here can be separately determined by the intensity in the vicinity (rd ~ ri) of the illumination spot because 〈|EM(rd; ri)|2〉 varies slowly with rd.

The ratio of the ballistic waves to the multiply scattered waves was calculated using two methods, depending on the visibility of the focused ballistic wave. When the focused spot was clearly visible (i.e., the peak to background ratio was >5), the detected confocal intensity 〈|E(rd = ri; ri)|2〉 and 〈|EM(rd ~ ri; ri)|2〉 could be quantified directly from the PSFs. τ and τSG were respectively determined as \(\frac{\langle \left| E\left( \mathbf{r}_{\mathrm{d}} = \mathbf{r}_{\mathrm{i}};\mathbf{r}_{\mathrm{i}} \right) \right|^2 \rangle - \langle \left| E_{\mathrm{M}}\left( \mathbf{r}_{\mathrm{d}} \sim \mathbf{r}_{\mathrm{i}};\mathbf{r}_{\mathrm{i}} \right) \right|^2 \rangle}{\langle \left| E_{\mathrm{M}}\left( \mathbf{r}_{\mathrm{d}} \sim \mathbf{r}_{\mathrm{i}};\mathbf{r}_{\mathrm{i}} \right) \right|^2 \rangle}\) and \(\frac{\langle \left| E^{\mathrm{SG}}\left( \mathbf{r}_{\mathrm{d}} = \mathbf{r}_{\mathrm{i}};\mathbf{r}_{\mathrm{i}} \right) \right|^2 \rangle - \langle \left| E_{\mathrm{M}}^{\mathrm{SG}}\left( \mathbf{r}_{\mathrm{d}} \sim \mathbf{r}_{\mathrm{i}};\mathbf{r}_{\mathrm{i}} \right) \right|^2 \rangle}{\langle \left| E_{\mathrm{M}}^{\mathrm{SG}}\left( \mathbf{r}_{\mathrm{d}} \sim \mathbf{r}_{\mathrm{i}};\mathbf{r}_{\mathrm{i}} \right) \right|^2 \rangle}\). However, when the focused spot was not clearly visible, such as in Fig. 2g, 〈|ES(rd = ri; ri)|2〉 could not be precisely estimated by 〈|E(rd = ri; ri)|2〉 − 〈|EM(rd ~ ri; ri)|2〉. In this case, 〈|ES(rd = ri; ri)|2〉 was estimated by \(I_0\exp ( - L_{\mathrm{tot}}/l_{\mathrm{s}})\), where I0 is the measured peak intensity through a transparent specimen. Then, τ was estimated as \(\frac{I_0\exp ( - L_{\mathrm{tot}}/l_{\mathrm{s}})}{\langle \left| E_{\mathrm{M}}\left( \mathbf{r}_{\mathrm{d}} \sim \mathbf{r}_{\mathrm{i}};\mathbf{r}_{\mathrm{i}} \right) \right|^2 \rangle}.\)
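The two-branch estimate described above can be written as a short sketch. Approximating the ensemble average of the multiply scattered intensity by the mean over an annulus around the focus, the annulus radii, and the function name are all assumptions made for illustration; only the threshold of 5 and the two formulas come from the text.

```python
import numpy as np

def ballistic_to_multiple_ratio(psf, peak_rc, inner_px, outer_px,
                                i0=None, optical_thickness=None,
                                visibility_threshold=5.0):
    """Estimate tau from a measured PSF (2D intensity array).

    The multiply scattered background is taken as the mean intensity in an
    annulus (inner_px..outer_px, hypothetical radii) around the focus.
    """
    rows, cols = np.indices(psf.shape)
    radius = np.hypot(rows - peak_rc[0], cols - peak_rc[1])
    background = psf[(radius >= inner_px) & (radius <= outer_px)].mean()
    peak = psf[peak_rc[0], peak_rc[1]]
    if peak / background > visibility_threshold:
        # Focus clearly visible: subtract the background from the peak.
        return (peak - background) / background
    if i0 is None or optical_thickness is None:
        raise ValueError("i0 and optical_thickness are needed when the focus is not visible")
    # Focus buried in background: predict the ballistic term with Beer-Lambert.
    return i0 * np.exp(-optical_thickness) / background

psf = np.full((21, 21), 2.0)
psf[10, 10] = 22.0
print(ballistic_to_multiple_ratio(psf, (10, 10), inner_px=3, outer_px=8))  # 10.0
```

The same function applies to the space-gated PSF to obtain τSG, since both ratios share the same form.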

Preparation of embedded objects

A thin PAA gel layer mixed with gold-coated microspheres was sandwiched between two 3-mm-thick PAA gel slabs containing a 0.8% fat emulsion (Intralipid). The total optical thickness Ltot/ls of the 6-mm-thick PAA gel was measured to be 21.0. Similarly, we prepared human red blood cells sandwiched between PAA gels with a 0.8% fat emulsion (Ltot/ls ~ 21) to mimic biological conditions. We recorded the speckle pattern at the object plane with a 1.4-NA objective lens in the absence of a PAA slab on the detection side and determined the average grain size at the object plane from the FWHM of the autocorrelation function of the speckle pattern.

Reconstruction of extended field-of-view image

To image different parts of the whole-body zebrafish, the sample cuvette containing the zebrafish was mounted on a three-axis-motorized translation stage. For each sample position, individual images were captured and processed using coherent aperture synthesis following the procedure described in the Results section. The size of the individual image along x- and y-axes was 130 µm × 130 µm without space gating and 50 µm × 80 µm with space gating, respectively, set by the number of active sensor elements and the size of space gating. To acquire a large field-of-view image, the zebrafish was translated with a step size of 65 µm × 65 µm and 25 µm × 40 µm along x- and y-axes, respectively, without and with space gating. Finally, the individual images were coherently combined into a complex, large field-of-view image based on the autocorrelation of the overlapped area between the adjacent images.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.