## Introduction

A fundamental property of waves is that they diffract. Spatially modifying the phase or amplitude of an incident wave can be used to focus the wave or to form a diffraction image with the desired intensity distribution. While it is possible to dynamically modify the phase and amplitude of light waves with the help of a spatial light modulator (SLM)1,2, it has proven challenging to similarly control sound waves. A variety of techniques exist to dynamically tune the phase and amplitude of light, including phase retardation in liquid crystals3, geometric-phase tuning via metasurfaces4,5, and binary switching of reflected amplitude via micro-electromechanical systems6. Acoustic waves possess no polarization and show little or no dispersion from the low audible kHz to the very high MHz ultrasound frequencies7, which considerably complicates the realization of a spatial modulator for sound waves analogous to an SLM.

Recently, it has been demonstrated that static-phase plates, or holograms8, can modify an ultrasound field at high resolution with more than 10,000 pixels across the wavefront. This considerably increases the complexity of the projected static ultrasound fields, which has enabled first demonstrations of acoustic fabrication9 and the assembly of cells10 into designed patterns, beam steering11, and the compensation of wavefront aberration in transcranial focusing of ultrasound12. The ability to dynamically update and adjust these complex ultrasound fields with the aid of a high-resolution spatial ultrasound modulator (SUM) would present a major advance for these and related applications, which include medical imaging deep inside the body13,14, nondestructive testing of opaque solids15, the manipulation of submicron particles16,17, biological cells18,19, and even centimeter-sized objects20.

The realm of audible acoustics has seen some notable developments in this regard. Ma et al. demonstrated a metasurface of membrane-type resonators to dynamically control and reshape a reverberating sound field in a room21. Another system reported by Tian et al. used an array of tunable Helmholtz resonators to steer and focus transmitted acoustic waves22. The large wavelengths of audible acoustic waves relative to the region of interest result in a differently scaled problem with few degrees of freedom, where a small number of larger actuators is sufficient. This is contrary to the previously mentioned applications of high-frequency ultrasound, which benefit from large numbers of much smaller pixels. The conventional device for ultrasound beam shaping is the phased array transducer (PAT)23, which uses many individually controllable sound emitters to directly generate arbitrary and dynamically tunable wavefronts via superposition. PATs have been shown to efficiently implement dynamic holograms and project the complex trapping fields that enabled acoustic tweezers24. Taking advantage of their fast update rate, PATs can generate multiple traps via time multiplexing25 or relocate a single trap occupied by a particle at high speeds of several meters per second26. However, the complexity of the driving circuitry limits the total number of PAT pixels to <1000. This is well below the number of elements that would be needed to enable sophisticated control of an ultrasound wave. Further, combining the generation and the shaping of the sound wave in the same device increases the complexity and limits the development of high-power devices with many degrees of freedom. Spatial ultrasound modulation could solve this problem, as it would decouple the generation of the ultrasound wave from its modification, and thus would permit the use of an optimized single-element transducer.

Here, we introduce a dynamic SUM based on digitally generated microbubbles on a complementary metal–oxide–semiconductor (CMOS) chip surface. Exploiting the strong acoustic impedance mismatch between a gas bubble and the surrounding liquid27,28, we modify the transmission of an acoustic wave with programmable microbubbles, in analogy to the digital micromirror device (DMD) for spatial light modulation29. We write binary amplitude holograms with 10,000 digitally addressable microbubble pixels on the CMOS chip surface within 12 s through water electrolysis. Between frames, the SUM surface is mechanically reset, which allows us to realize the first high-resolution animation of sequential acoustic images. We demonstrate the versatility of a SUM by assembling microparticles into complex shapes.

## Results

### Principle of spatial ultrasound modulation via a microbubble array

Implementing a dynamically reprogrammable phase plate similar to the static acoustic hologram8 is an engineering challenge. The obvious approach through, e.g., deformable surfaces30,31, requires the integration of many actuators with spacing and displacements at the ultrasound wavelength scale. Alternatively, controlling dispersion could efficiently modulate the phase of an ultrasound wave, but no suitable material or meta-material concept has been found to date. Amplitude modulation promises a more viable solution than phase modulation29. Although a binary amplitude hologram contains only two states for each element, which decreases its information capacity compared to multiple-level phase modulation, it can still afford complex image generation simply by providing many more elements in total29.

Due to the significant acoustic impedance mismatch between gas and liquid, a thin layer of air in liquid can effectively stop ultrasound, even when its thickness is less than the acoustic wavelength. A microbubble can thus serve as a local sound blocker. A pattern of microbubbles in the path of an ultrasound wave should, therefore, impart a corresponding amplitude pattern onto the wavefront of the acoustic field, which is the operating principle of our SUM, as shown in Fig. 1a. Patterning a large number of microbubbles enables the on-demand shaping of an acoustic field’s amplitude distribution (Fig. 1b). Moreover, the dynamic control of the microbubble pattern enables dynamic spatial acoustic modulation. Based on this concept, our dynamic spatial ultrasound modulator (SUM) generates reconfigurable microbubble patterns.

For example, even a 20-μm gas layer leads to a negligibly small transmission coefficient (on the order of 10⁻⁶) for a 10-MHz acoustic wave (wavelength 150 µm in water). This can be seen from the power transmission coefficient for an acoustic wave at normal incidence through a plane layer32:

$$C_T = \frac{1}{\xi^2 \sin^2\left(k_L \delta\right) + 1}, \tag{1}$$

$$\xi = \frac{1}{2}\left|\frac{Z_L}{Z_M} - \frac{Z_M}{Z_L}\right| = \frac{1}{2}\left|\frac{\rho_L c_L}{\rho_M c_M} - \frac{\rho_M c_M}{\rho_L c_L}\right|, \tag{2}$$

where δ and kL are the layer thickness and the wavenumber in the layer material, respectively; Z, ρ, and c are the acoustic impedance, density, and speed of sound, respectively; and the subscripts L and M indicate the layer and the surrounding host medium. For water as the host medium, ρM ~ 1000 kg m⁻³ and cM ~ 1500 m s⁻¹; for an air layer at atmospheric pressure, ρL ~ 1.23 kg m⁻³ and cL ~ 343 m s⁻¹. As the ratio of acoustic impedances increases, the wave is increasingly reflected at the interface, and therefore less energy is transmitted through the layer. Since air blocks ultrasound so well, we now need a way to generate programmable, on-demand microbubble patterns.
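As a quick numerical check, Eqs. (1) and (2) can be evaluated for the 20-μm air layer in water at 10 MHz. The following is a minimal sketch, not part of the original analysis; the function name `layer_power_transmission` is hypothetical, and the inputs are the rounded material values quoted above.

```python
import math

def layer_power_transmission(thickness_m, freq_hz, rho_layer, c_layer, rho_host, c_host):
    """Power transmission coefficient C_T of a plane layer at normal incidence (Eqs. 1-2)."""
    z_layer = rho_layer * c_layer            # acoustic impedance of the layer, Z_L
    z_host = rho_host * c_host               # acoustic impedance of the host medium, Z_M
    xi = 0.5 * abs(z_layer / z_host - z_host / z_layer)   # Eq. (2)
    k_layer = 2.0 * math.pi * freq_hz / c_layer            # wavenumber in the layer, k_L
    return 1.0 / (xi ** 2 * math.sin(k_layer * thickness_m) ** 2 + 1.0)  # Eq. (1)

# Rounded values from the text: 20-um air layer in water, 10-MHz wave.
c_t = layer_power_transmission(
    thickness_m=20e-6, freq_hz=10e6,
    rho_layer=1.23, c_layer=343.0,      # air
    rho_host=1000.0, c_host=1500.0,     # water
)
print(f"C_T = {c_t:.2e}")
```

With these rounded inputs the coefficient comes out on the order of 10⁻⁶. The exact figure depends on the layer thickness through the sin² term, so it should be read as an order-of-magnitude estimate only.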

Our SUM device architecture consists of a CMOS chip placed on top of an acoustic transducer, as shown in Fig. 2. A liquid film of electrolyte is sandwiched between the chip surface and a conveyor film. The CMOS chip surface has 10,000 individually addressable electrodes (70 µm by 70 µm gold pads in a 100 µm by 100 µm raster). Positioned next to the chip is a copper electrode, which serves as the anode. A switchable DC power supply provides a potential difference between the copper electrode (+5 V) and the 10,000 gold electrode pads of the CMOS chip. Once the DC power is switched to a CMOS pixel, the electrolysis of the surrounding aqueous solution generates hydrogen gas at the gold electrode and oxygen gas at the copper electrode. As we will see below, the duration of the current is controlled to define the size of the microbubbles.

To generate a target acoustic field, we first compute a binary amplitude hologram8, which is a binary transmission function that can be directly translated into a pattern of microbubbles. The CMOS chip then generates microbubbles according to this pattern. Each microbubble corresponds to a location of zero ultrasound transmission (Fig. 2a). After the bubble generation is completed, the transducer is turned on (Fig. 2b), and the acoustic wave transmits through the SUM and is locally blocked at the pixels that are covered by a microbubble. The remainder of the wavefront propagates into the upper container and diffracts to form the target sound pressure distribution. To visualize the pressure field at the target plane, we introduce submillimeter PDMS particles suspended in water, which then assemble into the shape of the projected sound pressure image. To conclude the sequence and prepare the SUM for the next frame, the microbubbles are cleared by horizontally translating a conveyor film (Fig. 2c), which drags the bubbles out of the device. The complete modulation process is shown in Supplementary Movie 1.

### Microbubble generation

The SUM generates a pattern of microbubbles on the surface of the CMOS chip by the electrolysis of water. The microbubble coverage has to be large enough to ensure that the acoustic wave is blocked at the location of the electrode. As the potential difference between the anode and the cathode is constant (5 V), the microbubble volume depends on the time the current flows. The size of the microbubbles as a function of the electrolysis time (0.6, 0.8, 1.6, 2.4, and 2.8 ms) is shown in Fig. 3. The area (XY plane) covered by microbubbles increases with the duration of the electrolysis. An adequate microbubble volume also ensures that the bubble is trapped between the conveyor film and the chip surface. The adherence to the solid surfaces appears quite strong and retains the microbubbles against buoyancy even when the device is turned to a vertical orientation33. This suggests that the operability of our SUM is independent of its orientation, as shown in Supplementary Fig. 1. However, as the microbubbles grow, neighboring bubbles can fuse, which is shown in Fig. 3g. This distorts the microbubble pattern because the resulting merged bubbles adopt a spherical shape due to surface tension. We empirically determined that an electrolysis time between 1.6 and 2.4 ms, marked in blue in Fig. 3g, maximizes the bubble coverage while keeping the fusion of bubbles low.

Figure 3h shows the simulated relative acoustic transmission coefficient for different bubble coverages across a single pixel. The relative acoustic transmission coefficient is the ratio of the acoustic intensity transmitted through a bubble-covered pixel to that transmitted through an uncovered pixel. The selected bubble coverage (marked blue), resulting from the selected electrolysis time, effectively blocks 99% of the incident acoustic intensity. It should be noted that the applied acoustic frequency (10 MHz) is far above the fundamental resonant frequency (on the order of 100 kHz) for 10-μm-sized microbubbles in water34. Thus, the bubble vibration excited by the incident acoustic waves is negligible35. Accordingly, we do not observe bubble motion even when the intensity at the transducer reaches about 5 W cm⁻², which is sufficiently high for microparticle assembly and manipulation.
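The separation between the drive frequency and the bubble resonance can be illustrated with the Minnaert formula for a free spherical gas bubble in water, f₀ = (1/(2πR))·√(3γP₀/ρ). The sketch below is an order-of-magnitude illustration under stated assumptions, not the model used in this work: it neglects surface tension and the confinement of the bubbles between the chip and the film, and the function name `minnaert_frequency` and the default parameter values are hypothetical choices.

```python
import math

def minnaert_frequency(radius_m, gamma=1.4, p0_pa=101325.0, rho_liquid=1000.0):
    """Minnaert resonance frequency (Hz) of a free spherical gas bubble.

    Assumptions (not from the source): adiabatic diatomic gas (gamma = 1.4),
    atmospheric ambient pressure, water density 1000 kg/m^3, no surface tension.
    """
    return math.sqrt(3.0 * gamma * p0_pa / rho_liquid) / (2.0 * math.pi * radius_m)

drive_hz = 10e6  # applied acoustic frequency from the text
for radius_um in (5.0, 10.0):
    f0 = minnaert_frequency(radius_um * 1e-6)
    print(f"R = {radius_um:4.1f} um: f0 = {f0 / 1e3:6.0f} kHz, drive/f0 = {drive_hz / f0:4.1f}")
```

For radii of 5 to 10 μm this estimate gives resonances of a few hundred kHz, more than an order of magnitude below the 10-MHz drive.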

### Binary amplitude acoustic hologram

For each acoustic image, the microbubble pattern is pre-calculated as a binary amplitude acoustic hologram, consisting of pixels with an amplitude of zero or one. Similar to a phase hologram (Fig. 4a, d)8, the binary amplitude hologram (Fig. 4b, e) can also be optimized using the iterative angular spectrum approach (IASA). In this special case, however, the phase distribution in the hologram plane is at each step converted to a binary amplitude distribution with a fixed phase. An average phase value is obtained from the back-propagated target image. The hologram pixels, whose original phase is within the range of ±π/2 from this average value, are set to an amplitude of one, and the remaining pixels are set to zero. The algorithm typically converges in <30 iterations.
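The binarization step described above can be sketched in a few lines. This is a minimal illustration only: it omits the angular spectrum propagation between the hologram and target planes, the function name `binarize_hologram` is hypothetical, and the use of a circular mean (the phase of the summed unit phasors) as the "average phase" is an assumption that may differ from the authors' implementation.

```python
import cmath
import math

def binarize_hologram(field):
    """Convert a back-propagated complex field into a binary amplitude pattern.

    field: sequence of complex values, one per hologram pixel.
    Returns a list of 0/1 amplitudes: 1 = transmitting pixel, 0 = blocked (microbubble).
    """
    # Average phase, taken here as the circular mean of the pixel phases (assumption).
    mean_phasor = sum(v / abs(v) for v in field if abs(v) > 0)
    mean_phase = cmath.phase(mean_phasor)

    pattern = []
    for v in field:
        # Phase difference to the average, wrapped into (-pi, pi].
        diff = cmath.phase(v * cmath.exp(-1j * mean_phase))
        # Pixels within +/- pi/2 of the average phase transmit; all others are blocked.
        pattern.append(1 if abs(diff) <= math.pi / 2 else 0)
    return pattern

# Toy example: six pixels with unit amplitude and the phases listed below (radians).
example_field = [cmath.rect(1.0, p) for p in (0.1, 0.4, -0.3, 2.9, -2.5, 1.0)]
print(binarize_hologram(example_field))
```

In a full IASA loop, this pattern would be propagated forward to the target plane, the target amplitude constraint applied, and the field propagated back before binarizing again.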

Figure 4 shows simulations of reconstructed sound fields and their corresponding holograms for a phase hologram (panels a, d) and a binary amplitude hologram (b, e) encoding the letter “R”. Since the pixels in binary amplitude holography only have two states (Fig. 4e), they naturally provide much less information density than phase holograms, which provide an almost continuous modulation over a range of 2π (Fig. 4d). This results in an elevated background noise that can be seen when comparing Fig. 4b with Fig. 4a. On the SUM chip surface, microbubbles replicate the zero-amplitude pixel pattern designed by the binary amplitude holography (Fig. 4f). The 10-MHz transducer emits an acoustic plane wave that transmits through the chip layer and then reaches the microbubble layer. The wave is blocked by each bubble and therefore modulated in amplitude. Where there is no microbubble, the wave transmits and diffracts to form the calculated acoustic image in the target plane (Fig. 4c). To demonstrate that the SUM can be used to project changing acoustic fields, we show a movie of the corresponding hydrophone scans in Supplementary Movie 2. In this video, each frame was formed in 15 s, and the resulting field was raster-scanned by a needle hydrophone before clearing the bubble pattern and creating the next frame.

### Dynamic microparticle manipulation based on SUM

Acoustic particle manipulation is an emerging technique with promising applications in fabrication36 and biomedical engineering18. To date, however, methods for dynamic and parallel manipulation have been limited to a few particles25 or highly symmetric arrangements37. As shown in Fig. 5, the present SUM is capable of dynamically assembling microparticles into arbitrary target patterns. We use PDMS particles, which have a negative acoustic contrast in water. Thus, the acoustic radiation force on these particles will push them toward areas of high acoustic amplitudes. For each acoustic image, it takes around 12 s to write the microbubble hologram when each pixel is sequentially addressed. Afterward, the transducer is turned on for 15 s, generating ultrasound waves, which are modulated by the SUM and propagate to form the acoustic image in the target plane, where the PDMS particles aggregate into the corresponding shape. After each assembly step, the transducer is turned off, and a motorized film mechanically “wipes” the microbubbles off the chip surface. In one experiment, the sequence of microbubble writing, particle assembly, and bubble removal is repeated seven times to sequentially assemble the particles in the shape of the letters “A” to “G”. A video of this dynamic microparticle manipulation is shown in Supplementary Movie 3.

## Discussion

In summary, we demonstrate the first dynamic SUM, which can be used to generate arbitrary images out of sound. The SUM has 10,000 active elements that are digitally controlled to form microbubbles via electrolysis. We show that the SUM can generate binary amplitude transmission holograms. Hydrophone scans of the projected ultrasound fields are in excellent agreement with simulation results. The projected acoustic fields can be updated and used to assemble microparticles in pre-defined shapes. Currently, the elements of the chip are sequentially addressed, which leads to relatively slow update cycles, but parallel pixel addressing38 is expected to drastically increase the refresh rate. To meet the requirements of portable biomedical devices, the bubble removal method can be implemented by other means, e.g., forced fluid flow of the electrolyte or on-chip reversal of the electrolysis39. Future work should explore multilevel amplitude or phase control of sound waves exploiting the resonant behavior of the microbubbles at specifically controlled sizes40,41,42,43. Spatial ultrasound modulators extend the capabilities of ultrasound applications and will be essential for medical imaging13,44, nondestructive testing15, holographic acoustic tweezers8,25, transcranial ultrasonic focusing12, acoustic fabrication9 and cell assembly10.

## Methods

### The CMOS chip

The CMOS chip consists of an array of 100 by 100 gold electrodes, each with a size of 70 µm by 70 µm. Under each electrode, a CMOS transmission gate connects the electrode to a vertical wire. Outside the electrode array, additional transmission gate switches collect the column wires into eight global wires, which lead to the chip pads and can be accessed from the outside of the chip. Two shift register chains, one for row select and one for column select, are fed by a digital driving signal to control the transmission gate groups. The chip is driven by a commercial microcontroller board (Arduino Mega 2560), which is loaded with the code for electrode addressing and electrolysis voltage switching. The gap between the conveyor film and the chip surface is estimated to be 20 μm: under the experimental conditions, a 2-μL electrolyte droplet squeezed between the conveyor film and the chip surface spreads over a measured area of 1 cm², and 2 mm³ divided by 100 mm² gives 20 μm.

The chip was produced using a classical 0.8-µm channel length CMOS technology. This p-well technology incorporates local oxidation of silicon for device isolation, a single polysilicon layer as the gate electrode, and two aluminum layers for interconnects, with a total of 15 optical lithography steps. In addition, specialized post-processing with two lithography steps was used for the gold electrodes45.

### Hydrophone scan of target acoustic field

The acoustic pressure field is mapped by hydrophone scanning. The transducer and the chip are immersed in a tank containing the electrolyte (80 mg mL⁻¹ aqueous K₂SO₄ solution). The wiring of the PCB is waterproofed with a cured polydimethylsiloxane (PDMS) covering. The bottom of the chip is placed in contact with the transducer. A 10-MHz AC signal with 5 Vpp amplitude is applied to the transducer (I3-1008-S-SU, ultrasound aperture 11 mm, Olympus Corporation, Japan). The generated acoustic waves transmit through the chip containing the microbubble pattern. The needle hydrophone (0.2 mm diameter, Precision Acoustics Ltd., UK) is scanned across the imaging plane and dwells at each point for 0.1 s, during which the signal from the hydrophone is amplified and filtered by a lock-in amplifier (Zurich Instruments, Switzerland). The scan area is 60–100 mm², with a lateral resolution of 0.08–0.1 mm. A typical scan is completed in 30–60 min.

### Acoustic simulations

To simulate the transmission of the acoustic wave through the bubble layer, a finite element method (FEM)-based numerical simulation was conducted using the acoustic-solid interaction module of COMSOL Multiphysics 5.3. The modeling schematic is shown in Supplementary Fig. 2. Briefly, a domain defined with gas properties simulates the gas bubble sandwiched between two solid interfaces. It is immersed in a cuboid domain of water. Close to the gas bubble, a cuboid domain of silicon is defined to simulate the chip. A 10-MHz vibration is applied at the bottom surface of the chip. The acoustic wave transmits through the silicon chip, the gas bubble layer, and the water, and its far-field intensity is calculated. The remaining exposed boundaries are defined as symmetric boundaries.

### Microparticle patterning

PDMS microparticles are generated by homogenizing a 10:1 weight ratio mixture of pre-polymer and curing agent (Sylgard184 Silicone Elastomer Kit, Dow Corning Corp., Freeland, MI) in 70 °C deionized water for 1 h. The setup used for the patterning of microparticles is shown in Supplementary Fig. 3. The 10-MHz AC signal from a function generator is amplified to 5 W by a power amplifier and applied to the transducer. The chip is placed on top of the transducer with a thin layer of glycerol for acoustic coupling. The aqueous K₂SO₄ solution is pipetted onto the chip surface, and then a thin plastic film is sandwiched between the chip and a 3D-printed container. To refresh the SUM and remove all microbubbles, the thin film is horizontally dragged across the chip surface by a stepper motor. Another container with a transparent plastic film bottom, filled with water, is placed on the suspension container. This defines the target acoustic image plane (the bottom surface of the transparent plastic film) and reduces the acoustic wave reflection from the top liquid–air interface.