Introduction

Accurate estimation of sound-source location facilitates communication, finding prey and escape from predators in hearing animals1,2. Directional cues are used for sound-source localization, such as interaural intensity difference (IID, sound amplitude ratio between the two eardrums, a.k.a. interaural level difference ILD) and interaural time difference (ITD, differences in time of arrival)2,3. These cues are proportional to the interaural separation, so there exists a fundamental size constraint for sound-source localization and small animals, especially insects, face formidable challenges2,4,5. While the average interaural separation for a human is 17.5 cm, for insects it is 1 cm or less, resulting in indiscernible ITD and IID2. Furthermore, because of their small head sizes, insects have too few neurons to carry out sophisticated signal processing2,4,5.

In nature, one striking innovation for overcoming the size constraint is found in the parasitic fly Ormia ochracea6,7,8,9,10,11. Although the separation between its auditory organs is a mere 520 μm, the fly can accurately localize the 5 kHz calling song of its host male crickets with a directional resolution of ±2°, which is equivalent to that of humans7,9,10,11. The key to the fly's exceptionally acute directional hearing has been found to be intertympanal mechanical coupling: the two eardrums of the fly ear are coupled by a cuticular bridge pivoted at the middle11. As a consequence, the time and amplitude differences between the two tympanal responses (mechanical ITD (mITD) and mechanical IID (mIID)) are greatly amplified, from a best possible ITD of 1.5 μs for an uncoupled system up to 50–60 μs and from an IID of less than 1 dB up to 12 dB11. However, based on the fact that the fly's turning speed is a sigmoid function of sound azimuth, it is believed that the fly can only localize the source when its head front (midline in Figure 1a) is within a certain azimuth range; beyond this range, the fly can only determine whether the sound is arriving from the left or right (i.e., it can only perform lateralization)7. Based on the shape of the sigmoid, the localization range is approximately −30° to 30°. This unique localization-lateralization scheme of the fly appears to be a compromise that greatly improves performance within the ±30° range at the cost of performance outside of this range.
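The size constraint can be made concrete with a short calculation. The sketch below assumes a simple free-field model in which the largest time difference is the separation divided by the speed of sound; the function name and the value of 343 m/s for the speed of sound in air are illustrative assumptions, not taken from the original analysis.

```python
SPEED_OF_SOUND_AIR = 343.0  # m/s, assumed room-temperature value


def max_itd_us(separation_m, c=SPEED_OF_SOUND_AIR):
    """Largest uncoupled ITD in microseconds (source at 90 degrees azimuth).

    Assumes the simple model ITD = d / c, ignoring diffraction around the head.
    """
    return separation_m / c * 1e6


human_itd = max_itd_us(17.5e-2)  # 17.5 cm interaural separation quoted in the text
fly_itd = max_itd_us(520e-6)     # 520 um separation quoted for Ormia ochracea

print(f"human: {human_itd:.0f} us, fly: {fly_itd:.2f} us")
```

Under these assumptions the fly value comes out close to the 1.5 μs quoted for an uncoupled system, which is several hundred times smaller than the human value.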

Figure 1

Dual-optimality of the fly ear and a fly-ear inspired sensor.

(a), Schematics of the fly ear structure and the lumped parameter model of the fly ear (redrawn from ref. 11). (b), The two vibration modes of the fly ear (redrawn from ref. 11). (c), Dual optimality of the fly ear achieved at the frequency of the cricket's calling song; that is, maximum average directional sensitivity (ADS) and minimum nonlinearity (NL) simultaneously achieved at 5 kHz. The inset shows the directional sensitivity (DS) at three different frequencies. (d), Natural frequencies (normalized by the optimal working frequency) determined through optimization analysis to ensure the dual-optimality characteristic as a function of the separation-to-wavelength ratio χ for two damping scenarios: i) ξ1 = 0.89, ξ2 = 1.23 and ii) ξ1 = 0.18, ξ2 = 0.05. The two cases marked by the red dots correspond to working frequencies of 5 kHz in i) (the fly ear) and 8 kHz in ii) (a low damping device). (e), Phase difference mIPD at 5 kHz as a function of azimuth for different coupling strength scenarios: stiff (natural frequency ratio η = 20), medium (η = 4.36; i.e., the fly ear case), soft (η = 2) and uncoupled (η = 1). The results were obtained by using the fly ear's structural parameters with varying bridge stiffness k3. (f), Frequency spectra of ADS and NL for i) soft coupling and ii) stiff coupling. (g), Dual optimality of a fly-ear inspired sensor designed to work at 8 kHz.

Because aspects of the fly ear remain a puzzle, our goal in this article is to answer the following important but largely unexplored fundamental questions. i) How are the structural parameters (e.g., stiffness, damping) of the fly ear tailored to achieve its superior localization ability at 5 kHz? ii) Does the fly ear represent an optimal structure for localization at 5 kHz? iii) Does the fly's localization-lateralization scheme represent an optimal way to relieve the size constraint? iv) How can a synthetic device be developed to replicate the optimal characteristics of the fly ear? The answers to these questions will not only help further the understanding of the underlying biophysics of the fly ear's hearing, but they will also lead to a new approach to tackling the long-standing size constraint encountered in engineered sound-source localization systems.

Results

Bio-physics of the fly ear: dual-optimality

Our starting point was a normalized formulation of a lumped parameter model of the fly ear11, shown in Figure 1a. Modeled as two mass-spring-damper systems coupled by a spring-damper combination, the fly-ear structure has two vibration modes (Figure 1b): the rocking mode (the two membranes move 180° out of phase) and the bending mode (the two membranes move in phase)11. Although the lumped parameter model and the mode shapes of the fly ear have been reported in the literature11, we followed an approach different from that used in the existing literature by performing modal analysis12. This allowed us to obtain analytical closed-form expressions for the directional cues of the fly ear in the frequency domain and thus understand how the structural parameters of the ear affect its performance.

To study the localization performance of the fly ear with respect to frequency, the mechanical interaural phase difference (mIPD), instead of the time difference mITD, was chosen as the directional cue for investigation. The mIPD, which is a dimensionless measure directly related to the mITD, is independent of the sound wavelength and speed. In response to a pure tone (frequency f) and incident azimuth angle θ (Figure 1a), the modal analysis showed that

where the modal response ratio Γ and the initial phase difference ϕ are given by

and

Here, η is the natural frequency ratio (f2/f1, the ratio of the bending mode natural frequency f2 to the rocking mode natural frequency f1), χ is the separation-to-wavelength ratio (d/λ, the ratio of the membrane center separation d to the sound-source wavelength λ), Ω is the normalized working frequency (f/f1, the ratio of the sound-source frequency f to the rocking mode natural frequency f1) and ξ1 and ξ2 are, respectively, the damping ratios of the rocking and bending modes. Note that the modal force ratio (rocking mode to bending mode) is jtan(ϕ/2), which indicates that there is a 90° phase difference between the two modal forces. The localization performance depends not only on the value of mIPD, but even more importantly, on the variation of mIPD with respect to the azimuth (i.e., ∂mIPD/∂θ), namely the directional sensitivity (DS), which determines how accurately the fly can pinpoint a source.
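One way to arrive at expressions of this kind from the lumped model is sketched below. It is a reconstruction under assumed conventions (equal-amplitude forces on the two membranes with phases ±ϕ/2, and Γ taken as the rocking-to-bending ratio of the modal frequency responses), so the sign and normalization choices may differ from those of the original analysis.

```latex
\begin{aligned}
\phi &= 2\pi\chi\sin\theta, \\[4pt]
\Gamma &= \frac{\eta^{2}-\Omega^{2}+2j\,\eta\,\xi_{2}\,\Omega}
               {1-\Omega^{2}+2j\,\xi_{1}\,\Omega}, \\[4pt]
\frac{x_{1}}{x_{2}} &= \frac{1+j\,\Gamma\tan(\phi/2)}{1-j\,\Gamma\tan(\phi/2)},
\qquad
\mathrm{mIPD} = \angle\!\left(\frac{x_{1}}{x_{2}}\right).
\end{aligned}
```

Two checks support this form. For an uncoupled system with equal damping (η = 1, ξ1 = ξ2), Γ = 1 and the mIPD reduces to ϕ, the acoustic phase difference. The factor j tan(ϕ/2) multiplying Γ is the modal force ratio stated in the text.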

Putting the fly ear's structural parameters11 into our model, we found that at 5 kHz (the calling song frequency of the fly's host cricket), the fly ear can achieve not only a constant DS for azimuth angles between −30° and 30° but also a higher DS in this azimuth range than that obtainable at other frequencies (e.g., 2 kHz and 8 kHz), as shown in the inset of Figure 1c. To further investigate this result, we defined two new performance metrics that have not been considered in prior studies, the average of DS (ADS) over the azimuth range −30° ≤ θ ≤ 30° and the nonlinearity (NL) of mIPD over this same azimuth range. The ADS represents the slope of a linear approximation of mIPD as a function of θ and NL is the resulting average error of this azimuth estimation (i.e., the deviation from the linear estimate) (see the Supplemental Materials for more details). When these two metrics of the fly ear are plotted in the frequency domain, an interesting result is revealed, as shown in Figure 1c: the minimum NL and the maximum ADS are achieved simultaneously at 5 kHz. This result provides the insight that the fly ear is endowed with a dual optimality characteristic at its working frequency of 5 kHz.
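The two metrics can be illustrated numerically with a minimal sketch. It makes several assumptions. The mIPD is modeled as the phase of (1 + jΓ tan(ϕ/2))/(1 − jΓ tan(ϕ/2)), with ϕ = 2πχ sin θ and Γ the rocking-to-bending modal response ratio. ADS is taken as the slope of a least-squares line through the origin, and NL as the mean absolute deviation from that line. The separation d = 1.2 mm and the sound speed of 344 m/s are assumed values. The exact definitions in the Supplemental Materials may differ.

```python
import cmath
import math


def mipd_deg(theta_deg, chi, eta, omega, xi1, xi2):
    """Mechanical interaural phase difference in degrees (assumed model form)."""
    phi = 2 * math.pi * chi * math.sin(math.radians(theta_deg))
    gamma = (eta**2 - omega**2 + 2j * eta * xi2 * omega) / (
        1 - omega**2 + 2j * xi1 * omega
    )
    r = 1j * gamma * math.tan(phi / 2)
    return math.degrees(cmath.phase((1 + r) / (1 - r)))


def ads_and_nl(chi, eta, omega, xi1, xi2, half_range_deg=30.0, n=61):
    """ADS (deg/deg) and NL (deg) over -half_range..+half_range.

    ADS: least-squares slope of a line through the origin (assumed definition).
    NL: mean absolute deviation of mIPD from that line (assumed definition).
    """
    thetas = [-half_range_deg + 2 * half_range_deg * i / (n - 1) for i in range(n)]
    cues = [mipd_deg(t, chi, eta, omega, xi1, xi2) for t in thetas]
    slope = sum(t * c for t, c in zip(thetas, cues)) / sum(t * t for t in thetas)
    nl = sum(abs(c - slope * t) for t, c in zip(thetas, cues)) / n
    return slope, nl


# Natural frequencies and damping ratios quoted in the text for the fly ear.
F1_HZ, F2_HZ = 7120.0, 31000.0
XI1, XI2 = 0.89, 1.23
D_M = 1.2e-3  # assumed separation between membrane centers (hypothetical value)
C_AIR = 344.0  # assumed speed of sound in air, m/s


def fly_ear_metrics(freq_hz):
    chi = D_M * freq_hz / C_AIR
    return ads_and_nl(chi, F2_HZ / F1_HZ, freq_hz / F1_HZ, XI1, XI2)


for f in (2000.0, 5000.0, 8000.0):
    ads, nl = fly_ear_metrics(f)
    print(f"{f / 1000:.0f} kHz: ADS = {ads:.2f} deg/deg, NL = {nl:.3f} deg")
```

Sweeping `freq_hz` over a finer grid gives frequency spectra of the two metrics; how closely they follow Figure 1c depends on the assumed definitions and parameter values.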

We further explored how the structural parameters of the fly ear are tailored to achieve such a dual optimality characteristic and whether a synthetic device endowed with the fly ear's dual optimality characteristic could be developed. An optimization problem was formulated to seek solutions that met the objective of achieving, simultaneously, minimal NL and maximal ADS at the selected working frequency over the azimuth range −30° to 30°. As noted previously, there are several key dimensionless parameters that influence NL and ADS: the natural frequency ratio η, the separation-to-wavelength ratio χ and the damping ratios ξ1 and ξ2. In Figure 1d (i), for the fly ear's damping parameters (ξ1 = 0.89, ξ2 = 1.23), the rocking and bending mode natural frequencies that ensure the dual optimality characteristic are plotted as a function of χ. For a given working frequency and/or device size, this plot shows the natural frequency combinations that are required for optimal performance. Based on the fly-ear geometry and its working frequency of 5 kHz and following the two curves in Figure 1d, the natural frequencies required to achieve the dual optimality are obtained (6.99 kHz for rocking mode, 30.10 kHz for bending mode); these predictions are in excellent agreement with the experimental data reported in the literature (7.12 kHz and 31.00 kHz)11. This finding provides the basis for making the following statement: the fly ear represents an optimal structure that can simultaneously achieve the maximum DS and the minimum NL at its working frequency of 5 kHz.

Furthermore, we found that to achieve the dual optimality characteristic, contributions from both the rocking and the bending modes are necessary. Note that the natural frequency ratio η is related to the stiffness ratio σ = k3/k1 by η² = 1 + 2σ, where σ quantifies the coupling strength between the two membranes. As shown in Figure 1e for the fly ear's separation-to-wavelength ratio, if the coupling is soft (η = 2), the phase difference is somewhat larger than that for the uncoupled case (dashed blue line), but it is still insignificant. On the other hand, when the coupling is stiff (η = 20), mIPD is greatly amplified, but it saturates rapidly to ±180° when θ is slightly off the 0° midline, making it impossible to distinguish between azimuth angles. Figure 1f shows the NL and ADS for these two cases. For soft coupling, ADS is small at all frequencies and there is no maximum. For stiff coupling, the maximum ADS occurs at the rocking mode frequency, but the nonlinearity is actually highest there. Thus, for both stiff and soft coupling, dual optimality (max ADS and min NL) cannot be achieved. Only for medium coupling (η = 4.36) can the fly ear achieve a balance between ADS and NL, rendering the dual optimality at its working frequency (see Figure 1c). This suggests that the structural parameters of the fly ear have adapted in the course of evolution to give a proper coupling strength for achieving the dual optimality characteristic.
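To give a feel for the coupling strengths involved, the relation η² = 1 + 2σ can be inverted for the four scenarios of Figure 1e. The function names below are illustrative only.

```python
import math


def stiffness_ratio(eta):
    """Coupling stiffness ratio sigma = k3/k1 from eta**2 = 1 + 2*sigma."""
    return (eta**2 - 1) / 2


def frequency_ratio(sigma):
    """Natural frequency ratio eta = f2/f1 from the same relation."""
    return math.sqrt(1 + 2 * sigma)


scenarios = {"uncoupled": 1.0, "soft": 2.0, "medium (fly ear)": 4.36, "stiff": 20.0}
for label, eta in scenarios.items():
    print(f"{label}: eta = {eta}, sigma = {stiffness_ratio(eta):.2f}")
```

Under this relation the medium case corresponds to a bridge roughly nine times stiffer than a single membrane suspension, whereas the stiff case corresponds to a ratio of about 200.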

The dual optimality provides a basis for understanding the fly's superior directional hearing as well as its localization-lateralization scheme. First consider DS. As evident from Figure 1e, although the absolute value of the mIPD is maximal at the two extreme positions (θ = ±90°), the corresponding DS is close to zero at these positions and the maximal DS is actually achieved at the midline. Therefore, the fly naturally turns its head towards the source so that the maximum DS can be achieved for best localization precision. Evidence for the importance of DS has also been seen in the Egyptian fruit bat, which uses not the maximal sonar beam intensity but its maximal slope for target localization13. Second, when considering NL, mIPD is a linear function of azimuth in the range −30° to 30°, in agreement with the sigmoidal relationship of the fly's turning speed with respect to the azimuth7. Given limited neural processing ability, a linear and maximal DS will help the fly perform the localization task faster and more accurately. Third, our model also shows that the range −30° to +30° is optimal (see the Supplemental Materials for more details). For a wider range of angles, there is no improvement in ADS and NL becomes too large for accurate localization. For a narrower range of angles, there is no obvious improvement in ADS and NL, and furthermore the narrower range would require the fly to make more frequent turns. Therefore, it is not only the mechanical coupling mechanism that helps the fly ear obtain significantly amplified directional cues11, but more importantly, the structural parameters of the fly ear have been tailored to achieve the dual optimality characteristic at 5 kHz; this facilitates a unique localization-lateralization scheme for the fly that allows it to overcome its small size constraint and accurately pinpoint its host.

Mimicking the fly ear's dual optimality

Based on Figure 1d, the fly ear parameters are not the only ones that ensure dual optimality. For a given separation-to-wavelength ratio χ, the required natural frequencies for dual optimality can be obtained. Note that the plot in Figure 1d only covers χ from 0.01 to 0.04. When χ is larger (i.e., larger device size or higher working frequency), dual optimality cannot be achieved. However, amplification is not needed in that case: a system without mechanical coupling will have sufficient directional cues for localization. At the other end of the spectrum when χ < 0.01 (smaller size or lower frequency), an optimal structure can be found, but the amplified phase difference (mIPD) will still be too low for accurate sound-source localization. Furthermore, as noted previously, the result in Figure 1d(i) was obtained for large damping factors (ξ1 = 0.89, ξ2 = 1.23). Similar dual optimality can also be achieved for low damping (ξ1 = 0.18 and ξ2 = 0.05, see Figure 1d (ii)). Therefore, the results of Figure 1d provide a framework that enables the creation of synthetic devices with dual optimality that can be tailored to work at any chosen frequency or with any size. For example, by using Figure 1d (ii), one can design a low damping synthetic device (χ ≈ 1/35.8 ≈ 0.028) with a membrane center-to-center separation of 1.2 mm. As can be seen from Figure 1g, this device indeed possesses dual optimality at the designed working frequency of 8 kHz.
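The separation-to-wavelength ratio of a candidate design follows directly from χ = d/λ = d·f/c. The short sketch below evaluates it for the 1.2 mm, 8 kHz device, assuming a speed of sound of 343 m/s in air; the function name is illustrative.

```python
def separation_to_wavelength(d_m, f_hz, c=343.0):
    """chi = d / lambda = d * f / c, with c an assumed speed of sound (m/s)."""
    return d_m * f_hz / c


chi_device = separation_to_wavelength(1.2e-3, 8000.0)
in_plotted_range = 0.01 <= chi_device <= 0.04  # range covered by Figure 1d

print(f"chi = {chi_device:.4f} (1/chi = {1 / chi_device:.1f}), in range: {in_plotted_range}")
```

With these assumptions χ is about 0.028, that is, the wavelength is roughly 36 times the membrane separation, and the design lies inside the 0.01 to 0.04 range covered by Figure 1d.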

On the basis of our new understanding of the fly ear mechanism and our previous work on a large-scale prototype14, we developed a fly-ear sized micro-electro-mechanical system (MEMS) sensor to represent the low damping scenario of Figure 1g, which was designed to achieve dual optimality at an 8 kHz working frequency.

Since the discovery of the mechanical coupling mechanism in the fly ear11, there have been a multitude of research efforts devoted to the development of fly-ear inspired acoustic sensors15,16,17,18,19,20,21,22. In particular, Miles and his co-workers have presented pioneering work on the development of miniature pressure gradient microphones15,16,17,22. In their design, a rigid plate supported on a flexible pivot was employed to sense minute pressure gradients with typical device dimensions of 1 mm × 2 mm. These devices were designed to operate near the rocking mode natural frequency. By design, the natural frequencies of the bending and rocking modes were well separated and the response was dominated by the rocking mode. The plate response was measured by using diffraction-based optical detection. The response pattern resembled the figure-eight directivity pattern of a conventional pressure gradient microphone23. In another study, Touse et al. presented a similar device with two square wings (1 mm²) connected by a 500 μm wide bridge18. Different from the earlier study16, the device response was dominated by the bending mode. A capacitive readout with comb fingers was employed to detect the vibration of the wings. This device was later modified to include asymmetric wings19 so that approximately equal response components were obtained at the natural frequencies of the rocking and bending modes. The goal was to be able to operate the device at either the rocking mode or the bending mode. Detection of the wing response was not achieved by using an integrated readout system, but instead, by using an external laser vibrometer.

It should be noted that all of the above-mentioned devices were designed to operate near either the rocking mode natural frequency15,16,17,19 or the bending mode natural frequency18,19. By contrast, as discussed previously, the fly-ear structure needs to have a proper combination of response components associated with both the rocking mode and the bending mode to realize the dual optimal performance. Furthermore, in the previous devices, the diaphragm deflection magnitudes were used to determine sound azimuths. Because the diaphragm deflection amplitude depends on the input sound intensity as well as on the sound azimuth, these devices have to be combined with a separate omni-directional microphone that measures the sound pressure in order to determine the sound azimuth unambiguously.

Our sensor device differs conceptually from the previously reported devices that have used only one of the two vibration modes. The current sensor is intended to mimic the dual optimality of the fly ear that requires proper response contributions from both the rocking and bending modes. The device consists of two clamped circular membranes and a coupling bridge pivoted in the middle, which connects the centers of the two membranes (Figure 2a). This device configuration ensures that the superposition of the out-of-phase and in-phase response components is realized by the mechanical structure itself, which closely resembles the fly ear. In order to achieve the dual optimality at 8 kHz, the structural parameters of the sensor have been chosen so that its rocking and bending natural frequencies are 9.47 kHz and 20.20 kHz (the two red dots in Figure 1d (ii)), respectively. Furthermore, our device represents a binaural hearing device, which makes use of the interaural directional cue of mIPD to determine the sound source azimuth angle. Since mIPD is independent of sound intensity, sound localization can be performed with a single device. In addition, a low-coherence fiber optic interferometer24 (Figure 2b) was used to detect the acoustic-pressure-induced membrane deflection with high sensitivity, high resolution and low noise (see the Supplemental Materials for details on the MEMS device, its fabrication and the measurement system). The fully-assembled device is shown in Figure 2c.

Figure 2

Fly-ear inspired sensor.

(a), Cross-sectional view of the sensor, which has four layers: (1) device layer, (2) perforated holes layer, (3) back chamber layer and (4) back plate layer. (b), Low-coherence fiber optic interferometer for detecting membrane vibration. (c), Photo of the assembled prototype shown next to a kitchen match. The length of the scale bar is 2 mm.

To characterize the device performance, the directional cue mIPD was obtained for different sound frequencies and incident azimuths in an anechoic chamber (see Figure 3a(i)). Least-squares fitting of the experimental data was used to obtain the natural frequencies and damping ratios. The rocking mode and bending mode natural frequencies were 9.75 kHz and 22.00 kHz, which are close to the designed values of 9.47 kHz and 20.20 kHz. The experimental results (Figure 3a (i)) were in good agreement with the numerical simulations (Figure 3a(ii)). In addition, the mode shapes were measured using a laser scanning vibrometer (Polytec MSA-500) (see Figure 3b), which confirmed the rocking and bending modes at the designed frequencies.

Figure 3

Characterization of the fly-ear inspired sensor.

(a), Phase difference as a function of frequency and incident azimuth: (i) experiments and (ii) simulations. (b), Two vibration modes obtained with a laser scanning vibrometer. (c), Average directional sensitivity (ADS) and nonlinearity (NL) as a function of frequency (circles and squares for experimental results and solid lines for simulation results). (d), Phase difference mIPD as a function of azimuth at the optimal working frequency 8 kHz (red circles for experimental results, green solid lines for simulation results). (e), An example of the bio-inspired localization-lateralization scheme. With an initial azimuth of 80° for the sound source (in the lateralization range), the fly-ear inspired sensor is rotated until the source falls in the linear (localization) range of the sensor, at which a final turn is made to pinpoint the source.

The variations of ADS and NL with respect to frequency were obtained from Figure 3a and are plotted in Figure 3c. The device exhibited dual optimality at 8 kHz, as designed. At this frequency, mIPD was a linear function of θ in the range −30° ≤ θ ≤ 30° (Figure 3d). ADS, the slope of mIPD in this azimuth range, was estimated to be 1.69 deg/deg, which is 10 times the DS of the uncoupled case at the midline (0.17 deg/deg). With a conventional microphone pair, a tenfold increase in directional sensitivity is only obtainable by increasing the separation tenfold.
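The uncoupled reference value can be checked with a short calculation. For two uncoupled receivers the phase difference is 360°·χ·sin θ, so its slope with respect to azimuth (in deg/deg) is 2πχ·cos θ. The sketch below assumes a separation of 1.2 mm, a frequency of 8 kHz and a sound speed of 343 m/s; the function name is illustrative.

```python
import math


def uncoupled_ds(chi, theta_deg=0.0):
    """Directional sensitivity (deg/deg) of an uncoupled pair at azimuth theta."""
    return 2 * math.pi * chi * math.cos(math.radians(theta_deg))


chi_device = 1.2e-3 * 8000.0 / 343.0   # d * f / c with an assumed c
ds_midline = uncoupled_ds(chi_device)  # slope at theta = 0
gain = 1.69 / ds_midline               # measured ADS relative to the uncoupled slope

# Separation an uncoupled pair would need to reach the measured ADS
equivalent_separation_mm = 1.2 * gain

print(f"uncoupled DS = {ds_midline:.3f} deg/deg, gain = {gain:.1f}, "
      f"equivalent separation = {equivalent_separation_mm:.1f} mm")
```

Because the uncoupled slope scales linearly with χ, and hence with the separation, the amplification factor translates directly into the separation that a conventional microphone pair would require.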

The damping level affects the directional sensitivity and robustness to perturbations. The fly ear exhibits better robustness than the MEMS device to perturbation of structural parameters and variation of input sound frequency because of the ear's higher damping. This can be seen by comparing the sharpness of the peak of ADS and the dip of NL obtained for the fly-ear structure (Figure 1c) with that obtained with the low-damping device (Figure 1g). The smoother ADS peak allows the fly ear to achieve a robust localization despite frequency variations in the cricket calling song (4.6 to 5 kHz)10. On the other hand, low damping gives the MEMS device a higher ADS and better frequency selectivity (due to the sharp peak in the ADS spectrum), which can be advantageous in applications that require high directional sensitivity or exceptional frequency selectivity.

To take full advantage of the MEMS device's dual optimality characteristic, we further developed a control scheme that was inspired by the fly's localization-lateralization scheme for pinpointing the sound source. In this fast, simple, but accurate control scheme, as shown in Figure 3e, mIPD as a function of θ was approximated by a sigmoid relationship, where −30° ≤ θ ≤ 30° is the linear range. When the sound source is out of the linear range, the sensor is rotated repeatedly in constant steps of 20° towards the direction of the source (lateralization) until the source falls in the linear range and then the sensor pinpoints the source by using the estimated source location (localization). With at most four iterations, we demonstrated a localization accuracy better than ±2° (the same as the fly ear (4)) in our indoor laboratory environment. (See the video in the Supplemental Materials.)
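The logic of this scheme can be sketched in a few lines. The simulation below is a simplified, noise-free illustration and not the controller used in the experiments. It assumes an idealized cue that is linear with slope ADS inside ±30° and flat outside, and it treats a cue at the saturation level as a lateralization reading; all names are hypothetical.

```python
import math

ADS = 1.69           # deg/deg, measured slope quoted in the text
LINEAR_LIMIT = 30.0  # deg, half-width of the linear (localization) range
STEP = 20.0          # deg, constant turn used during lateralization


def cue_model(offset_deg):
    """Idealized mIPD in degrees: linear inside the range, saturated outside."""
    clipped = max(-LINEAR_LIMIT, min(LINEAR_LIMIT, offset_deg))
    return ADS * clipped


def localize(source_az_deg, max_turns=20):
    """Rotate the sensor until it points at the source; return heading and turns."""
    heading = 0.0
    turns = []
    for _ in range(max_turns):
        cue = cue_model(source_az_deg - heading)
        if abs(cue) < ADS * LINEAR_LIMIT:
            # Localization: invert the linear cue to estimate the remaining offset.
            turn = cue / ADS
            heading += turn
            turns.append(turn)
            break
        # Lateralization: only the sign of the cue is used.
        turn = math.copysign(STEP, cue)
        heading += turn
        turns.append(turn)
    return heading, turns


final_heading, turn_history = localize(80.0)
print(f"final heading = {final_heading:.1f} deg after {len(turn_history)} turns")
```

For a source initially at 80°, as in the example of Figure 3e, this idealized loop makes three 20° lateralization turns followed by one localization turn. In practice, measurement noise and the deviation of the real cue from the piecewise-linear model would limit the final accuracy.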

Discussion

In summary, our results provide new insight into the fly ear's directional hearing ability and a new paradigm for developing fly-ear inspired sensors. By defining two new performance metrics, ADS and NL, we discovered that the fly ear possesses a unique dual optimality, which indicates that the structural parameters of the fly ear have been optimized (i.e., a proper contribution of rocking and bending modes ensures the right coupling) for localization at the specific frequency of 5 kHz. Furthermore, we showed that this distinguishing dual optimality attribute is replicable in a synthetic device that can be tailored to have any desired working frequency or device size. Finally, we demonstrated for the first time a fly-ear sized device with the same localization accuracy as the fly ear, which has not been achieved with any prior devices of comparable size.

This work enables a new sensing paradigm that will impact many applications requiring miniature acoustic arrays. For example, the sensors can be used for acoustic communication and navigation in micro-air-vehicles (MAVs)25, in which the space to mount the sensors is so confined that only small devices are feasible. These sensors can also lead to promising solutions with reduced size and improved performance for ear canal hearing aid devices22. Furthermore, this new sensing paradigm is promising for underwater sound-source localization, where localization devices must be larger than in air because the speed of sound in water is more than four times that in air. We also envision using an array of these tiny sensors tailored to work at different frequencies to cover a wide sound frequency range, achieving broadband sound localization. This will transform sound localization systems, which currently rely on large microphone arrays.

Methods

MEMS prototype

The MEMS device has four layers (see Figure 2). Layer 1 consists of two circular polysilicon membranes (diameter of 1.1 mm and thickness of 0.5 μm) and one SiO2/Si3N4 bridge (width of 300 μm and thickness of 3.2 μm). Layer 2 has eight 60 μm diameter perforated holes for damping tuning and one 500 μm diameter hole for optical fiber guiding under each membrane. Layer 3 is for creating a back chamber and Layer 4 is for guiding optical fibers. The four layers are bonded by using a thermoplastic layer deposited on one surface of each mating pair.

MEMS device fabrication

A brief description of the fabrication process is provided here for Layer 1 (see Figure 2). On top of a silicon-on-insulator (SOI) wafer, a photoresist sacrificial layer was deposited and patterned, followed by PECVD of the coupling beam, which consists of alternating layers of SiO2 and Si3N4. The coupling beam was patterned with a second layer of photoresist and etched by reactive ion etching (RIE). A photoresist layer was patterned on the backside of the wafer to define the membrane geometry. Then, the silicon wafer was etched through deep reactive ion etching (DRIE) until reaching the SiO2 etch stop layer. By using the same mask, the SiO2 layer was removed by RIE. The sacrificial photoresist was removed with an isotropic oxygen plasma ash process.

Optical detection system

The low-coherence fiber optic interferometer system (see Figure 2) consists of a super-luminescent diode (SLD) (O/E Land Inc, OELED-100), two Fabry-Pérot (FP) sensing interferometers formed between each membrane and the corresponding fiber tip, two FP tunable filters (Micron Optics, FFP-TF2) used as read-out interferometers and two photo-detectors (New Focus, Model 2011). The optical path difference (OPD) of both sensing and read-out interferometers is about 120 μm, which is much longer than the coherence length of the SLD. In order to achieve maximum sensitivity, biases are applied to the tunable filters so that the initial working positions are at quadrature points.