Introduction

The pressure difference across the sensory tissue of the cochlea, the organ of Corti complex (OCC, Fig. 1), produces vibrations that ultimately give rise to the sensation of sound. The OCC motions are boosted by a nonlinear active process that enables sound processing over a broad range of frequencies and intensities1. Somatic motility of the mechanosensory outer hair cell (OHC) is largely accepted as the key mediator of the active cochlear mechanism1,2,3. The electromechanical properties of the OHCs convert electrical energy, stored within a metabolically-maintained resting potential inside the cochlea, into mechanical energy. The active process causes a nonlinear response such that basilar membrane (BM) gain relative to the stapes motion is on the order of ten thousand at low sound pressure levels (SPLs) and on the order of a 100 at high SPLs4,5,6. This nonlinearity compresses the dynamic range by two orders of magnitude, maintaining sub-nanometer sensitivity for threshold-level sounds while protecting the delicate sensory microstructures at high sound pressure levels. The onset of nonlinearity in BM responses measured at mid-to-high characteristic frequency (CF) locations occurs at frequencies about one-half octave below the CF and extends to slightly above CF in most rodents4,6,7,8,9. Experiments in living cochleae6,10,11,12,13, and in vitro14 have demonstrated that OHCs generate forces over a wide frequency range. The nonlinear, frequency-location-specific BM-motion enhancement, sometimes termed the cochlear amplifier, is studied in the current paper.

Figure 1
figure 1

Organ of Corti complex (OCC), refers to the cellular organ of Corti and the acellular tectorial and basilar membranes (TM and BM, respectively).

One motivation for the present work stems from Dong and Olson5, who explored cochlear amplification by measuring sound-evoked electrical and mechanical responses in vivo. A specialized dual-pressure-voltage sensor was used to measure the scala tympani (ST) voltage and acoustic pressure (\(P_{st}\)) simultaneously at the same location close to the BM. Further, pressure differences were used to make an approximate measurement of BM displacement. We will denote the ST voltage measured close to the BM as the local cochlear microphonic (LCM); this potential closely tracks the transducer current flowing through OHCs in the vicinity of the electrode15. The contribution by inner hair cells (IHCs) is expected to be relatively small16.

Figure 2A,B revisits frequency response results from the Dong and Olson study (experiment wg165)5. In this figure, the LCM evoked by different SPL (thin lines) is compared to a pressure-based estimate of the BM displacement (thick lines). Because of the approximate nature of the pressure-based displacement measurement, we also compare the LCM to BM displacement data acquired in a different manner in Fig. 2C,D. Here, the LCM from wg165 is compared to a separate BM displacement data set from the same lab, but from a different animal (GB800), gathered using a laser-based interferometric method (but without simultaneous LCM). The method is described in12,17 and briefly in the “Materials and methods” section. The measurement location for both experiments is similar. The data are shown on a frequency axis that is normalized to the CF of each preparation, 23 kHz for wg165 and 26 kHz for GB800.

The aspect of the data that is of primary interest is the phase of the LCM relative to BM displacement in Fig. 2B,D. The LCM phase rides along with the displacement phase up to a frequency of \(\sim 0.76f_{CF}\), but then shifts upward to lead the BM phase. At a frequency \(\sim 0.9f_{CF}\), the lead of LCM re BM displacement is \(\sim 0.27\) cycle for the simultaneous wg165 LCM-pressure data of Fig. 2B, and \(\sim 0.4\) cycle for the data of Fig. 2D. Comparisons of the LCM phase to either the simultaneously measured wg165 BM displacement phase data or the data from a different animal (GB800) both show that the displacement and LCM phase track closely up to the bifurcation point at \(\sim 0.76f_{CF}\). This consistency confirms that this finding is not related to the non-standard displacement measurement in experiment wg165. Moreover, this correspondence shows that we can use the phase information from one gerbil’s BM displacement as a reference for the LCM of a different animal, provided the CFs are nearby as in Fig. 2. The second interesting aspect of the data is that the magnitude of the LCM exhibits a notch corresponding to the phase shift at \(\sim 0.76f_{CF}\). This is clear at SPLs up to 60 dB SPL, but less so at 70 dB SPL.

The significance of the phase shift was analyzed in5, and shown to correspond to a transition from non-effective forcing by OHC somatic forces below the shift frequency (here \(0.76f_{CF}\)) to effective forcing and power generation in the CF region. This analysis used previously-published relationships between OHC somatic force and OHC voltage14 and between LCM (representing OHC current) and OHC voltage18. The conclusion that above the transition frequency OHC forcing becomes effective on the BM is consistent with the observation that the BM gain curves bifurcated from their sub-CF linear backbone at \(\sim 0.76f_{CF}\). Thus, cochlear amplification, defined as the nonlinear peaking of the BM gain, began at the frequency of the LCM amplitude notch and phase shift relative to BM motion; these two transition frequencies coincided. The existence of a transition frequency that marks the onset of nonlinearity is commonly seen in the BM response to acoustic stimulation. It is seen for example, in gerbils7, guinea pigs8,9 and mice6 (see Table 1). The study from mice6 compared stereocilia pivoting motion to BM transverse motion, and those measurements are particularly relevant to the present study, because stereocilia pivoting is closely related to HC transducer current18.

The physical basis for the amplifier-activating phase shift is the subject of the current paper. We present simulations from a realistic cochlear model to explore the basis for the experimental findings. Many models of the cochlea have been developed. They include lumped element circuit models with and without active elements19,20,21, and detailed finite element method (FEM) models22,23,24,25. Our model couples a 3D FEM-based treatment of the fluid with a circuit model of the sensory tissue. The key feature of our model is that it is fully electromechanically coupled. Mechanical motion activates the mechanoelectric transduction in the OHC hair bundles which gives rise to the current that drives OHC basolateral electromechanical forcing. The amplitude and phase of the OHC activity are not a priori assumed, but rather are an output of the model. Hence, this model enables us to seek a causal relation (inside the model) between output responses and mechanisms. In addition to showing model results, we reinforce the experimental finding by showing new LCM data that confirm the findings of phase shift and amplitude notch, and extend the previous findings into a lower frequency region. Additional supportive experimental evidence has already been published12,26,27. All of these results are from gerbil. A similar LCM amplitude notch and phase shift was observed in 1976 work in guinea pig28 but was not apparent in the more recent work15. Further studies are needed to determine the species generality of the findings.

Figure 2
figure 2

(A, B) Previously published data5 (exp. wg165) in which LCM and displacement were measured using a dual pressure-voltage sensor. All quantities are referenced to EC pressure. (A) Magnitude of LCM (thin) and BM displacement (thick). (B) Phase of LCM (thin) and BM displacement (thick). (C, D) Here LCM from wg165 are plotted with displacement found with a laser-based method (exp. GB800). These panels are included to reinforce the findings of (A, B). (C) Magnitude of LCM (thin) and BM displacement (thick). (D) Phase of LCM (thin) and BM displacement (thick). Positive displacement direction is up in Fig. 1, in the direction of the scala media. (Note: In the simultaneously gathered data of (A) and (B), a peak at \(\sim\) 0.4 CF and dip at \(\sim\) 0.5 CF are apparent in both displacement and LCM. Such a feature is not generally present; it might be due to a sound calibration or middle ear resonance in this preparation).

Table 1 Comparison frequency of the onset of nonlinearity (\(f_{NL}\)) to the CF of the measurement location from different experimental data.

Materials and methods

Experimental measurements of local cochlear microphonic and BM motion

This paper is primarily a modeling paper, with experimental data included to bolster previous experimental findings. The wg165 data of Fig. 2 and the BM motion data of Fig. 429 were previously published. Other data from these figures are unpublished, although similar LCM and OCT-based displacement data have been presented and methods fully described in recent work12,17. To keep the focus on the modeling results, the description of experimental methods for the unpublished data is kept short. Procedures were performed in accordance with the animal use protocol approved by the Columbia University Institutional Animal Care and Use Committee. Young adult gerbils were sedated with ketamine and anesthetized with pentobarbital, with supplemental dosing throughout the experiment and the analgesic buprenorphine was given every 6 h. Animals were euthanized at the end of the experiment. The stimulus generation and acquisition were performed using MATLAB-based programs and a Tucker Davis Technologies (TDT) System. Sound stimulation was generated via an electrically shielded Fostex dynamic speaker, connected in a closed-field configuration to the ear canal (EC). The sound calibration was performed within the EC using a Sokolich ultrasonic probe microphone. Pure tone stimuli were used for LCM measurements and multi-tone stimuli were used for the BM motion measurement of Fig. 2. The pure tone stimuli had a frequency spacing of 500 Hz. The multi-tone stimulus was a Zwuis tone complex composed of 60 frequencies spanning 1–35 kHz. In the Zwuis stimulus type the distinct stimulus frequencies are chosen such that there is no overlap between the second and third order distortion products and the primary frequencies30. The phase response that is the focus of the current paper is not significantly affected by multi-tone versus single-tone stimulation12,30.

Local cochlear microphonic

For LCM measurements, after opening the bulla, a hole of diameter \(\sim \;100\;\upmu\)m was hand-drilled to access ST through the bony wall of the first turn of the gerbil cochlea where the CF was 15–25 kHz. A polymer-coated tungsten electrode (FHC Inc., Bowdoin Maine) with shank diameter 250 \(\upmu\)m and tip diameter \(\sim \;1\;\upmu\)m, held in a motorized micromanipulator (Marzhauser) was inserted into the hole and used to measure voltage responses to acoustic stimuli. (In5 the voltage sensor was an insulated wire electrode adhered to the side of a pressure sensor.) The impedance of the electrode was 1–5 M \(\Omega\) when measured at 500 Hz. The metal electrode had a broad-band frequency response, and thus no correction due to low-pass filtering by the electrode was needed5,31. The voltage was amplified \(\times \;500\) or 1000 by a PARC \(EG \& G\) amplifier. A reference electrode was placed on the muscle at the neck. Once within the cochlea the electrode was advanced in steps toward the BM and the responses to acoustic stimulation measured. When traveling wave responses were detected through several cycles, the measurement was deemed “local”.

BM motion

A commercial ThorLabs Telesto III spectral domain optical coherence tomography (SD-OCT) system was used to measure the vibrations of the BM through the intact round window membrane. OCT-based measurements are a laser-based motion measurement system and the measured displacement is based on the strict physical quantity of light wavelength. In Fig.2C,D we included one data set taken with the Telesto, to support the results of Fig.2A,B, in which the displacement was measured with a less stringent method, via fluid pressure differences. Data acquisition and analysis scripts were written in Matlab (R2016b) and in C++, based on the Thorlabs Software Development Kit. To make a measurement with the Telesto, first a two-dimensional scan, termed a B-scan, was taken across the radial direction of the OCC to image a radial section of the BM and OC (as in the schematic of Fig. 1). Then scanning was arrested so that the OCT collected data along one axial line, termed an A-scan. In the schematic of Fig. 1, the A-scan would be approximately vertical, running through the BM, OC, TM at one radial location12,17. We then acoustically stimulated the ear with a multi-tone stimulus, while acquiring a series of A-scans at a sample rate of \(\sim 100\) kHz. Selected locations in the A-scan were chosen for extraction of displacement vs. time. Locations along the A-scan include structures within the OC, but for the purposes of the the present paper, the motion at an A-scan pixel corresponding to the BM was shown in Fig. 2.

Combining LCM and BM motion from different experiments

In Figs. 2C,D and 4, LCM and BM motion responses from different animals are compared. In order to make the comparison, measurements with reasonably similar CFs were paired and the frequency axis was normalized by CF. A BM motion data set with BF of 26 kHz was used for Fig. 2C,D and a single BM motion data set, with BF of 15.5 kHz was used for all the comparisons in Fig. 4. The f/CF normalization is the only manipulation required for Fig. 2C,D. For the results of Fig. 4, the approximately CF-matched results came from BM motion presented in29. In that data set BM velocity was measured relative to stapes, and a 25 \(\upmu\)s middle ear delay32 and 0.25 phase shift is applied to convert the data to displacement relative to EC pressure.

Mathematical model

To model the LCM and BM motion in response to acoustic stimuli, we made minor modifications to an existing physiologically-based three dimensional mathematical formulation (originally used to model the guinea pig cochlea)33,34. This model can be considered a generic model for studying active mechanisms in mammals, as it has been used to investigate different animal models including genetically manipulated mice34,35 and both in vitro25 and in vivo36,37 gerbil preparations.

Next, we briefly recount the main features of the model as described in previous papers33,34. Figure 3 shows a schematic of the cochlear box model (panel A) along with the OCC components (panels B and C). In Fig. 3A, the macroscopic fluid-structure configuration is shown. The scala vestibuli (SV) and scala media (SM) fluids are taken together for fluid-mechanical purposes. Electrical cables are present in each of the fluid scala (SV, SM, and ST), and are used to model the ionic current flow in each (see Fig. S1-A in the Supplemental Information and references33,38). Figure 3B shows a cross-sectional view of the OCC and surrounding cochlear fluids. The difference in the ST and SV pressure across the BM causes the OCC to vibrate in response to excitation at the stapes. Structural longitudinal coupling is included in the BM and TM mechanics34. The fluid is modeled as inviscid and incompressible22,34, except in the subtectorial space where viscosity is incorporated through fluid shearing between the TM and the reticular lamina (RL). In addition, a small amount of structural damping of the BM and TM is also included. As schematically shown in Fig. 3B, the microstructural components of the OCC are coupled through forces and kinematic constraints. The TM is anchored to the spiral limbus via a spring with stiffness \(k_{tms}\) and connected to the hair bundles (HB) of the OHCs through a stiffness \(k_{hb}\) (shown as a spiral spring at the base of the HB in Fig. 3B). The three primary independent structural variables shown in Fig. 3B are the displacement of the BM (\(u_{bm}\)), the shear displacement (\(u_{tms}\)) and the transverse displacement (\(u_{tmb}\)) of the TM. The other structural displacements are related to the primary variables through kinematic constraints33. We used a Lagrangian approach39 to obtain the Euler–Lagrange equations as a function of the primary structural variables as well as the coupling to fluid and electrical variables. This procedure results in the coupled equations governing the OCC motion33.

Figure 3
figure 3

(A) The cochlear box model. For visual clarity the organ of Corti is only pictured at one cross section. These structures are distributed down the length of the cochlea. (B) A schematic of the transverse section of the OCC microstructure. The hair bundles of the OHCs are shown connecting the RL to the TM through torsional springs (spring constant \(k_{hb}\)). The TM is anchored to the spiral limbus via dashpot and spring \(k_{tms}\). As outlined in the text, we used a Lagrangian formulation to obtain the coupled equations of motion relating the electrical and mechanical degrees of freedom of OCC to the fluid pressure. (C) Schematic of an OHC and the mechano-electric-transducer (MET) apparatus. The OCC cross-sections are coupled mechanically through longitudinal coupling in the TM and BM, the 3D-fluid pressure, and through electrical coupling (the three cable model in the scala). (TM tectorial membrane, OHC outer hair cell, RL reticular lamina, BM basilar membrane).

Figure 3C shows the local electrical circuitry of the OHC and the coupling of the tip displacement the hair bundle (HB) relative to the RL and somatic strain to the electrical domain. The deflection of the HBs triggers the opening of the MET channels resulting in current flow, \(I_{hb}\), into the OHC. This nonlinear process has been linearized as in40:

$$\begin{aligned} I_{hb}= \mu {\Delta V}^{0}G^{max}u_{hb}, \end{aligned}$$
(1)

where \(G^{max}\) is the maximum saturating conductance of the HB and \(\Delta V^0\) is the resting value of the voltage difference between scala media (SM) and intracellular OHC potential. The MET scaling factor, \(\mu\), controls the sensitivity of the MET channels; it varies from 0, representing a nearly passive model, to 1, fully active (on the stability boundary). The model is quasi-linear in that varying \(\mu\) simulates the SPL-dependent saturating nonlinearity of the MET channels. This quasi-linear approximation has been shown to be a good approximation with pure tone acoustic input40. The OHC HB tip displacement relative to the RL (\(u_{hb}\)) is a linear combination of the BM transverse (\(u_{bm}\)), TM shear (\(u_{tms}\)) and TM bending (\(u_{tmb}\)) displacements (see33 and Fig. 3.)

Following references33,41,42,43 the linearized, coupled electromechanical relations governing the total axial compressive force (\(F_{ohc}\)) and transmembrane current (\(I_{ohc}\)) of the OHC are given by

$$\begin{aligned} {\begin{matrix} &{}F_{ohc}=K_{ohc} u_{ohc}^{comp}+\varepsilon _3(\phi _{ohc}-\phi _{st}), \\ &{}I_{ohc}=(\phi _{ohc}-\phi _{st})/Z_m- \varepsilon _3 \frac{\text {d}u_{ohc}^{comp}}{\text {d}t}, \end{matrix}} \end{aligned}$$
(2)

where \(u_{ohc}^{comp}\) is the OHC compression (the difference between the displacement of the apical and basal poles of the OHC), (\(\phi _{ohc} - \phi _{st}\)) is the transmembrane potential (the difference between the intracellular potential \(\phi _{ohc}\) and extracellular potential \(\phi _{st}\)). The transduction current, \(I_{hb}\), activates the OHC somatic motility, which applies a mechanical force on the BM and RL. The OHC somatic force, \(F_{som}\), is proportional to the product of the OHC transmembrane potential and the piezoelectric coupling coefficient, \(\varepsilon _3\): \(F_{som}=\varepsilon _3(\phi _{ohc}-\phi _{st}\)). This force has been shown to hold sufficient authority to deform the local OHC42 as well as neighboring OHCs44 and this coupling is represented in the model. In Eq. 2, \(Z_m\) is the OHC basolateral electrical impedance (see Fig. 3) and \(K_{ohc}\) represents the OHC stiffness. \(I_{pz}=-\varepsilon _3\frac{\text {d}u_{ohc}^{comp}}{\text {d}t}\) corresponds to the total current due to the piezoelectric-like behavior of the OHC. Hence, the mechanical degrees of freedom give rise to electrical currents (i.e., \(I_{hb}\) and \(I_{pz}\) are due to mechanical motion) and electrical degrees of freedom give rise to mechanical forcing (\(F_{som}\)). The mathematical model outlined above gives rise to a set of coupled partial differential equations, which are solved using the finite element method45. The parameters of the model follow35 with the exception of those listed in Table 2. We also study the impact of manipulating the damping coefficients in this paper. The minor modification of the parameters listed in Table 2 provided a better match to the voltage magnitude notch in the experimental data but did not significantly affect the change in the phase between voltage and BM motion (as demonstrated in Fig. 9B).

Table 2 Parameter values that are changed from35.

Results

Experimental LCM and BM displacement phase comparison

Figure 4 shows the frequency response of the tone-evoked LCM over a range of SPLs for six preparations, with the results from each preparation comprising three vertically-stacked panels. For example, Fig. 4A–C are from expt. 670, with (A) the unnormalized amplitudes, (B) the amplitudes normalized to the EC pressure and (C) the phase relative to the EC pressure. In (B) and (C) the abscissa is normalized to the peak frequency of the lowest SPL response, the CF. The low and moderate SPL responses show several cycles of traveling wave phase accumulation, indicating that the voltage responses were dominated by local OHCs5. The CFs range from 14.5 to 18.5 kHz. In the phase panels, we also present the BM displacement phase responses from29 at three different SPLs as thin dashed black lines; this phase is seen not to vary significantly with the SPL. The same BM displacement phase is shown in each phase panel, and this enables a common reference for the comparison LCM phase to the BM displacement phase similar to that of Fig. 2. The CF of the BM displacement data was 15.5 kHz, close to the CF of the LCM measurements.

Figure 4
figure 4

Frequency response of LCM in six gerbil preparations. Each preparation is represented in three vertical panels. The color key to SPL is in the top panel, the CF and animal number is noted in the bottom panel. (A,D,G,J,M,P) Response magnitude. Horizontal dashed line is the noise floor. (B,E,H,K,N,Q) Magnitude normalized to EC pressure. (C,F,I,L,O,R) Phase referenced to EC pressure. In (B,E,H,K,N,Q,C,F,I,L,O,R) the frequency axis is normalized to CF. To provide a comparison similar to that of Fig. 2, in (C,F,I,L,O,R) phase data from BM motion responses from29 are included in thin dashed black lines. The CF of the BM motion data was 15.5 kHz. 30, 40 and 50 dB SPL results are included to indicate SPL was not critical to this comparison.

The features we are most interested in are the magnitude notch (local minimum) in LCM and the concomitant shift of the phase difference between LCM and BM displacement. The magnitude notch is clear in five of the seven LCM experiments presented in this paper (Expts. 165, 627, 660, 693, 696) and mild in two others (Expts. 678 and 693). The location of the amplitude notch does not change with SPL but its depth varies and sometimes disappears at higher SPL values, as noted when describing Fig. 2. At the same frequency as the LCM notch, a shift of the LCM phase with respect to the BM displacement phase is apparent in all the experiments. We denote this shift frequency, \(f_{shift}\), using a dashed vertical line in the panels of Fig. 4. For some experimental results and specific SPL values, an unwrapping ambiguity occurred close to \(f_{shift}\) (panels F and R). At frequencies above these ambiguities, the phases at different SPL become offset by approximately a full cycle. The figures show the data with standard unwrapping, but full cycle phase corrections are applied before finding the average phase lead (in the next paragraph). That is because when averaging a set of phase values for the purpose of understanding the amount of sub-full-cycle lag or lead, one needs to average the values that are closest after full cycle offsets are removed.

Including data from Fig. 2 and that from Fig. 4, we have an N of 8 to consider in order to extract two metrics from these data. (1) The phase transition occurs at a frequency relative to CF of 0.77 ± 0.04 (mean ± standard deviation). (2) Evaluated at \(0.9 f_{CF}\), the phase lead of LCM relative to BM displacement is \(0.37 \pm 0.09\) cycles. The transition frequency, \(f_{shift}\), divides the (nearly) linear sub-CF region and the peaked and nonlinear CF region. At SPLs 70 dB and above, nonlinearity began to extend into the sub-CF region, likely due to saturation of OHC current at relatively high SPL. As an aside, recent measurements of motions within the OCC in the base of the gerbil cochlea show that nonlinearity exists in the motions of intra-OCC structures at sub-CF frequencies10,11,12. The sub-CF nonlinearity observed in the motion is similar to our observations of high SPL nonlinearity in sub-CF LCM12, and indicates that the OHC electromotile force is present for all frequencies, but amplifies BM motion only in the peak region.

To summarize, the experimental results of Figs. 2 and 4 reinforce the findings of5 in showing that at the frequency where the BM nonlinearity begins, a phase shift of OHC voltage relative to BM displacement occurs, “activating” the cochlear amplifier. In the simulations below we show how this activation of the cochlear amplifier could occur.

Simulations

To interpret the experimental results, we created an analogous set of predictions using the cochlear model described in the “Materials and methods” section. In Fig. 3B, a schematic of the cochlear cross section is shown along with the definitions of the directions of positive displacement for the BM (\(u_{bm}\)) and the TM in the shear (\(u_{tms}\)) and bending (\(u_{tmb}\)) directions. We also compute the hair bundle deflection (\(u_{hb}\)) that divided by the HB height equals the pivoting of the HB relative to the cuticular plate (see Fig. 3C); this motion gives rise to the MET currents (\(I_{hb}\) in Eq. 1), and is a linear combination of \(u_{bm}\), \(u_{tmb}\) , and \(u_{tms}\)33. The theoretical prediction of the LCM is written as \(\phi _{st}\). \(\phi _{st}\) is strongly correlated to OHC current, thus the amplitude and phase of \(\phi _{st}\) is expected to be similar to \(u_{hb}\). Deviations from similarity arise due to current spread from adjacent locations, which is included via the electrical cable model of the fluid spaces. The effect of current spread was studied previously5, and is explored further in the Supplemental Information of the present paper. In Fig. 5, predictions of the ST voltage (\(\phi _{st}\), see Fig. 3C) and mechanical responses (\(u_{bm}\), \(u_{hb}\), and \(u_{tms}\)) to acoustical stimulation are shown for a basal region of the model (4 mm from the stapes). Frequencies in this plot are normalized to the BM peak frequency at low SPLs (CF), and amplitudes are presented relative to the stapes displacement. As in the experiment, a local minimum is seen in \(\phi _{st}\) at a frequency below the CF, \(f^{m}_{shift}\) = 0.78 CF (the model-shift frequency), and the BM response does not evince a notch (consistent with the experimental results). From the phase plot, it can be seen that the phase of \(\phi _{st}\) and \(u_{hb}\) deviate from \(u_{bm}\) at a frequency slightly less than that of the notch. Hence, the phenomenon of a notch and phase-shift frequency is predicted by the model.

To determine if these relationships are level dependent, the model predictions of the magnitude and phase of \(\phi _{st}\) and \(u_{bm}\) gain (relative to the stapes displacement) are shown in Fig. 6 for a range of input SPL. Increasing SPL was simulated in our model by decreasing the MET sensitivity as embodied by the scaling factor, \(\mu\) (see Eq. 1). Fig. 6A shows that the model-predicted voltage notch is present for all SPLs, unlike the experimental result where the notch is washed out for stimulus levels above \(\sim\) 60 dB SPL. Hence, there are stimulus level variations in the cochlea that are not encompassed by simply varying \(\mu\). As in the experimental LCM results, the \(\phi _{st}\) gain decreases with increasing SPL. At low frequencies nonlinearity is more pronounced in the model than in these single-tone experimental data, but in experiments with multi-tone stimuli12, low frequency nonlinearity is strong. Hence, this difference between experiments and model results at high SPL is considered a fairly minor quantitative difference that does not significantly impact interpretation of mechanisms influencing low-level active processes. Strong nonlinearity is seen over all stimulus levels near CF in the model, as in the experiments. Fig. 6B shows amplitude and phase of the BM displacement as a function of SPL and frequency. As in the experiment (see Fig. 2B), the BM nonlinearity emerges strongly at a frequency near the notch frequency.

Figure 5
figure 5

Model predictions of the transfer function for ST voltage (\(\phi _{st}\)), BM transverse (\(u_{bm}\)), TM shear (\(u_{tms}\)) and HB deflection (\(u_{hb}\)) displacements relative to the stapes displacement; frequencies are normalized to the CF = 14.8 kHz, the MET scaling factor is set to \(\mu =0.7\), and the stapes positive motion convention is outward. The units for the ST voltage to stapes motion gain is in mV/nm in order to scale with the non-dimensional displacement gains.

Figure 6
figure 6

Theoretical predictions of the magnitude and phase of (A) the amplitude of the transfer function between \(\phi _{st}\) and the stapes motion and (B) the BM displacement gain (relative to the stapes displacement) as a function of SPL and frequency (normalized to the CF = 15 kHz). Variation of the input SPL on the gain is modeled by altering the MET sensitivity controlled by the parameter \(\mu\) defined in Eq. 1. The voltage responses show the notch at 0.76 CF kHz while the onset of the BM nonlinearity occurs at slightly higher frequencies close to 0.82 CF (where low \(\mu\) corresponds to a higher SPL as described in the “Materials and methods”). The frequency of the notch is relatively independent of SPL. This is consistent with the measurements up to 70 dB SPL in Fig. 4 and 85 dB in Fig. 2B.

In Fig. 7 we plot the measured (from5, redrawn in our Fig. 2) and predicted phase difference between \(\phi _{st}\) and \(u_{bm}\). The measured phase difference (dashed line in Fig. 7) underwent a phase shift at the transition frequency (0.76 CF) that produced a \(\sim 0.34\) cycle shift, when measured at 0.9 CF. The phase shift results (transition frequency and phase shift magnitude) from Fig. 4 were similar, as reported above. The theoretical prediction showed a phase shift with very similar onset and slope to the experimental value. The value of the phase shift at 0.9 CF was \(\sim 0.25\) cycle, slightly smaller than the experimental value of \(0.37 \pm 0.09\) cycle. Above the CF the phase difference in the both experimental and modeling responses underwent more extreme variations that are associated with the onset of the supra-CF phase plateau.

Figure 7
figure 7

SPL dependence of the phase transition between the ST voltage and BM displacement. The SPL variation in the model is simulated by changing the MET scaling factor \(\mu\). The experimental value for 60 dB SPL (from5) superimposed as a dashed line.

Discussion

TM-radial dynamics initiates the cochlear amplifier in the mathematical model

The notch in the amplitude of the voltage response signals both a change in the voltage phase relative to the BM motion and the onset of nonlinearity in the amplitude of the BM displacement. We denote the model predictions of the LCM as \(\phi _{st}\). In our mathematical model, the notch in \(\phi _{st}\) corresponds to a notch in the HB pivoting relative to the RL (quantified by \(u_{hb}\), see Fig. 3). According to Eq. 1 in the “Materials and methods” section, the MET current will follow \(u_{hb}\). The amplitude of \(\phi _{st}\) in the model is mainly due to the local MET current flowing through the resistance of the cochlear fluids, and, like \(u_{hb}\), \(\phi _{st}\) also displays a minimum. Both the predictions and experiments show an SPL-independent frequency where the phase shift occurs, suggesting that a passive mechanism is responsible for this feature. In the mathematical model, we are able to explicitly identify this mechanism. The shift frequency occurs at the TM radial resonance frequency, determined by a combination of the TM mass and the limbal attachment stiffness. This frequency is given by \(\frac{1}{2\pi }\sqrt{k_{tms}/M_{tms}}\) where \(k_{tms}\) is the stiffness of the attachment of the TM to the spiral limbus and \(M_{tms}\) is the TM shear mass (Fig. 3B). We determined causality in the model in two ways. First, we directly computed the \(\frac{1}{2\pi }\sqrt{k_{tms}/M_{tms}}\) and found it was equal to \(f^m_{shift}\). Second, we manipulated the limbal attachment stiffness in the three-dimensional finite element model and were able to predictably adjust the notch frequency according to the resonance calculation above.

We used the numerical model to compute the mechanical power that the OHC somatic electromotile force injects (amplifies) or removes (dissipates) at the BM34. The energy converting electromotile force is proportional to the cells’ transmembrane potential46 as described by Eq. 2. Power depends on the relative phase between the force and the motion as well as the magnitude of each. Figure 8 shows the model prediction of this power (\(\Pi _{ohc/bm}\)) for high-level sound input (using \(\mu = 0.1\)) and low-level sound input (using \(\mu = 0.5\)). A positive value for \(\Pi _{ohc/bm}\) indicates power injection (green) from the OHC electromotile force to the BM while negative \(\Pi _{ohc/bm}\) represents dissipation (red). The vertical dashed line at \(f_{shift}^m\) in Fig. 8 is seen to define the boundary between dissipation and amplification. Figure 8 shows that the OHC power amplification/dissipation normalized to input power is level dependent and thus nonlinear. The level-dependence is greater above the notch frequency (Fig. 5B) where the OHC power is positive. The frequency above which power becomes positive is the same for both low and high SPLs. The injected power is predicted to peak before the CF, a result consistent with the focused photoinactivation of OHC motlity47 and recent two-tone suppression analysis of the RL and BM vibrations48. In the model, the phase difference between the somatic force (proportional to the OHC transmembrane potential) and BM displacement reaches almost 0.25 cycle at this frequency. Hence, the active force is nearly in phase with the BM velocity, a condition required for the most effective power injection on the BM motion. In addition, the amplitude of the MET current (driven by \(u_{hb}\)) is increasing from its local minimum at \(f_{shift}\) and the BM displacement amplitude is also increasing, achieving its maximum at CF. Therefore, the OHCs at a more basal region amplify the CF response of a more apical location, boosting the wave as it passes. Finally, our model predicts above-CF dissipation (the red region) for low-level sound, indicating the possibility that the active process helps to stabilize the system through dissipation at higher frequencies.

Figure 8
figure 8

Model prediction of the power exchange between a single OHC and the BM for different activity levels as controlled by the MET scaling factor, \(\mu\), which is set to 0.1 (for high SPL simulation) and 0.5 (for low SPL simulation). The OHC active power deposited on the BM is calculated as \(\Pi _{ohc/bm}=1/2Re \{ F_{som}v_{bm}^* \}\), where * represents the complex conjugate and \(v_{bm}\) is the velocity of the BM. The OHC power is normalized to the stapes input power. Negative power indicates power dissipation while positive values denote power generative region. The OHC active force is introduced in Eq. 2. The vertical dashed red line is drawn at \(f_{shift}^m\) which corresponds to the frequency boundary where the power deposition changes from dissipative (red) to generative (green). Frequencies are normalized to the CF = 15 kHz and powers are normalized to the stapes input power (\(\Pi _{stapes}\)).

The presence of the notch is damping dependent, but the phase shift is not

The level of damping due to the shear motion of the RL and TM plays a subtle but key role in shaping the magnitude and phase of the frequency response. In Fig. 9A, we investigate the effect of varying the damping in the subtectorial space on \(\phi _{st}\). This damping is controlled by \(C_{tms}\), the shear damping of the TM (see Table 2). Increasing the shear-damping factor leads to the elimination of the notch in the magnitude spectrum and reduction of the slope of the overall phase change, but not the amount of the cumulative phase change (Fig. 9B). In Fig. 9B, we plot the difference between the phase of \(\phi _{st}\) and \(u_{bm}\) as a function of frequency for different damping values. While the details of the phase change differ, predictions from all three levels show an overall \(\sim 0.25\) cycle phase increase in the transition from the notch frequency to CF. Because of this sensitivity to damping, the model predicts animal-to-animal as well as species-to-species variability of the depth (or even existence) of the notch in the magnitude spectrum but not of the overall phase change, which is predicted to be present even in the face of these damping-dependent differences.

Figure 9
figure 9

The effect of altering the shear damping between the TM and RL is studied. (A) The frequency dependence of the amplitude and phase of the ST voltage is plotted for varying shear damping in the cochlear model. Increasing the TM shear damping factor (\(C_{tms}\)), decreases the depth of the notch (shown by the dashed line) of the ST voltage. (B) The relative phase between the ST voltage and the BM displacement. Increasing shear damping decreases the transitional slope but not overall phase change (CF = 15.3 kHz).

The limbal attachment of the TM is not necessary for near-CF Amplification

In Lukashkin et al.49 experiments were performed involving the \(Otoa^{EGFP/EGFP}\) mouse, a mutant with a TM detached from the limbus but attached to the OHC HB. One of the main and perhaps most surprising results of this experimental study was that the near-CF BM amplification in the mutant was nearly the same in the wild type. Using a model upon which the simulations of the present paper are also based, Meaud and Grosh35 replicated this experimental result. In the model, the near-CF amplification is nearly the same in both mutant and wild type mice because the TM load on the OHC HB is mostly inertial at CF in both animals. In the mutant, the loading on the OHC HB is only viscous and inertial (dominated by inertial forces at CF). In the wild type model (with the TM attached, as in the gerbils studied in this paper) the TM loads the OHC HB with stiffness, viscous, and inertial forces; at low frequencies the load is stiffness dominated while near CF (above the resonance of the TM and its limbal attachment) the load is inertial. We emphasize that this phase change is measurable in both the mechanical behavior and in the LCM; a point that is important for understanding the influence of OHC motility on amplification5.

The TM-mechanics-induced phase-shifting mechanism desribed here is in line with analysis of experimental results from Gummer et al.50, which noted that a notch frequency coincides with the transition of the shear load from the TM onto the OHC HBs from spring-like to inertia-like, with an attendant phase shift that they argued was conducive for amplification. Lukashkin et al.51 also conjectured that a sub-CF TM resonance is critical for the timing of the active process.

A simple mechanical example of a system exhibiting a notch and phase shift

We consider the forced response of a two degree-of-freedom mechanical system pictured in in Fig. 10. The two masses (\(m_1\) and \(m_2\)) are connected through elastic and dissipative mechanical elements [springs (\(k_1\), \(k_2\)) and dampers (\(c_1\), \(c_2\))]. The difference between \(x_1\) and \(x_2\) is analogous to the HB pivoting in the OCC; the analogy is, of course, incomplete because the OCC system has three mechanical and three electrical degrees of freedom. The analytical solution for the steady-state amplitude of the difference of the mass displacements is given in Eq. 3 (neglecting damping terms for the sake of simplicity)

$$\begin{aligned} X_1-X_2 =\frac{F_0(k_1-m_1\omega ^2)}{m_1m_2\omega ^4-k_2m_1\omega ^2-m_2k_1\omega ^2+k_1k_2-m_2k_2\omega ^2}. \end{aligned}$$
(3)

Figure 10B shows the amplitude and phase of the frequency response of \(X_1\), \(X_2\), and \(X_1-X_2\) for values of the mechanical elements denoted in the caption. The difference goes to zero for the undamped system (evident from the numerator of Eq. 3) and attains a minimum for the damped case (see the red curve in Fig. 10B) at the resonance of the mass \(m_1\) when uncoupled from \(m_2\) (\(f_1=\frac{1}{2\pi }\sqrt{k_1/m_1}\)). The physical reason for this is that when the unanchored mass \(m_2\) (BM) is forced at frequency \(f_1\), it does not “feel” a significant load from \(m_1\) (since the impedance looking into \(m_1\) arises only from the damper). Thus the coupling spring between the masses is nearly undeformed, and the two masses vibrate with similar amplitude. As forcing frequency increases beyond the resonance of the uncoupled TM, the phase of the displacement difference undergoes a \(180^{\circ }\) shift because below this frequency the \(|X_2|\) amplitude is greater than \(|X_1|\), and above it the \(|X_1|>|X_1|\). The resonance of the foundation-supported mass \(m_1\) when uncoupled from \(m_2\) fixes the frequency where this crossover occurs, and damping controls the smoothness of the phase transitions and depth of the notch. Though the OCC is more complicated than the two degree-of-freedom system, the same physical principle applies, in that passive mechanical features (the attachment stiffness and radial TM mass) fixes the notch frequency and the center of the phase change.

Figure 10
figure 10

A simple two degree-of-freedom mechanical model exhibiting a notch and phase shift is pictured here. (A) System schematic composed of two connected masses (m), springs (k), and dampers (c). The attachment stiffness of this system (\(k_1\)) is roughly analogous to the limbal TM attachment stiffness in the full six-degree-of-freedom OCC model from Fig. 3. (B) Frequency responses of the system shown in (B) for parameters: \(m_1=1\), \(m_2=2\), \(k_1=4\), \(k_2=3\), \(c_1=0.1\), \(c_2=0.1\) (the parameter values are chosen for demonstration). The difference, \(x_1-x_2\) loosely represents the rotation of the HB in the full OCC system. At the spring/mass resonance frequency of mass \(m_1\) when uncoupled from \(m_2\) (when \(\omega = \sqrt{k_1/m_1} = 2\)) a phase shift and a local minimum (notch) in \(x_1-x_2\) occurs.

Relation to previous studies

There is one piece of direct in vivo evidence that the TM radial resonance, when the TM is uncoupled from the organ of Corti, is roughly one-half octave below the CF of the location along the cochlear spiral. In Lee et al.6 measurements were made in a living \(^{C1509G/C1509G}\) mutant mouse (whose TM is detached from the OHC HBs but attached to the limbus). As discussed in the text of that paper regarding Fig. 9I,J, there was a “dramatic” shift from phasic to antiphasic motion of the radial motion of the TM relative to the transverse motion of the BM. This is most clearly seen in the movies in the lower two panels of Movie 1 from Lee et al.6, where the radial motion of the TM can be seen to switch from being in phase with the BM motion at 5 kHz to being out of phase at 8 kHz, indicating the passage through a resonance. Since the CF of their measurements was 10 kHz, the hypothesis put forth in the present paper would predict the radial resonance to occur one-half octave below this frequency, at \(\sim 7\) kHz, which is consistent with the paper’s experimental result. While the experimental results of that study6 did not show a notch in the HB rotation amplitude spectrum of the wild type animal, a clear 0.3–0.5 cycle phase change was seen as the stimulus frequency increased from one-half octave below CF to CF (their Fig. 5F). This phase transition was observed in both active and passive preparations, which is consistent with our hypothesis that a passive mechanism underlies this process.

There is substantial indirect evidence of a notch in sensory and neural responses of the cochlea. Lukashkin et al.51 discussed several studies that showed a below-CF notch in auditory afferent nerve fiber (ANF) data. ANF measurements in cat52,53, mouse54 and chinchilla55 all showed a local increase in the threshold (corresponding to a notch in sensitivity) at a frequency approximately one half octave below the CF of the fiber. Measurements of the hair cell voltage responses also have shown a notch below the CF. For instance, the OHC/IHC voltage measurements by Kossl and Russell56 (Fig. 9) show a null near 10 kHz for a cell with CF of 17 kHz. In another paper from the same group57 (their Fig. 4), a peak in the threshold pressure needed to attain a given OHC voltage is seen at a frequency of 0.6 CF.

In studies in guinea pig in which voltage was measured at the BM and within the OC following measurements of BM motion in the same animal15, the voltage slightly lagged BM displacement (taking positive displacement towards BM from ST, as in5). The voltage responses did not undergo a phase transition at the frequency where BM responses became nonlinear, although at supra-BF frequencies large phase differences were observed, similar to Fig. 7. The difference in the sub-CF results presented in the current work compared to15 might be related to the fact that the two studies used different animals (gerbil versus guinea pig), electrodes (wide-frequency-band metal versus low-pass-frequency glass), and measurement protocols (simultaneous versus sequential). It will be informative to perform further experiments in guinea pigs and other animals, to specifically explore if this important phase transition that exists in gerbils is present in other mammals.

Theories emphasizing the importance of the TM in cochlear mechanics have a long history. For instance, Zwislocki and Kletsky19 theorized that coupling the TM to the BM via a nonlinear spring could produce the sharpness and nonlinearity that could not be achieved with BM mechanics alone. Our modeling results are consistent with ideas put forth by Zwislocki58. In that study, the relative contribution of the TM to the RL motion was manipulated to create notches in the response (Fig. 17 of that paper shows a notch along with a phase jump at a frequency below the CF). However, the OHC active process was not included, and the significance of the notch and the phase jump was not discussed with respect to cochlear amplification. Also in 1980, Allen20 noted that the zero of a transfer function (which is fixed by the radial resonance of the TM and its limbal attachment) would explain the \(180^{\circ }\) phase shift observed in some neural data. The work of Allen20 and Zwislocki19 predated the discovery of OHC electromotility59. Hence, their work did not address the question of OHC electromotile force generation and power transfer, a key difference between these models and our model.

An early transmission line model of active cochlear mechanics21 used two coupled masses to model the OCC and predicted responses that contained a significant sub-CF notch and concurrent phase ripple/shift. The notch and shift occurred in both BM displacement and in hair bundle displacement, and were not present in passive mechanical responses, so are qualitatively dissimilar to our experimental and modeling results, but share basic behavior with our results. The model of Neely and Kim21 was one of the first models to include OHC electromotility and discuss power gain. In addition to our previous work33,34,35, the importance of appropriate loading of the OHC HBs has been emphasized in the work of Mammano and Nobili60,61 as well as in that of Liu et al.22 Finally, another conceptualization of cochlear mechanics to achieve spatial filtering of spectral signals uses multiple coupled waveguides consisting of structural and fluidic elements. Models based on waveguides coupled through either passive or active elements with carefully tuned properties have been shown to achieve the selective spatial filtering of the frequency content signal mimicking that seen in the cochlea62,63,64,65,66.

Summary

In this paper, we provide experimental and theoretical evidence supporting the role of the TM as the controlling factor for activation of the cochlear amplifier. Input sound pressure creates a travelling wave along the cochlea which generates vibrations on the OCC components (e.g. BM, TM, HB). Deflection of the HBs induces transduction current through the MET channels giving rise to a transmembrane potential across the basolateral membrane of the OHC. The transmembrane potential causes an active somatic force which is then applied to the BM. The effectiveness of this applied force in amplifying the mechanical motions (while taking energy from the electrical domain) relies on the phase relation between the BM and transmembrane voltage of the OHC. We have shown that the passive mechanics of the TM can set the conditions necessary for amplification.