Can Envelope Following Responses Be Used to Estimate Level Compression in the Auditory System?

The ability to compress the large level range of incoming sounds into a smaller range of vibration amplitudes on the basilar membrane (BM) is an important property of the healthy auditory system. Sensorineural hearing impairment typically leads to a decrease in sensitivity to sound and a reduction of the amount of compression observed in BM input-output functions. While sensitivity loss can be measured efficiently via audiometry, no measure has yet been provided that represents fast and reliable compression estimates in the individual listener. In the present study, magnitude-level functions obtained from envelope following responses (EFR) to four simultaneously presented amplitude modulated tones were measured in normal-hearing (NH) and sensorineural hearing-impaired (HI) listeners. The slope of part of the EFR magnitude-level function was used to estimate level compression. The median values of the compression estimates in the group of NH listeners were found to be consistent with previously reported group-averaged compression estimates based on psychoacoustical measures and group-averaged distortion-product otoacoustic emission magnitude-level functions in human listeners, and similar to BM compression values measured invasively in non-human mammals. The EFR magnitude-level functions for the HI listeners were less compressive than those for the NH listeners, consistent with a reduction of BM compression in HI listeners. A computer model of the auditory nerve (AN) was used to simulate EFR magnitude-level functions at the level of the AN. The recorded EFRs at the chosen amplitude modulation rates (81-98 Hz) were considered to represent neural activity originating from the auditory brainstem and midbrain rather than a direct measure of AN activity. Nonetheless, the AN model simulations could account for the main trends observed in the recorded data. 
The model simulations suggested that the growth of the EFR magnitude-level function is highly influenced by contributions from off-frequency neural populations, which compromises the possibility to estimate local (i.e., frequency specific) compression with EFRs. Furthermore, the model showed that while the slope of the EFR magnitude-level function is sensitive to the loss of compression observed in HI listeners due to outer hair cell dysfunction, it is also sensitive to inner hair cell dysfunction. Overall, it is concluded that EFR magnitude-level functions do not represent frequency specific level compression in the auditory system.


Introduction
An important characteristic of the healthy mammalian auditory system is the compressive transformation of the large dynamic range of input sound pressure levels to a narrower range of levels that can be processed by the sensory cells. Part of this compressive transformation is a consequence of the processing by the outer hair cells (OHC) in the cochlea. Although there is still some controversy about the precise mechanism underlying OHC function [e.g., 1,2,3,4,5], it is broadly accepted that OHC electro-motility provides a level-dependent gain to the movement of the basilar membrane (BM) in the healthy cochlea. This leads to a high sensitivity to low-level sounds and a compressive input-output (I/O) function [6] at the characteristic place of the BM for tonal stimuli. In addition to increasing the sensitivity, OHC function has also been associated with high frequency selectivity and a normal loudness growth with sound pressure level [7].
Invasive physiological recordings in living non-human mammals allow precise measures of place-specific BM velocity-level functions using pure-tone stimuli [e.g., 8,9,10,11,12]. For a pure tone, the envelope of the resulting travelling wave shows a maximum at one specific cochlear place. Magnitude-level functions measured at or near this place ("on-frequency") show a compressive growth of BM velocity with increasing sound pressure level (SPL), consistent with the idea of a level-dependent amplification. Basal and apical to this place ("off-frequency"), the magnitude-level functions show a linear growth [Figs. 6 and 7 in 10]. The combination of nonlinear on-frequency and linear off-frequency magnitude-level functions leads to a level-dependent BM excitation pattern with sharp tuning at low levels and broader tuning at higher levels. In the case of OHC dysfunction, on-frequency magnitude-level functions show reduced compression. This leads to a lower-amplitude on-frequency

EFR magnitude-level functions
Normal-hearing listeners
Figure 1 shows the recorded EFR magnitude-level functions for one representative NH listener (NH01) for 500 Hz, 1000 Hz, 2000 Hz and 4000 Hz (panels a-d, respectively). The complete set of EFR data for all NH listeners is shown in Supplementary Fig. 7. Circles indicate the EFR magnitudes recorded in the first recording session and red squares indicate EFR magnitudes recorded in the second recording session. EEG background noise estimates are shown as the grey shaded area. The best fitted curves are represented by the solid dark-grey functions. A linear reference with a slope of 1 dB/dB is indicated by the dotted line.
All EFR magnitude-level functions were found to grow monotonically and compressively (with slopes < 1 dB/dB) for stimulus levels between 20 and 50-65 dB SPL. The EFR magnitude-level functions obtained for the carrier frequencies 500, 1000 and 2000 Hz (panels a-c in Supplementary Fig. 7) showed a different trend than that obtained for the 4000 Hz carrier (panel d). At 500, 1000 and 2000 Hz, the EFR magnitudes saturated, or slightly decreased, for stimulus levels above 50-65 dB SPL, leading to a break-point in the magnitude-level function. Figure 2b (blue symbols) shows box-plots indicating the fitted break-point levels at the four carrier frequencies. The median values for the break-point levels varied between 50 and 65 dB SPL. In contrast, no break-point was observed at 4000 Hz. At this frequency, the magnitude-level function was found to grow monotonically with a single slope (see also Supplementary Table 2). Figure 2a (blue symbols) shows the median values of the EFR slopes, which amounted to 0.24 dB/dB at 500 Hz, 0.31 dB/dB at 1000 Hz, 0.25 dB/dB at 2000 Hz and 0.21 dB/dB at 4000 Hz. A two-sample permutation test for equality of the means [43,44] revealed that the estimated EFR slopes in the NH listeners were not statistically different across frequency, except for the conditions at 1000 Hz vs 4000 Hz (Test statistic = 0.1418, p = 0.0277).
Figure 2. Fitted parameters to the EFR magnitude-level functions for the NH and HI listeners. Panel a) shows box-plots with the fitted slopes obtained for the different carrier frequencies in the NH (blue) and HI (red) listeners. Panel b) shows box-plots with the fitted break-point levels for the different carrier frequencies using a two-slopes piecewise fit. The bottom and the top of each box represent the first and third quartiles, respectively, and the band inside each box represents the second quartile (the median). Whiskers indicate 1.5 times the interquartile range (IQR) of the lower and upper quartiles. The circles depict the raw observations. Statistical significance is represented by the asterisks, where ** corresponds to p ≤ 0.01.
Figure 3 shows the repeatability of the EFR amplitudes, following the method proposed by [45], for all carrier frequencies (500, 1000, 2000 and 4000 Hz; panels a-d) and stimulus levels of 35, 55 and 70 dB SPL (left, middle and right subpanels). The repeatability coefficient of the EFR amplitudes, defined as twice the standard deviation of the differences between the test-retest EFR measurements and indicated by the grey horizontal dashed lines, increased with increasing stimulus level at 500 Hz and 4000 Hz, decreased with increasing stimulus level at 1000 Hz, and hardly varied with stimulus level at 2000 Hz. Thus, no consistent pattern across frequencies was found (see also Supplementary Table 3). The repeatability results obtained in this study were similar to those presented in previous studies [46,47], even though the EEG recording systems, stimuli and listeners differed across studies.
In each panel of Figure 3, the ordinate indicates the difference of the EFR amplitudes obtained in the test and the retest, and the abscissa indicates the mean of the EFR amplitude. The upper and lower horizontal grey dashed lines indicate the positive and negative repeatability coefficient, respectively, defined as twice the standard deviation of the test-retest difference. N indicates the number of data points considered in each condition for all statistically significant EFR responses.

Hearing-impaired listeners
Figure 4 shows the EFR magnitude-level functions for one representative HI listener (HI01). The complete set of EFR data for all HI listeners is shown in Supplementary Fig. 8. The EFR magnitude-level functions for the 500, 1000 and 2000 Hz carrier frequencies (panels a-c) showed trends similar to those observed for the NH listeners (shown in Fig. 1). At 4000 Hz (Fig. 4d), the EFR magnitudes for stimulus levels up to 60 dB SPL were not statistically different from the EEG noise floor, whereas significant EFR magnitudes were obtained above 60 dB SPL, showing a compressive growth with level (slope < 1 dB/dB). This frequency is within the region of reduced sensitivity in this listener's audiogram (red arrow in panel d). Overall, the EFR magnitudes recorded in some of the HI listeners showed a lower signal-to-noise ratio (SNR) than in the NH listeners, resulting in a larger number of statistically non-significant data points (see Supplementary Table 4). The slopes of the EFR magnitude-level functions at 500, 1000 and 2000 Hz (i.e., the frequencies where all listeners were considered to have "normal" audiometric thresholds) were not statistically different between the NH and the HI listeners (see Fig. 2a). In contrast, the EFR slopes at 4000 Hz were significantly steeper (higher values) for the HI listeners than for the NH listeners (Test statistic = 0.2933, p = 0.0012). The median values of the EFR slopes for the HI listeners were 0.40 dB/dB, 0.33 dB/dB, 0.24 dB/dB and 0.57 dB/dB for the carrier frequencies 500, 1000, 2000 and 4000 Hz, respectively.
Figure 5 shows simulated neural activity derived from the AN model of [40,41] in response to the same four SAM tones as considered in the experimental recordings. The leftmost column (panels a, c, e and g) shows the envelope-based neural activity,
throughout the manuscript referred to as EFR AN, for normal hearing (for details regarding the calculation of EFR AN from the AN model output, see the Methods section). Panels a, c, e and g show the results for the SAM tones at 0.5, 1, 2, and 4 kHz, respectively. The second (middle) column shows the corresponding response for impaired hearing, assuming a combination of 2/3 of OHC dysfunction and 1/3 of IHC dysfunction [48,49] to account for the mean audiogram values in the HI listener group. The rightmost column in Fig. 5 (panels i-l) shows the simulated EFR AN magnitude-level functions obtained by summing up all neural activity across CF, representing the simulated "gross" AN activity.

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

The AN responses for the carrier frequencies 500, 1000 and 2000 Hz were similar for the NH and the HI simulations (panels a-f), resulting in almost identical EFR AN magnitude-level functions (panels i-k). At 4000 Hz, EFR AN magnitudes for the HI simulations (panel h and red function in panel l) were not statistically significant above the noise floor at low stimulus intensities, thus reflecting a threshold elevation. Here, at threshold (i.e., at about 30-40 dB SPL), AN neurons tuned to a broader range of CFs showed phase-locked responses to the envelope frequency than in the case of the NH simulations (panel g and blue function in panel l, with a threshold at 0 dB SPL). This is consistent with the broadening of frequency tuning observed in AN neurons in cochlear regions with OHC dysfunction [50]. Thus, hair-cell dysfunction leads to abnormal EFR AN magnitude-level functions, with non-significant responses at low input levels and a steeper (less compressive) growth function at higher input levels.
The broadening of the range of contributing AN neurons with increasing stimulus level was found for all carrier frequencies. For each carrier frequency, the AN activity is limited to a narrow "on-frequency" region at low stimulus levels (indicated by the horizontal orange dashed lines in panels a-h and defined as CFs ranging from 1/2-octave below to 1/3-octave above the carrier frequency of the SAM tone [10]). With increasing stimulus level, the range of AN activity broadens towards off-frequency regions due to the recruitment of neurons tuned to higher CFs. This broadening is continuous for the 4000 Hz carrier whereas there is a saturation in the EFR AN magnitude-level functions (panels i-l) obtained for the lower-frequency carriers due to an interference between the neural activity at the higher and the lower frequency carriers. The model also shows that very basal AN neurons at high stimulus levels encode multiple concurrent modulations, reflected by the simulated AN activity in the lower right corner of panels a-g.
Figure 5 (caption fragment): ... [48,49] was assumed to adjust the AN model parameters to account for the mean audiogram values in each listener's group.
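The "on-frequency" region defined above (1/2 octave below to 1/3 octave above the carrier) translates into band edges in Hz with a short calculation. The following is a minimal sketch; the function name `on_frequency_band` and its parameter names are hypothetical and not taken from the study.

```python
def on_frequency_band(carrier_hz, octaves_below=0.5, octaves_above=1.0 / 3.0):
    """Return (lower, upper) CF limits in Hz of the on-frequency region.

    The limits follow the definition quoted in the text: 1/2 octave below and
    1/3 octave above the carrier frequency of the SAM tone.
    """
    lower = carrier_hz * 2.0 ** (-octaves_below)
    upper = carrier_hz * 2.0 ** octaves_above
    return lower, upper


# Nominal carrier frequencies used throughout the text (Hz).
for fc in (500, 1000, 2000, 4000):
    lo, hi = on_frequency_band(fc)
    print(f"{fc:5d} Hz carrier: on-frequency CFs from {lo:7.1f} to {hi:7.1f} Hz")
```

The band is asymmetric on a logarithmic frequency axis, extending further towards lower than towards higher CFs.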


Compression estimates based on EFR magnitude-level functions
The slopes of the EFR magnitude-level function for the NH listeners varied between 0.2 and 0.35 dB/dB, and were not statistically different from the slopes at the non-impaired frequencies (500, 1000 and 2000 Hz) for the HI listeners (see Fig. 2a). On a group average, these values are similar to BM compression estimates obtained using direct invasive methods in healthy non-human animal models, and to non-invasive physiological and behavioural compression estimates in NH humans. Compression values estimated from the slope of BM velocity-intensity I/O functions in chinchillas at medium-to-high stimulation levels (40-90 dB SPL) varied between 0.2 and 0.5 dB/dB [10]. Group-averaged psychoacoustical compression estimates in NH humans were found to be between 0.15 and 0.35 dB/dB [e.g., 13,14,51,52,53,54,55]. Compression estimates using group-averaged DPOAE magnitude-level functions (without source separation; [26,27]) were shown to be about 0.2 dB/dB in NH listeners at moderate stimulus levels (50 to 70 dB SPL) [e.g., 25,24,56]. Regarding the EFR recordings obtained with the HI listeners, some of the characteristics in the data also seem consistent with the results obtained with the other methods. The slopes at 4000 Hz (where the HI listeners showed a mild hearing loss) were significantly higher than the corresponding slopes for the NH listeners. The steeper growth function in the HI listeners is consistent with the reduced compression observed with other methods [e.g., 57,13,14,56,58]. In addition, the increase of the lowest stimulus level at which an EFR could be measured in the HI listeners is consistent with the corresponding increased pure-tone threshold at that frequency [see also 35, 59, 60, 61, 62, 63, 64].
Thus, based on the group-averaged numerical values, the similarity of the compression estimates obtained across the different methods, including the EFR, may suggest that similar aspects of peripheral auditory processing are reflected in the different measures.
However, while the slope of the compressive part of the respective level-growth functions is similar, the overall shape of the magnitude-level functions differs. For example, while a change in the slope of the magnitude-level functions (often referred to as a "break-point" or "knee point") can be identified both in the EFR results and in the behavioural measures, there are substantial differences. In the behavioural studies, a break-point has typically been estimated [e.g., 14,51,65,7,54] at stimulus levels at or below about 45 dB SPL, below which the slope of the estimated BM I/O function has been close to one, reflecting a linear growth. The slope beyond the break-point, at medium-to-high levels, has commonly shown a compressive growth in NH listeners. In contrast, for the EFR magnitude-level functions obtained in the present study, no linear growth was found in any of the listeners at any frequency at the lowest levels considered. While this characteristic of the EFR magnitude-level functions seems inconsistent with the behaviourally estimated BM I/O functions, it is not inconsistent with data from non-human animal recordings, which show that a linearised BM growth with input level occurs only at stimulus levels below 20 dB SPL [see Fig. 3 in 10], i.e., at input levels not tested in the present EFR study. Furthermore, the EFR magnitude-level functions showed a break-point at about 50-65 dB SPL (see Fig. 2b), i.e., at higher levels than in the behavioural studies. This break-point reflected the transition between the compressive growth at low-to-medium stimulus levels and the level region beyond this point where the EFR magnitudes saturated (at 500, 1000 and 2000 Hz), consistent with a previous study [37]. Thus, the characteristics of the EFR magnitude-level function, including its compressive behaviour, do not seem to reflect the same processes that underlie the behaviourally estimated BM I/O function.
The same may hold in relation to the level-growth functions obtained with non-invasive physiological methods such as DPOAEs as well as the invasive (non-human) measures, as further outlined below.
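The break-point levels discussed above come from a two-slope piecewise fit, whose exact procedure is not described in this part of the text. The following is a minimal sketch of one common way to obtain such a fit, assuming a continuous piecewise-linear function and a grid search over candidate break-points. All function names and the example data are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_two_slopes(levels, mags, candidates):
    """Grid-search a break-point; return (break_point, slope_below, slope_above).

    Model: mag = mag_at_bp + s_low * min(L - bp, 0) + s_high * max(L - bp, 0).
    Each candidate needs at least two levels on either side of it.
    """
    best = None
    for bp in candidates:
        rows = [(1.0, min(x - bp, 0.0), max(x - bp, 0.0)) for x in levels]
        # Normal equations of the least-squares problem for this break-point.
        ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        aty = [sum(r[i] * y for r, y in zip(rows, mags)) for i in range(3)]
        y_bp, s_low, s_high = solve3(ata, aty)
        sse = sum((y_bp + s_low * r[1] + s_high * r[2] - y) ** 2
                  for r, y in zip(rows, mags))
        if best is None or sse < best[0]:
            best = (sse, bp, s_low, s_high)
    return best[1], best[2], best[3]


# Hypothetical, noise-free example: 0.25 dB/dB growth that saturates at 55 dB SPL.
levels = list(range(20, 85, 5))
mags = [-30.0 + 0.25 * (min(x, 55) - 20) for x in levels]
bp, s_low, s_high = fit_two_slopes(levels, mags, candidates=range(30, 75, 5))
print(bp, round(s_low, 3), round(s_high, 3))
```

With real, noisy recordings a finer candidate grid and a check against the noise floor would be needed; the sketch only illustrates how a break-point and two slopes can be estimated jointly.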

On- versus off-frequency contributions to compression estimates
The compressive growth of BM I/O functions measured locally in animal models reflects "on-frequency" responses at a narrow BM range. At "off-frequency" places, BM I/O functions have been demonstrated to grow linearly [see Figs. 6 and 7 in 10]. Thus, in order to estimate on-frequency (i.e., place-specific) compression using EFRs (or any other method), the response needs to be dominated by on-frequency processing. At low intensities, a narrowband stimulus excites a narrow region of the BM and the AN. Thus, the EFR responses are likely to be dominated by the activity of a small population of neurons tuned to the centre frequency of the stimulus. However, at medium and high stimulus levels, the excitation pattern on the BM broadens and a larger population of AN neurons tuned to frequencies remote from the centre frequency contributes to the gross activity. Indeed, based on simulations of neural activity at the level of the AN using the model of [40,41], responses to a single SAM tone presented at medium-to-high stimulus levels are dominated by contributions from off-frequency high-spontaneous rate (SR) fibres [66], despite the fact that the maximum of the BM excitation is located on-frequency.
As shown in Fig. 5 (see panels a-f), the presence of a SAM tone of higher carrier frequency prevented a SAM tone at a lower frequency from recruiting AN neurons tuned to higher CFs. This was not the case for the 4000 Hz component (panels g and h), where off-frequency high-CF neurons could be recruited without interference from another SAM tone. This is supported by findings from invasive recordings in non-human animals showing that AN fibres can follow the periodicity of a high-level tone with frequency energy below the CF of the fibre [e.g., 67,68]. Also, EFRs recorded in rats have shown the effect of a second, high-frequency SAM carrier on the encoding of a low-frequency SAM carrier [69]. Consistent with our model simulations, this study concluded that the presence of the second SAM tone basal to the place of the main low-frequency SAM tone caused a reduction of the EFR to the lower-frequency component due to reduced recruitment of AN fibres located basally relative to the on-frequency location.
Such interaction of high-frequency SAM carriers with the AN activity induced by lower-frequency carriers is the reason underlying the saturation of the simulated EFR AN magnitude-level functions above 50-60 dB SPL, as observed particularly at 500 and 1000 Hz (panels i and j in Fig. 5). Simulated EFR AN magnitude-level functions using a single SAM tone at 2000 Hz resulted in non-saturating (monotonically increasing) growth functions, because off-frequency high-CF neurons could be recruited at high stimulus levels, strongly contributing to the compound response (see Fig. 4a in [66]). Thus, the model offers an explanation for the saturation of the EFR magnitude-level functions observed at 500, 1000 and 2000 Hz in the recorded data (see panels a-c in Fig. 1 and Fig. 4). At 4000 Hz, however, the model does not show a fully monotonic growth of EFR AN magnitudes up to at least 80 dB SPL, which is not consistent with the data (panel d in Fig. 1 and panel b in Fig. 2). An overly sharp AN tuning implemented in the humanized version of the AN model at those high-frequency CFs might be a possible explanation for the lack of off-frequency contributions at 4000 Hz above 70-80 dB SPL.
The orange dashed lines in Fig. 5 indicate the limits of the "on-frequency" range corresponding to direct BM recordings in non-human mammals [10]. Panels a-h show that the EFR AN magnitude first increases with input level within the on-frequency range, shows a maximum at about 25-30 dB above threshold, and then decreases with increasing stimulus level. At medium-to-high stimulus levels, the modulation (and therefore the EFR AN) was found to be more dominant at off-frequency CFs than at on-frequency CFs (see also panels a and b in Fig. 6). At these high levels, many on-frequency high-SR fibres will saturate and do not strongly encode the modulations. In contrast, AN neurons tuned to off-frequency CFs (including high-SR fibres) are still excited below saturation level, due to their lower off-frequency sensitivity (as reflected in their tuning curves). Thus, these neurons more robustly encode the amplitude modulations at these levels. This is consistent with physiological recordings from the cat AN showing that off-frequency AN fibres exhibit higher synchrony to high-intensity SAM tones than on-frequency AN fibres [70]. Nonetheless, the EFR AN magnitude-level function obtained after summing across CF continued to grow monotonically due to the further increase of off-frequency contributions. Panels a and b in Fig. 6 show the overall contribution of on- and off-frequency CFs to the compound response to the 4 kHz SAM tone. The NH simulations are shown in blue and the HI simulations are indicated in red. The compressive slope estimated by fitting a first-order polynomial to the EFR AN magnitude-level function from 20 to 60 dB SPL results from the mixture of on- and off-frequency contributions. Therefore, while BM compression purely reflects on-frequency processing, the estimates of compression obtained from EFR magnitude-level functions do not exclusively reflect on-frequency cochlear compression.
This is consistent with the limitations of estimating place-specific cochlear dispersion using EFRs [71].
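The slope estimate referred to above, a first-order polynomial fitted to the magnitude-level function between 20 and 60 dB SPL, can be illustrated with an ordinary least-squares fit. This is a minimal sketch, not the authors' analysis code; the function name `fit_slope` and the example values are hypothetical.

```python
def fit_slope(levels_db, mags_db, lo=20.0, hi=60.0):
    """Least-squares line through (level, magnitude) pairs with lo <= level <= hi.

    Returns (slope in dB/dB, intercept in dB). A slope of 1 dB/dB corresponds
    to linear growth; smaller values indicate compressive growth.
    """
    pts = [(x, y) for x, y in zip(levels_db, mags_db) if lo <= x <= hi]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical magnitude-level function: 0.3 dB/dB up to 60 dB SPL, flat above.
levels = list(range(20, 85, 5))
mags = [-40.0 + 0.3 * min(x, 60) for x in levels]
slope, intercept = fit_slope(levels, mags)
print(f"slope = {slope:.2f} dB/dB")
```

Because the fit pools whatever contributes to the summed response within the chosen level range, the resulting number describes the compound growth rather than the growth at a single cochlear place.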
IHC dysfunction (but intact OHC function) is considered to lead to a BM I/O function with compression estimates comparable to those in NH listeners [e.g., 51]. Figure 6c shows simulations of the 4-kHz EFR AN magnitude-level function when accounting for the mild threshold elevation with only OHC dysfunction (red crosses, dashed line) or with only IHC dysfunction (red squares, dotted line). Even though both types of hair-cell impairment led to a similar threshold elevation of 30-40 dB SPL, the growth function (up to the maximum) was steeper in the case of only OHC dysfunction than in the case of only IHC dysfunction. More interestingly, both growth functions show a steeper slope (less compressive) than the NH reference (blue circles). Thus, also in the case of only IHC loss, where there is no reduction of cochlear gain and "normal" BM compression (see the sharply tuned response in Supplementary Fig. 10h), the EFR AN magnitude-level function exhibited a steeper growth function. This less compressive growth of the EFR AN also occurred in the on-frequency band, since the responses obtained in the level range from 40 to 65 dB SPL were mainly dominated by on-frequency CFs (see panels b and c in Fig. 6).
The assumption of reflecting frequency-specific local compression might also be challenged in the case of other measurement paradigms, such as those based on DPOAEs or psychoacoustical masking paradigms. Regarding DPOAEs, the distortion source of the emission is usually simplified as a single source located at the peak of the travelling wave envelope, although many distortion sources might be induced in the region where the travelling waves of the two primaries overlap [72], which strongly depends on the level and frequency ratio of the primaries. The extent of potential off-frequency contributions to the non-linear component of the DPOAE at high stimulus levels is not yet fully understood [73]. Regarding behaviourally obtained estimates of BM I/O functions, on- and off-frequency maskers have been used in a forward-masking paradigm. Using high-level off-frequency maskers may thereby lead to an overestimation of compression by as much as a factor of 2 [74]. Furthermore, physiological recordings in non-human mammals demonstrated that the amount of forward masking in the AN is not large enough to account for the behavioural forward masking, whereas physiological masking at the level of the inferior colliculus seems to reflect behavioural patterns [20]. Thus, behaviourally estimated BM I/O functions derived from forward-masking paradigms may reflect information about on- and off-frequency BM processing and about mechanisms beyond cochlear processing. A modelling analysis such as the one provided in the current study for evaluating the potential peripheral neural generators contributing to the EFRs might also be useful for exploring the contributing factors underlying level-growth functions obtained with, e.g., DPOAEs and psychoacoustical masking measures that have been used to estimate cochlear compression. For instance, cochlear transmission-line models [e.g., 75,76] could be used to validate methods based on OAEs, and an integration of the current AN model [40,41] with signal detection theory methods [e.g., 77,78] could be applied to validate behavioural estimates of compression.

Conclusion
The recorded EFR magnitude-level functions showed a compressive growth with slopes comparable to compression estimates using direct physiological recordings in non-human mammals, and to group-averaged results from psychoacoustical methods as well as DPOAE magnitude-level functions. Moreover, in the case of a mild threshold elevation, the estimated slopes were higher than in the case of normal hearing, also consistent with the interpretation of a less compressive response due to a reduction of cochlear gain. However, simulations obtained with a computational AN model revealed that the slope estimated from the EFR magnitude-level function contained significant off-frequency contributions, which caused a saturation above 60 dB SPL observed at the carrier frequencies of 0.5, 1 and 2 kHz, but not at the highest carrier frequency of 4 kHz. The results demonstrate that methods based on responses added across tonotopic locations, like EFRs, cannot reflect an on-frequency phenomenon when medium-to-high stimulus levels are used. Furthermore, steeper growth functions in the recorded EFRs, which have been associated with reduced compression, were also observed, in the framework of the model, as a consequence of only IHC dysfunction. Overall, the results from the present study strongly suggest that the compressive slope estimated from EFR magnitude-level functions cannot be used as a proxy of cochlear compression.


Methods

Listeners
Twenty adult listeners (10 females, 34.0 ± 15.9 years) participated in this study, separated into groups of 13 NH (8 females, 24 ± 3.2 years) and 7 HI (2 females, 56.2 ± 12.7 years) listeners. All NH listeners had audiometric thresholds below 15 dB HL at octave frequencies between 125 and 8000 Hz. All HI listeners were selected to have normal-hearing thresholds (≤ 20 dB HL) below 4000 Hz and a mild hearing impairment at 4000 Hz and above, with audiometric thresholds between 20 and 45 dB HL.
All participants provided informed consent and all experiments were approved by the Science-Ethics Committee for the Capital Region of Denmark (reference H-3-2013-004). The experiments were carried out in accordance with the corresponding guidelines and regulations on the use of human subjects. The listeners were economically compensated for participating in the experiments.

Apparatus
The EFR recordings were performed in a dark, soundproof and electrically shielded booth, where the participants were seated in a comfortable reclined armchair. The participants were instructed to close their eyes and relax to avoid moving and were allowed to sleep. The recording and data analysis routines were implemented in MATLAB (The MathWorks, Inc., Natick, Massachusetts, USA). All acoustic stimuli were generated in MATLAB and presented using PLAYREC 2.1 (Humphrey, R., www.playrec.co.uk, 2008-2014) via an RME Fireface UCX soundcard (sampling rate f s|sound = 48 kHz, 24 bits). The analogue acoustic signal was passed to a headphone buffer (HB7, Tucker-Davis Technologies) with a gain of -6 dB (stimulus levels > 55 dB SPL) or -27 dB (stimulus levels ≤ 55 dB SPL). The attenuated signal was presented through a pair of ER-2 insert earphones (Etymotic Research Inc.) mounted on an ER-10B+ low-noise DPOAE microphone probe (Etymotic Research Inc.) with ER10-14 foam eartips.
EFRs were recorded using a Biosemi ActiveTwo system (sampling rate f s|EFR = 8192 Hz, 24 bits). Five active pin-type electrodes were used. Three electrodes were mounted at positions P10, P9 (right and left extremes of the parietal coronal line) and Cz (vertex) following the 10-20 system [79]. The remaining two electrodes (common mode sense, CMS, and driven right leg, DRL) were placed at the centre of the parieto-occipital coronal line (on either side of electrode POz). The electrodes CMS and DRL form a feedback loop that replaces the "ground" electrode (the zero) in conventional EEG systems [80]. Conductive electrode gel was applied and the offset voltage was stabilised at < 20 mV for each electrode. The recorded EEG signals were downsampled by a factor of 2, and low-pass filtered by the EEG amplifier with a bandwidth limit of 1/5th of the final sampling frequency (about 820 Hz). The EEG data were stored to hard disk. The results shown in this study represent the Cz-P10 potential in response to right-ear stimulation, and the Cz-P9 potential in response to left-ear stimulation.

EFR recordings
The EFR data were recorded in two sessions. In the first session (approx. two hours in duration), the EFR magnitude-level functions were recorded in the NH listeners using input levels in the range from 20 to 80 dB SPL, in steps of 5 dB. The second recording session (approx. 45 minutes in duration) took place on a different day usually about one month later than the first session. Three input levels (35, 55 and 70 dB SPL) were recorded again in the same NH listeners to evaluate the repeatability of the results. In all NH listeners, the right ear was stimulated. In the HI group, the multi-frequency recording was carried out in the level range from 30 to 80 dB SPL, in steps of 5 dB. Here, the recording ear was chosen depending on the individual listener's audiogram, such that the amount of sensitivity loss due to the hearing impairment was as similar as possible within the group. There was no second recording session to evaluate repeatability for this subject group.
A multi-frequency stimulus consisting of four SAM tones was used. The SAM tones had carrier frequencies of 498, 1000, 2005 and 4011 Hz (referred to as 500, 1000, 2000 and 4000 Hz throughout this work), modulated at 81, 87, 93 and 98 Hz, respectively. The modulation depth was set to m = 85%. The four SAM tones were calibrated individually (B&K 4157 ear simulator) and added, resulting in a final stimulus level that was 6 dB higher than that of each individual SAM tone. The stimuli were digitally generated as 1-s long epochs and continuously presented to the listener, where a trigger signal marked the beginning of each new epoch for later averaging. The total stimulus duration depended on the stimulus level and was chosen, based on a pilot study, to achieve a statistically significant EFR SNR. Table 1 shows the stimulus durations used for each input level in the EFR recordings.
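The stimulus construction described above can be illustrated with a short sketch. It is not the authors' MATLAB code: the function names are hypothetical, the modulator and carrier phases are assumed to be zero, and scaling each SAM tone to unit RMS stands in for the individual acoustic calibration. It also checks that adding four equal-level SAM tones with non-overlapping spectra raises the overall level by 10·log10(4) ≈ 6 dB.

```python
import numpy as np

FS_SOUND = 48_000                     # sound-card sampling rate in Hz (from the text)
CARRIERS = (498, 1000, 2005, 4011)    # carrier frequencies in Hz (from the text)
MOD_RATES = (81, 87, 93, 98)          # modulation rates in Hz (from the text)
MOD_DEPTH = 0.85                      # modulation depth m (from the text)


def rms(x):
    """Root-mean-square value of a signal."""
    return np.sqrt(np.mean(x ** 2))


def sam_tone(fc, fm, m=MOD_DEPTH, fs=FS_SOUND, dur=1.0):
    """One epoch of a sinusoidally amplitude-modulated tone, scaled to unit RMS.

    Unit RMS is an assumed stand-in for calibrating each tone to the same level.
    """
    t = np.arange(int(round(fs * dur))) / fs
    x = (1.0 + m * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)
    return x / rms(x)


def multi_frequency_stimulus():
    """Sum of the four SAM tones (one 1-s epoch)."""
    return sum(sam_tone(fc, fm) for fc, fm in zip(CARRIERS, MOD_RATES))


stimulus = multi_frequency_stimulus()
# Level of the summed stimulus relative to a single (unit-RMS) SAM tone.
gain_db = 20 * np.log10(rms(stimulus))
```

Because every carrier and modulation frequency is an integer number of hertz, each 1-s epoch contains a whole number of cycles of all components, so the epoch can be looped without phase discontinuities.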
The recorded EEG data were filtered using a fourth-order Butterworth digital band-pass filter with cut-off frequencies of 60 and 400 Hz, applied serially in the forward and backward directions to yield zero phase. All recorded epochs with a maximum absolute amplitude exceeding a voltage threshold of 80 µV in any of the channels were rejected to remove artefacts and noisy events from the average pool. Sixteen 1-s long epochs of EEG data were concatenated to form a trial, in order to achieve a higher frequency resolution in the EFR spectrum analysis. To increase the SNR, the 16-s long trials were averaged using weighted ensemble averaging, where the inverse of the variance of each 1-s long epoch was used as its weight [81].
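A minimal single-channel sketch of this preprocessing chain is given below, using SciPy rather than the authors' MATLAB routines. Several details are assumptions: the function names are hypothetical, `butter(4, ...)` is taken as the "fourth-order" design, rejection is applied to one channel only, and the surviving epochs are simply regrouped in order into 16-epoch trials. The synthetic recording at the end is invented for illustration.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS_EEG = 4096            # Hz: 8192 Hz recording downsampled by a factor of 2
EPOCH_LEN = FS_EEG       # samples in one 1-s epoch
EPOCHS_PER_TRIAL = 16    # sixteen epochs form one 16-s trial
REJECT_UV = 80.0         # artefact rejection threshold in microvolts


def bandpass(eeg_uv, fs=FS_EEG):
    """Zero-phase 60-400 Hz Butterworth band-pass (forward-backward filtering)."""
    sos = butter(4, [60, 400], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, eeg_uv)


def weighted_trial_average(eeg_uv):
    """Reject noisy epochs, build 16-s trials and average them with 1/variance weights."""
    n_epochs = len(eeg_uv) // EPOCH_LEN
    epochs = eeg_uv[: n_epochs * EPOCH_LEN].reshape(n_epochs, EPOCH_LEN)
    epochs = epochs[np.max(np.abs(epochs), axis=1) <= REJECT_UV]
    n_trials = len(epochs) // EPOCHS_PER_TRIAL
    if n_trials == 0:
        raise ValueError("fewer than 16 artefact-free epochs")
    trials = epochs[: n_trials * EPOCHS_PER_TRIAL].reshape(
        n_trials, EPOCHS_PER_TRIAL, EPOCH_LEN)
    # One weight per 1-s epoch: the inverse of that epoch's variance.
    weights = 1.0 / np.var(trials, axis=2, keepdims=True)
    averaged = np.sum(weights * trials, axis=0) / np.sum(weights, axis=0)
    return averaged.reshape(-1)          # one 16-s averaged trial


# Synthetic 64-s recording: 81-Hz response in noise plus one large artefact burst.
rng = np.random.default_rng(0)
t = np.arange(64 * FS_EEG) / FS_EEG
eeg = np.sin(2 * np.pi * 81 * t) + rng.normal(0.0, 2.0, t.size)
eeg[5 * FS_EEG + 1024: 5 * FS_EEG + 3072] += 300.0 * np.sin(2 * np.pi * 150 * t[:2048])

avg_trial = weighted_trial_average(bandpass(eeg))
peak_hz = np.argmax(np.abs(np.fft.rfft(avg_trial))) * FS_EEG / avg_trial.size
```

Weighting by the inverse epoch variance reduces the influence of noisy epochs that fall below the rejection threshold, and the 16-s trial length gives a spectral resolution of 1/16 Hz.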
An F-test was used to identify statistically significant responses by comparing the spectral power at the modulation frequency (EFR frequency) to the noise power in the range of 3 Hz below and above the modulation frequency [82,28]. The power ratio (F-ratio) was calculated as the power in the EFR frequency bin divided by the averaged power in the 3 Hz below and above the modulation frequency (96 bins). The probability (p) of the EFR power being different from the noise power was calculated as 1 − F, with F representing the cumulative distribution function of the power ratio. The F-test was defined to be positive if p ≤ 0.01 (F-ratio at or above the critical value of 4.8333, SNR > 5.84 dB), implying that the power at the EFR frequency was significantly different from the noise estimate. The F-test was custom implemented in MATLAB.
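The detection criterion can be sketched as follows. This is an illustrative Python version, not the custom MATLAB implementation. The degrees of freedom (2, 96) are an assumption, chosen because they reproduce the critical value of 4.8333 quoted above; the SNR is taken as 10·log10(F − 1), which reproduces the quoted 5.84 dB. The function name and the synthetic trial are hypothetical.

```python
import numpy as np
from scipy.stats import f as f_dist


def efr_f_test(trial, fs, f_mod, half_width_hz=3.0, alpha=0.01, dfn=2, dfd=96):
    """F-test of the power at f_mod against the mean power of neighbouring bins.

    dfn and dfd are assumed degrees of freedom (see the lead-in).
    """
    n = len(trial)
    power = np.abs(np.fft.rfft(trial)) ** 2
    resolution = fs / n                          # 1/16 Hz for a 16-s trial
    k_sig = int(round(f_mod / resolution))       # bin of the modulation frequency
    k_half = int(round(half_width_hz / resolution))
    # 48 bins below and 48 bins above the signal bin (96 noise bins in total).
    noise_bins = np.r_[k_sig - k_half:k_sig, k_sig + 1:k_sig + k_half + 1]
    f_ratio = power[k_sig] / power[noise_bins].mean()
    p_value = f_dist.sf(f_ratio, dfn, dfd)       # 1 - CDF of the F distribution
    return f_ratio, p_value, p_value <= alpha


f_crit = f_dist.ppf(1 - 0.01, 2, 96)             # approx. 4.833
snr_crit_db = 10 * np.log10(f_crit - 1)          # approx. 5.84 dB

# Synthetic 16-s trial: a unit-amplitude 81-Hz component in unit-variance noise.
fs, n = 4096, 16 * 4096
rng = np.random.default_rng(1)
trial = np.sin(2 * np.pi * 81 * np.arange(n) / fs) + rng.normal(0.0, 1.0, n)
f_ratio, p_value, significant = efr_f_test(trial, fs, 81.0)
```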
Compression was estimated from the slope of the EFR magnitude-level function. To estimate this slope, a piecewise linear function with two segments was fitted to each individual EFR magnitude-level function using a non-linear least-squares method. The function is described by:

$$
f(L_s) =
\begin{cases}
b_y + s_1 \, (L_s - b_x), & L_s \le b_x \\
b_y + s_2 \, (L_s - b_x), & L_s > b_x
\end{cases}
$$

where $L_s$ represents the stimulus input level, $s_1$ the lower slope, $s_2$ the upper slope, and $b_x$ and $b_y$ the values on the abscissa and the ordinate at the break-point, respectively. Motivated by the BM I/O function characteristics observed either in direct animal physiological recordings [e.g., 10] or used for human psychoacoustical estimates of cochlear compression [e.g., 13,14,7,15], the lower slope was forced to be larger than the upper slope (i.e., $s_1 > s_2$); otherwise a single-slope (first-order polynomial) model was used. Moreover, if a single-slope model provided a better fit than the two-slope piecewise function (based on an adjusted $R^2$ statistic) for a given EFR magnitude-level function, this simpler model was used. The two-slope piecewise function was only considered if at least 3 significant data points were present on each segment of the fitting function. Only EFR readings significantly above the background noise floor were included in the fitting procedure.

The accuracy of the slope fitted to the EFR magnitude-level function depends on the test-retest variability of each individual EFR data point. To estimate the measurement variability, the repeatability of the EFR responses at three stimulus levels was assessed as proposed by [45]. The test-retest differences were plotted against the mean response amplitude of the two test runs. This method defines the test repeatability coefficient as twice the standard deviation of the differences.
Repeatability can also be expressed as a percentage of the mean amplitude within a given frequency-level group (repeatability coefficient / mean of the group · 100), termed repeatability variability [47,46].
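Returning to the slope estimate, the two-segment fit and the model selection rules can be sketched as below. This is a simplified stand-in for the authors' non-linear least-squares routine: the break-point is scanned over a grid (the 0.5-dB step is an assumption) and the remaining parameters are solved by linear least squares at each candidate. All function names and the synthetic growth function are hypothetical.

```python
import numpy as np


def adjusted_r2(y, y_fit, n_params):
    """Adjusted R^2 for a model with n_params free parameters (intercept included)."""
    n = len(y)
    ss_res = np.sum((y - y_fit) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - (ss_res / (n - n_params)) / (ss_tot / (n - 1))


def fit_two_slopes(level, mag, min_points=3, step=0.5):
    """Scan candidate break-points b_x; solve (b_y, s_1, s_2) linearly at each one."""
    best = None
    for bx in np.arange(level.min(), level.max(), step):
        low = level <= bx
        if low.sum() < min_points or (~low).sum() < min_points:
            continue                      # need at least 3 points on each segment
        d = level - bx
        design = np.column_stack([np.ones_like(d), np.minimum(d, 0), np.maximum(d, 0)])
        coef, *_ = np.linalg.lstsq(design, mag, rcond=None)
        fitted = design @ coef
        sse = np.sum((fitted - mag) ** 2)
        if best is None or sse < best[0]:
            best = (sse, bx, coef, fitted)
    return best


def estimate_growth(level, mag_db):
    """Choose between the single-slope and two-slope models for one growth function."""
    level = np.asarray(level, dtype=float)
    mag = np.asarray(mag_db, dtype=float)
    slope, intercept = np.polyfit(level, mag, 1)
    r2_single = adjusted_r2(mag, slope * level + intercept, 2)
    result = {"model": "single-slope", "slope": slope, "adj_r2": r2_single}
    best = fit_two_slopes(level, mag)
    if best is not None:
        _, bx, (by, s1, s2), fitted = best
        r2_two = adjusted_r2(mag, fitted, 4)
        # Keep the two-slope model only if s_1 > s_2 and it fits better.
        if s1 > s2 and r2_two > r2_single:
            result = {"model": "two-slope", "s1": s1, "s2": s2,
                      "bx": bx, "by": by, "adj_r2": r2_two}
    return result


# Synthetic growth function: 1 dB/dB below 45 dB SPL, 0.25 dB/dB above.
levels = np.arange(20, 85, 5)
mags = np.where(levels <= 45, -20 + 1.0 * (levels - 45), -20 + 0.25 * (levels - 45))
fit = estimate_growth(levels, mags)
```

Only data points that passed the significance test would be passed to `estimate_growth`; that filtering step is left out of the sketch.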
To test whether the estimated EFR slopes at two different frequencies were statistically different from each other, a two-sample permutation test for equality of the means was used [43,44]. The test evaluates the null hypothesis that the estimated EFR slopes for the two frequencies are a random partition of the pooled data from both frequencies, against the alternative hypothesis that the EFR slopes at one frequency come from a population with a different mean than those at the other frequency. The test was performed with 100000 permutations using the Permute package implemented in Python [83].
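The logic of the permutation test can be illustrated with plain NumPy. This sketch does not use or reproduce the Permute package's API: the function name is hypothetical, the two-sided p-value with the "+1" correction is one common convention and is an assumption here, and the slope values are invented for illustration.

```python
import numpy as np


def permutation_test_means(x, y, n_perm=100_000, seed=0):
    """Two-sided two-sample permutation test for a difference in means."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    observed = x.mean() - y.mean()
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)            # random relabelling of the pooled data
        diff = perm[: len(x)].mean() - perm[len(x):].mean()
        # Small tolerance so that ties with the observed value are counted.
        count += int(abs(diff) >= abs(observed) - 1e-12)
    # The observed partition counts as one of the permutations.
    return observed, (count + 1) / (n_perm + 1)


# Hypothetical slope estimates (dB/dB) at two carrier frequencies.
slopes_a = [0.20, 0.25, 0.30, 0.22, 0.28, 0.24, 0.26, 0.21]
slopes_b = [s + 1.0 for s in slopes_a]

_, p_diff = permutation_test_means(slopes_a, slopes_b, n_perm=10_000)
_, p_same = permutation_test_means(slopes_a, slopes_a, n_perm=10_000)
```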

AN model
A humanised phenomenological AN model [40,41] was used to simulate the activity of the AN. The implementation of the AN model is the same as described in [66]. In short, the model computes a total of 32000 AN fibres distributed non-uniformly (with a higher density of fibres at mid CFs, based on [84]) across 200 CFs (cochlear segments or IHCs) ranging from 0.2 to 20 kHz. For each CF, 61% high-SR fibres, 23% medium-SR fibres and 16% low-SR fibres were considered [85]. Hair-cell impairment was implemented by fitting the listener's audiogram using the fitaudiogram2 MATLAB function implemented by [40]. This function allows the proportion of threshold elevation attributed to either OHC or IHC dysfunction to be defined.
To simulate EFR AN magnitude-level functions, the same stimulus as used in the recordings, consisting of four simultaneous SAM tones, was presented to the model, but with a duration of 1.2 s. The stimuli were calibrated and presented to the AN model at levels ranging from 5 to 100 dB SPL in steps of 5 dB. The spike trains obtained from the independently computed AN neurons for a given CF and fibre type were added together to obtain the summed AN activity at that CF, which is comparable to the peri-stimulus time histogram (PSTH) used to describe data from single neurons in experimental recordings. To analyse the steady-state encoding of the modulation, a 1-s long steady-state response, excluding on- and offsets, was considered. A Fast Fourier Transform (FFT) was performed on the resulting synaptic output and the magnitude at the modulation frequency bin was taken as the simulated EFR AN.
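The read-out of the simulated EFR AN from the summed activity at one CF can be sketched as follows. This is an illustration only: the function name is hypothetical, the 1-s analysis window is assumed to be centred within the 1.2-s response, and the sampling rate and the synthetic PSTH are made up for the example rather than taken from the AN model.

```python
import numpy as np


def efr_an_magnitude(psth, fs, f_mod, stim_dur=1.2, analysis_dur=1.0):
    """Magnitude at the modulation frequency of the steady-state part of a PSTH."""
    skip = int(round((stim_dur - analysis_dur) / 2 * fs))   # samples dropped at onset
    n = int(round(analysis_dur * fs))
    segment = np.asarray(psth, dtype=float)[skip: skip + n]
    # Single-sided amplitude spectrum of the 1-s segment (1-Hz resolution).
    spectrum = 2.0 * np.abs(np.fft.rfft(segment)) / n
    k_mod = int(round(f_mod * n / fs))                       # modulation frequency bin
    return spectrum[k_mod]


# Synthetic summed activity: mean rate of 100 with a 30-unit modulation at 81 Hz.
fs_model = 10_000                                            # assumed sampling rate (Hz)
t = np.arange(int(1.2 * fs_model)) / fs_model
psth = 100.0 + 30.0 * np.sin(2 * np.pi * 81 * t)

mag_81 = efr_an_magnitude(psth, fs_model, 81.0)
mag_87 = efr_an_magnitude(psth, fs_model, 87.0)
```

With a 1-s analysis window, each integer modulation frequency falls exactly on one FFT bin, so no spectral leakage correction is needed in this sketch.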

In order to easily visualise which CFs may contribute to the total EFR AN response, heatmap plots showing EFR AN magnitude as a function of CF and stimulus level were used (see Fig. 5a-g). In those plots, for each combination of CF and stimulus level, the colour represents the magnitude of the EFR AN obtained from the frequency bin corresponding to the modulation frequency of interest (different for each carrier frequency) in the spectrum of the summed AN activity at that CF (the PSTH for that particular CF). The simulated EFR AN magnitude-level functions (see Fig. 5h-k) were obtained by summing the simulated AN activity across all CFs and reading the magnitude at the modulation frequency bin from the spectrum of the summed PSTHs. For the analysis of the on- and off-frequency bands, the same procedure was applied to the summed AN activity of all CFs within the on-frequency band, or to the summed AN activity of all CFs outside the on-frequency band (off-frequency). The on-frequency band was defined as the CFs ranging from 1/2 octave below to 1/3 octave above the carrier frequency of the SAM tone (a fractional bandwidth of ≈ 28%), based on velocity-intensity functions recorded directly in the BM of non-human animals [

Figure 10. Simulated EFR at the level of the AN (EFR AN) obtained with four simultaneously presented SAM tones, as used in the experiments, for NH and HI listeners assuming only IHC dysfunction. Same representation as in Fig. 9, but assuming only IHC dysfunction when adjusting the AN model parameters to account for the mean HI audiogram values. Axis labels: characteristic frequency (kHz); EFR AN magnitude (dB in a.u.).

Table 3. Repeatability values for all NH listeners at stimulation levels of 35, 55 and 70 dB SPL. EFR repeatability coefficient in nV. Mean of the test-retest EFR amplitudes for each level group in dB re 1 µV ± repeatability coefficient, also converted to dB (in brackets). Repeatability variability in percentage and repeatability variability derived from [47].
