Introduction

The auditory system integrates information across the two ears. This operation confers several benefits, including increased sensitivity to low-intensity sounds1 and the ability to infer the location and motion direction of sound sources from interaural time differences2. In some animals, such as bats and dolphins, echolocation can be precise enough to permit navigation through the environment, and there are reports of visually impaired humans using a similar strategy3,4, which requires both ears5. But what precisely is the algorithm that governs the combination of sounds across the ears? The nonlinearities inherent in sensory processing mean that simple linear signal addition is unlikely. This study uses complementary techniques (psychophysics, steady-state electroencephalography (EEG) and computational modelling) to probe the neural operations that underpin binaural summation of amplitude-modulated signals.

Classical psychophysical studies demonstrated that the threshold for detecting a very faint tone is lower when the tone is presented binaurally versus monaurally. Shaw et al.1 presented signals to the two ears that were equated for each ear’s individual threshold sound level when presented monaurally. This accounted for any differences in sensitivity (or audibility) between the ears, and revealed that summation (the improvement in sensitivity afforded by binaural presentation) was approximately 3.6 dB (a factor of 1.5). Subsequent studies have provided similar or slightly lower values6,7,8, and there is general agreement that two ears are better than one at detection threshold9. This difference persists above threshold, with intensity discrimination performance being better binaurally than monaurally10. Furthermore, binaural sounds are perceived as being slightly louder than monaural sounds, though typically less than twice as loud11,12,13,14.

When a carrier stimulus (typically either a pure-tone or broadband noise) is modulated in amplitude, neural oscillations at the modulation frequency can be detected at the scalp15,16,17,18,19, being typically strongest at the vertex in EEG recordings20. This steady-state auditory evoked potential (SSAEP) is greatest around 40 Hz20,21 and increases monotonically with increasing modulation depth17,18. For low signal modulation frequencies (<55 Hz), brain responses are thought to reflect cortical processes15,20,21,22. The SSAEP has been used to study binaural interactions, showing evidence of interaural suppression23,24 and increased responses from binaurally summed stimuli17,22,25.

The perception of amplitude-modulated stimuli shows similar properties to the perception of pure-tones in terms of binaural processing. For example, binaural sensitivity to amplitude modulation (AM) is better than monaural sensitivity26,27, and the perceived modulation depth is approximately the average of the two monaural modulation depths over a wide range28. Presenting two different modulation frequencies to the left and right ears can produce the percept of a ‘binaural beat’ pattern at the difference intermodulation frequency (the highest minus the lowest frequency), suggesting that the two modulation frequencies are combined centrally29. Finally, both the detection of intensity increments and the detection of AM30 follow Weber-like behaviour31,32 at higher pedestal levels (i.e. Weber fractions for discrimination are approximately constant with pedestal level), similar to that typically reported for intensity discrimination33. However, despite these observations, detailed investigation and modelling of the binaural processing of amplitude-modulated tones is lacking.

Computational predictions for both psychophysical and electrophysiological results can be obtained from parallel work that considers the combination of visual signals across the left and right eyes. In several previous studies, a single model of binocular combination has been shown to successfully account for the pattern of results from psychophysical contrast discrimination and matching tasks34,35, as well as steady-state EEG experiments36. The model, shown schematically in Fig. 1a, takes contrast signals (sinusoidal modulations of luminance) from the left and right eyes, which mutually inhibit each other before being summed as follows:

$$resp=\frac{C_{L}^{p}}{Z^{q}+C_{L}^{q}+\omega C_{R}^{q}}+\frac{C_{R}^{p}}{Z^{q}+C_{R}^{q}+\omega C_{L}^{q}},$$
(1)

where resp is the model response, CL and CR are the contrast signals in the left and right eyes respectively, ω is the weight of interocular suppression, Z is a constant governing the gain of the model, and p and q are exponents with the typical constraint that p > q. In all experiments in which the two signals have the same visual properties34,35,36, the weight of interocular suppression has a value around ω = 1 (and is assumed to be effectively instantaneous, though in reality is likely subject to some delay).
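To make the algorithm concrete, the model response can be computed in a few lines. The following is a minimal numpy sketch of Eq. 1; the parameter values are illustrative only (chosen to satisfy p > q), not the fitted values reported in the Results.

```python
import numpy as np

def model_response(CL, CR, p=2.4, q=2.0, Z=10.0, omega=1.0):
    """Two-channel gain-control model of Eq. 1 (sketch).

    CL, CR : signal intensities in the left and right channels
             (e.g. % modulation depth or % contrast).
    omega  : weight of suppression between channels.
    Z      : constant governing the gain of the model; p > q.
    """
    CL, CR = np.asarray(CL, dtype=float), np.asarray(CR, dtype=float)
    return (CL**p / (Z**q + CL**q + omega * CR**q) +
            CR**p / (Z**q + CR**q + omega * CL**q))

# With omega = 1, stimulating both channels gives almost the same
# output as stimulating one channel at high intensities (Fig. 1b):
print(model_response(64.0, 0.0))    # monaural/monocular
print(model_response(64.0, 64.0))   # binaural/binocular
```

Reducing ω below 1 weakens the cross-channel suppression, so the binaural output then exceeds the monaural output even at high intensities.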

Figure 1

Schematic of signal combination model and qualitative predictions. Panel (a) shows a diagram of the signal combination model, which incorporates weighted inhibition between left and right channels before signal combination (Σ). Arrowheads indicate divisive inhibition. Panel (b) shows the predictions of this model for various combinations of inputs to the left and right channels, as described in the text. Percentage units could reflect Michelson contrast, amplitude modulation depth, or any other unit representing stimulus intensity. Predictions for discrimination of increases in modulation depth for similar conditions are shown in panel (c).

Whereas vision studies typically modulate luminance relative to a mean background level (i.e. contrast), in hearing studies the amplitude modulation of a carrier waveform can be used to achieve the same effect. We can therefore test empirically whether binaural signal combination is governed by the same basic algorithm (in the tradition associated with David Marr37) as binocular signal combination by replacing the C terms in Eq. 1 with modulation depths for AM stimuli.

The response of the model for different combinations of inputs is shown in Fig. 1b, with predictions being invariant to the sensory modality (hearing or vision). In the monaural/monocular (“mon”) condition (blue), signals are presented to one channel only. In the binaural/binocular (“bin”) condition (red), equal signals are presented to the two channels. In the dichotic/dichoptic (“dich”) condition (green), a signal is presented to one channel, with a fixed high amplitude ‘masker’ presented to the other channel throughout. For ω = 1, the mon and bin conditions produce similar outputs, despite a doubling of the input (two channels vs one). This occurs because the strong suppression between channels offsets the gain in the input signal. This pattern of responses is consistent with the amplitudes recorded from steady-state visual evoked potential experiments testing binocular combination in humans36.

The model response can also be used to predict the results of psychophysical increment detection experiments in which thresholds are measured for discriminating increases in the intensity of a ‘pedestal’ stimulus (e.g. a stimulus of fixed intensity that is present in both intervals of a trial). In these experiments, thresholds are defined as the horizontal translation required to produce a unit increase vertically along the functions in Fig. 1b. In other words, psychophysical performance measures the gradient of the contrast response function. These predictions are shown in Fig. 1c and have a characteristic ‘dipper’ shape, in which thresholds first decrease (facilitation), before increasing (masking). The mon and bin functions converge at higher pedestal levels, and the dich function shows strong threshold elevation (a slope of 1 on log-log axes) owing to the suppression between the two channels (when ω = 1). Again, this pattern of functions is consistent with those reported in psychophysical studies of binocular vision35.
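These dipper predictions can be generated numerically: given a criterion response increase σ (an arbitrary placeholder value below), a root-finder returns the stimulus increment that produces it. A sketch reusing model_response from above:

```python
from scipy.optimize import brentq

def increment_threshold(pedestal, sigma=0.5, condition="bin", **params):
    """Increment dC needed to raise the model response by sigma (sketch).

    'mon': pedestal and target presented to one channel only.
    'bin': pedestal and target presented to both channels.
    """
    if condition == "mon":
        base = model_response(pedestal, 0.0, **params)
        f = lambda dC: model_response(pedestal + dC, 0.0, **params) - base - sigma
    else:
        base = model_response(pedestal, pedestal, **params)
        f = lambda dC: (model_response(pedestal + dC, pedestal + dC, **params)
                        - base - sigma)
    return brentq(f, 1e-6, 1e4)   # bracket and solve for the root numerically

# Tracing out a dipper function (cf. Fig. 1c):
for ped in [0.0, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]:
    print(ped, increment_threshold(ped, condition="bin"))
```

Facilitation falls out of this scheme wherever the response function is steepest, and masking wherever it saturates.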

The present study uses two complementary methods – psychophysical AM depth discrimination, and steady-state auditory evoked potentials – to investigate binaural signal combination in the human brain. These methods allow us to measure the direct neural response, and also its perceptual correlates, for the binaural combination of AM stimuli. The results are compared with the predictions of the computational model35,36 described above (see Fig. 1) and modifications to the model are discussed in the context of functional constraints on the human auditory system. Using a functional model of this type allows us to focus on the algorithm involved in binaural signal combination, without considering more generic properties of auditory processing such as the direct response to the carrier. This principled, model-driven approach positions our understanding of binaural summation in a broader context of work on sensory signal combination in the brain.

Methods

Apparatus & stimuli

Auditory stimuli were presented over Sennheiser (HD 280 pro) headphones (Sennheiser electronic GmbH, Wedemark, Germany), and had an overall presentation level of 80 dB SPL. An AudioFile device (Cambridge Research Systems Ltd., Kent, UK) was used to generate the stimuli with a sample rate of 44100 Hz. Stimuli consisted of a 1-kHz pure-tone carrier, amplitude modulated at a modulation frequency of either 40 Hz or 35 Hz (see Fig. 2), according to the equation:

$$w(t)=0.5\,(1+m\cos (2\pi f_{m}t+\pi ))\,\sin (2\pi f_{c}t)$$
(2)

where fm is the modulation frequency in Hz, fc is the carrier frequency in Hz, t is time in seconds, and m is the modulation depth, with a value from 0–1 (though hereafter expressed as a percentage, 100*m). The modulation frequencies were chosen because they produce robust steady-state responses for auditory stimuli19,20,21,25. We chose not to compensate for overall stimulus power (as is often done for AM stimuli, e.g.38) for three reasons (see also30). First, such compensation mostly affects AM detection thresholds at much higher modulation frequencies than we used here (e.g. see Figure A1 of39). Second, compensation makes implicit assumptions about the cues used by the participant in the experiment, and we prefer to make any such cues explicit through computational modelling. Third, we confirmed in a control experiment that compensation had no systematic effect on thresholds (see Supplementary Fig. S1). The modulation depth and the assignments of modulation frequencies delivered to the left and right ears were varied parametrically across different conditions of the experiments.
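Eq. 2 translates directly into numpy. This sketch synthesises the stimuli; the variable names and the unmodulated-carrier example are ours:

```python
import numpy as np

fs = 44100                          # sample rate in Hz (AudioFile device)
t = np.arange(int(fs * 0.5)) / fs   # 500-ms stimulus (psychophysics trials)

def am_tone(m, fm=40.0, fc=1000.0):
    """Amplitude-modulated pure tone of Eq. 2 (sketch).

    m is modulation depth (0-1); the +pi phase offset starts the
    modulation envelope at its minimum.
    """
    return 0.5 * (1 + m * np.cos(2 * np.pi * fm * t + np.pi)) \
               * np.sin(2 * np.pi * fc * t)

signal_ear = am_tone(m=0.5)    # 50% modulation at 40 Hz to one ear
carrier_ear = am_tone(m=0.0)   # unmodulated 1-kHz carrier to the other ear
```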

Figure 2

Summary of conditions and stimuli. Panel (a) illustrates the arrangement of pedestal (blue) and target (orange) modulations for the psychophysics experiment in the standard (pedestal only) interval (left) and signal (pedestal + target) interval (right) for four different interaural arrangements (rows). In all cases, the modulation frequency was 40 Hz and participants were asked to indicate the interval containing the target. A range of pedestal modulation depths were tested, with target modulation depths determined by a staircase algorithm. Panel (b) shows stimulus arrangements for six conditions in the EEG experiment. Stimuli designated ‘signal’ had different modulation depths in different conditions, whereas stimuli designated ‘masker’ had a fixed modulation depth (m = 50%) for all signal modulation depths. In all experiments stimulation was counterbalanced across the two ears, so the left ear/right ear assignments here are nominal.

EEG data were recorded with a sample frequency of 1 kHz using a 64-electrode Waveguard cap and an ANT Neuroscan (ANT Neuro, Netherlands) amplifier. Activity in each channel was referenced to the whole head average. Signals were digitised and stored on the hard drive of a PC for later offline analysis. Stimulus onset was coded on the EEG trace using digital triggers sent via a BNC cable directly from the AudioFile.

Psychophysical procedures

In the psychophysical discrimination experiment, participants heard two amplitude-modulated stimuli presented sequentially using a two-alternative-forced-choice (2AFC) design. The stimulus duration was 500 ms, with a 400 ms interstimulus interval (ISI) and a minimum inter-trial interval of 500 ms. One interval contained the standard stimulus, consisting of the pedestal modulation depth only. The other interval contained the signal stimulus, which comprised the pedestal modulation depth with an additional target increment.

The presentation order of the standard and signal intervals was randomised, and participants were instructed to indicate the interval which they believed contained the target increment using a two-button mouse. A coloured square displayed on the computer screen indicated accuracy (green for correct, red for incorrect). The size of the target increment was determined by a pair of 3-down-1-up staircases, with a step size of 3 dB (where dB units are defined as 20*log10(100*m)), which terminated after 70 trials or 12 reversals, whichever came first. The percentage of correct trials at each target modulation depth was used to fit a cumulative log-Gaussian psychometric function (using Probit analysis) to the data pooled across repetitions. We used this fit to estimate the target modulation that yielded a performance level of 75% correct, which was defined as the threshold. Each participant completed three repetitions of the experiment, producing an average of 223 trials per condition (and an average of 7133 trials in total per participant). This took around 5 hours in total per participant, and was completed across multiple days in blocks lasting around 10 minutes each.
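As an illustration of the threshold estimation step, the cumulative log-Gaussian fit might be implemented as below. This is a least-squares stand-in for the Probit analysis described above, and the data values are hypothetical:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def pf_2afc(log_m, mu, sigma):
    """2AFC psychometric function: chance (50%) for tiny targets,
    rising along a cumulative Gaussian in log modulation depth."""
    return 0.5 + 0.5 * norm.cdf(log_m, loc=mu, scale=sigma)

# hypothetical pooled data: target modulation depths (%) and P(correct)
targets = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
p_correct = np.array([0.52, 0.55, 0.68, 0.86, 0.95, 1.00])

(mu, sigma), _ = curve_fit(pf_2afc, np.log10(targets), p_correct, p0=[0.5, 0.3])
# cdf(mu) = 0.5, so mu is the 75%-correct point of the 2AFC function
print(f"75%-correct threshold: {10**mu:.2f}% modulation")
```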

Four binaural arrangements of target and pedestal were tested, at 8 pedestal modulation depths (100*m = 0, 1, 2, 4, 8, 16, 32 & 64). The arrangements are illustrated schematically in Fig. 2a, and were interleaved within a block at a single pedestal level, so that on each trial participants were not aware of the condition being tested. Note that in all conditions the carrier was presented to both ears, whether or not it was modulated by the pedestal and/or target. This avoids confounding the ears receiving the modulator with those receiving the carrier. In the monaural condition, the pedestal and target modulations were presented to one ear, with the other ear receiving only the unmodulated carrier. The modulated stimulus was assigned randomly to an ear on each trial. In the binaural condition, the pedestal and target modulations were presented to both ears (in phase). Comparison of the binaural and monaural conditions reveals the advantage of stimulating both ears with an AM stimulus, rather than only one. In the dichotic condition, the pedestal modulation was presented to one ear and the target modulation to the other ear. This allows the measurement of masking effects across the ears. Finally, in the half-binaural condition, the pedestal modulation was played to both ears, but the target modulation to only one ear. When compared with the binaural condition, this arrangement keeps the number of ears receiving the pedestal fixed, and changes only the number of ears receiving the target modulation. It therefore does not confound the effects of pedestal and target stimulation across the ears, and offers a more appropriate comparison than does the monaural condition35. Note that for pedestal modulation depths of m = 0, and in the dichotic condition, the target increment was relative to the unmodulated carrier. Because the m = 0 detection condition was identical across the monaural, dichotic and half-binaural conditions, we pooled data across these conditions to obtain a more reliable estimate of threshold. In all conditions, the modulation frequency for the pedestal and the target was 40 Hz.

EEG procedure

In the EEG experiment, participants heard 11-s sequences of amplitude-modulated stimuli interspersed with silent periods of 3 seconds. There were five signal modulation depths (m = 6.25, 12.5, 25, 50 & 100%) and six binaural conditions, as illustrated in Fig. 2b. Each condition was repeated 10 times per participant. In the first three conditions, a single modulation frequency (40 Hz, F1) was used. In the monaural condition, the modulated ‘signal’ tone was presented to one ear, and the unmodulated carrier was presented to the other ear. In the binaural condition, the signal modulation was presented to both ears. In the dichotic condition, the signal modulation was presented to one ear, and a modulated masker with a modulation depth of m = 50% was presented to the other ear. These three conditions permit estimation of summation and gain control properties, as the use of the same modulation frequency in both ears means that signals to the left and right ears will sum.

The remaining three conditions involved modulation at a second modulation frequency (35 Hz, F2), in order to isolate suppressive processes between the ears25. In the cross-monaural condition, F2 was presented to one ear as the signal, and the unmodulated carrier was presented to the other ear (F1 was not presented to either ear). This provides a comparison with the 40-Hz monaural condition, and also a baseline with which to compare the other cross-frequency conditions. In the cross-binaural condition, F1 was presented to one ear and F2 was presented to the other ear but the modulation depth of F1 and F2 was the same. This allows measurement of suppressive interactions between the ears without the complicating factor of signal summation at the same modulator frequency tag. In the cross-dichotic condition, F1 was presented to one ear, and F2 (m = 50%) was presented to the other ear. Again, we expect this condition to reveal suppressive interactions between the ears, as the F2 mask should suppress the F1 target, and reduce the amplitude of the response measured at 40 Hz.

The order of conditions was randomised, and each condition was repeated ten times, counterbalancing the presentation of stimuli to the left and right ears as required. Trials were split across 5 blocks, each lasting 14 minutes, with rest breaks between blocks. EEG data for each trial at each electrode were then analysed offline. The first 1000 ms following stimulus presentation was discarded to eliminate onset transients, and the remaining ten seconds were Fourier transformed and averaged coherently (taking into account the phase angle) across repetitions. This coherent averaging procedure minimises noise contributions (which have random phase across repetitions), and previous studies36 have indicated that this renders artifact rejection procedures unnecessary. The dependent variables were the signal-to-noise ratios (SNR) at the Fourier components corresponding to the two modulation frequencies used in the experiment (40 Hz, F1 and 35 Hz, F2). These were calculated by dividing the amplitude at the frequency of interest (35 or 40 Hz) by the average amplitude in the surrounding 10 bins (±0.5 Hz in steps of 0.1 Hz). The absolute SNRs (discarding phase information) were then used to average across participants.
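The coherent averaging and SNR computation can be sketched for a single electrode as follows (array shapes and names are our assumptions; with 10-s epochs the spectral resolution is 0.1 Hz, so ±5 bins span ±0.5 Hz):

```python
import numpy as np

def coherent_snr(epochs, fs=1000.0, f_target=40.0, n_side=5):
    """SNR at f_target after coherent averaging (sketch).

    epochs : array of shape (n_reps, n_samples) for one electrode,
             the 10-s segments remaining after discarding onsets.
    """
    n = epochs.shape[1]
    spectra = np.fft.rfft(epochs, axis=1) / n     # complex spectrum per repetition
    mean_spec = spectra.mean(axis=0)              # phase-sensitive (coherent) mean
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - f_target)))  # bin nearest frequency of interest
    side = np.r_[k - n_side:k, k + 1:k + n_side + 1]  # 10 neighbouring bins
    return np.abs(mean_spec[k]) / np.abs(mean_spec[side]).mean()
```

Because noise has random phase across repetitions, the coherent mean shrinks the surrounding bins but preserves the phase-locked response at the tagged frequency.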

Participants

Six adult participants (two male; age range 22–40) completed the psychophysics experiment, and twelve adult participants (three male; age range 20–33) completed the EEG experiment. All had self-reported normal hearing, and provided written informed consent. Experimental procedures were approved by the ethics committee of the Department of Psychology, University of York, and were carried out in accordance with relevant guidelines and regulations.

Data and code sharing

Data and analysis scripts are available online at: https://dx.doi.org/10.17605/OSF.IO/KV2TM

Results

Discrimination results are consistent with weak interaural suppression

The results of the AM depth discrimination experiment are shown in Fig. 3 averaged across 6 participants. A 4 (condition) x 8 (pedestal level) repeated measures ANOVA found significant main effects of condition (F = 47.46, p < 0.01, ηG2 = 0.32) and pedestal level (F = 10.77, p < 0.01, ηG2 = 0.58), and a significant interaction between the two factors (F = 8.64, p < 0.01, ηG2 = 0.34). When plotted as increment thresholds on logarithmic axes, the results for binaurally presented modulations (red squares in Fig. 3a) followed a ‘dipper’ shape40, with thresholds decreasing from an average of around 6% at detection threshold to around 2% on a pedestal of 8% (a facilitation effect). At higher pedestal modulations, thresholds increased to more than 16%, indicating a masking effect. Thresholds for the monaural modulation (blue circles in Fig. 3a) followed a similar pattern, but were shifted vertically by an average factor of 1.90 across all pedestal levels. The monaural and binaural dipper handles remained apart, and were approximately parallel, at higher pedestal modulation depths. At detection threshold (pedestal m = 0), the average summation between binaural and monaural modulation (e.g. the vertical offset between the leftmost points in Fig. 3a) was a factor of 1.67 (4.47 dB). This level of summation is above that typically expected from probabilistic combination of independent inputs41, and implies the presence of physiological summation between the ears.
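For reference, this repeated-measures design can be analysed in a few lines from a long-format table; the sketch below uses the pingouin package as one possible route (the file and column names are hypothetical):

```python
import pandas as pd
import pingouin as pg  # one option for repeated-measures ANOVA

# one threshold per participant x condition x pedestal level
df = pd.read_csv("psychophysics_thresholds.csv")  # hypothetical file

aov = pg.rm_anova(data=df, dv="threshold_db",
                  within=["condition", "pedestal"],
                  subject="participant", detailed=True)
print(aov)  # F, p and effect size for each factor and their interaction
```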

Figure 3

Results of the psychophysical AM depth discrimination experiment averaged across six participants (panel (a)). Arrangements of pedestal and target in different conditions were as illustrated in Fig. 2a. The data are replotted as Weber fractions in panel (b) by dividing each threshold by its accompanying pedestal modulation depth. Shaded regions in both panels give ±1 Standard Error (SE) across participants.

Dichotic presentation (pedestal modulation in one ear and target modulation in the other) elevated thresholds very slightly, by a factor of 1.19 at the highest pedestal modulation depths (green diamonds in Fig. 3a), compared to baseline (0% pedestal modulation). This masking effect was substantially weaker than is typically observed for dichoptic pedestal masking in vision (see Fig. 1c), which can elevate thresholds by around a factor of 3035. The thresholds for the half-binaural condition (orange triangles in Fig. 3a,b), where the pedestal was presented to both ears, but the target only to one ear, were not appreciably different from those for the monaural condition, with thresholds greater than in the binaural condition by a factor of 1.94 on average.

These results can be converted to Weber fractions by dividing the threshold increments by the pedestal modulation depths, for pedestals > 0%. These values are shown for the average data in Fig. 3b. At lower pedestal modulation depths (< 8%), Weber fractions decreased with increasing pedestal level. At pedestal modulations above 8%, the binaural Weber fractions (red squares) plateaued at around 0.25, whereas the monaural and half-binaural Weber fractions (blue circles and orange triangles) plateaued around 0.5. The dichotic Weber fractions (green diamonds) continued to decrease throughout. Thus, non-Weber behaviour occurred over the lower range of pedestal modulation depths, but more traditional Weber-like behaviour was evident at higher pedestal levels. The exception is the dichotic condition, where non-Weber behaviour was evident throughout. A further way to represent these thresholds is to add the pedestal modulation depth to each threshold; the results are presented using this convention in Supplementary Fig. S2.

Overall, this pattern of results is consistent with a weak level of interaural suppression between the left and right ears. This accounts for the lack of convergence of monaural and binaural dipper functions at high pedestal levels, and the relatively minimal threshold elevation in the dichotic masking condition, as we will show in greater detail through computational modelling below. Our second experiment sought to measure modulation response functions directly using steady-state EEG to test whether this weak suppression is also evident in cortical responses.

Direct neural measures of binaural combination

Steady-state EEG signals were evident over central regions of the scalp, at both modulation frequencies tested, and for both monaural and binaural modulations (Fig. 4). In particular, there was no evidence of laterality effects for monaural presentation to one or other ear. We therefore averaged steady-state SNRs across a region-of-interest (ROI) comprising nine fronto-central electrodes (Fz, F1, F2, FCz, FC1, FC2, Cz, C1, C2, highlighted white in Fig. 4) to calculate modulation response functions.

Figure 4

SNRs across the scalp at either 40 Hz (F1, panel a,b) or 35 Hz (F2, panel c). In panels a,c the signal modulation was presented to one ear (averaged across left and right), in panel b the modulation was presented to both ears. Note the log-scaling of the colour map. Dots indicate electrode locations, with those filled white showing the 9 electrodes that comprised the region-of-interest (ROI) used for subsequent analyses.

We conducted separate 6 (condition) x 5 (modulation depth) repeated measures ANOVAs at each modulation frequency using the SNRs averaged across the ROI. At 40 Hz, we found significant main effects of condition (F = 38.83, p < 0.001, ηG2 = 0.54, Greenhouse-Geisser corrected) and modulation depth (F = 33.22, p < 0.001, ηG2 = 0.43, Greenhouse-Geisser corrected), and a significant interaction between the two variables (F = 6.13, p < 0.001, ηG2 = 0.19). At 35 Hz, we also found significant main effects of condition (F = 17.40, p < 0.01, ηG2 = 0.45, Greenhouse-Geisser corrected) and modulation depth (F = 8.72, p < 0.001, ηG2 = 0.07), and a significant interaction between the two variables (F = 5.64, p < 0.001, ηG2 = 0.17, Greenhouse-Geisser corrected). We then performed Bonferroni-corrected post hoc tests to compare specific conditions of interest, which are reported in the following sections.

SNRs are plotted as a function of modulation depth in Fig. 5. For a single modulation frequency (40 Hz), responses increased monotonically with increasing modulation depth, with SNRs >2 evident for modulation depths above 12.5%. One-sample t-tests for both monaural and binaural conditions showed that SNRs were significantly >1 for modulation depths of 25% and above (monaural) and 12.5% and above (binaural) (all p < 0.05). Binaural presentation (red squares in Fig. 5a) led to SNRs of around 6.5 at the highest modulation depth, whereas monaural modulation produced weaker signals (SNR ~ 5; blue circles in Fig. 5a). Paired t-tests comparing monaural and binaural responses were significant at modulation depths of 25% and 50% (all p < 0.05), but not at 6.25%, 12.5% and 100% modulation depths. The finding that the monaural and binaural functions do not converge at high modulation depths suggests that interaural suppression is too weak to fully normalise the response to two inputs compared with one. This is because in models with strong interaural suppression (e.g. Fig. 1b), the extra excitation in the binaural condition is balanced by extra suppression, and response increases are minimal. The significantly higher binaural responses seen here imply that excitation exceeds inhibition.

Figure 5

SNRs expressed as a function of signal modulation depth for six conditions at two frequencies. Shaded regions give ±1SE of the mean across participants (N = 12). The dashed horizontal line in each plot indicates the nominal baseline of SNR = 1.

In the dichotic condition (green diamonds in Fig. 5a), a masker with a fixed 50% modulation depth presented to one ear produced an SNR of 4 when the unmodulated carrier was presented to the other ear (see left-most point). As the dichotic signal modulation increased, responses increased to match the binaural condition at higher signal modulations (red squares and green diamonds converge in Fig. 5a). Post hoc paired t-tests comparing these conditions were significant at the lowest modulation depth (6.25%, p = 0.0001), but not at the four higher modulation depths (all p > 0.05).

When the carrier presented to one ear was modulated at a different frequency (35 Hz), several differences were apparent across the three conditions. Monaural modulation at 35 Hz (the cross-mon condition) evoked no measurable response at 40 Hz, as expected (orange circles in Fig. 5b; all p > 0.05 for comparisons to SNR = 1). At the modulation frequency of 35 Hz, this condition produced a monotonically increasing function peaking around SNR = 3.5 (orange circles in Fig. 5c; t-tests significant at p < 0.01 for modulation depths of 25% and higher). Binaural modulation with different modulation frequencies in each ear led to weaker responses (SNRs of 4 at 40 Hz and 3 at 35 Hz; purple triangles in Fig. 5b,c) than binaural modulation at the same frequency (SNR = 7, red squares in Fig. 5a; paired t-tests comparing the binaural and cross-binaural conditions at 40 Hz were significant at modulation depths of 12.5% and higher, p < 0.05). A 35-Hz AM masker with a fixed 50% modulation depth presented to one ear produced little change in the response to a 40-Hz AM signal in the other ear (grey inverted triangles in Fig. 5b; t-tests comparing the monaural and cross-dichotic conditions at 40 Hz were not significant at any modulation depth, p > 0.05), though increasing the signal modulation depth slightly reduced the neural response to the 35-Hz AM masker (grey inverted triangles in Fig. 5c; t-tests comparing each response to the response at a modulation depth of 6.25% showed a significant reduction for modulation depths of 12.5%, 50% and 100%, p < 0.01). This weak dichotic masking effect is further evidence of weak interaural suppression. We next consider model arrangements that are able to explain these results.

A single model of signal combination predicts psychophysics and EEG results

To further understand our results, we fit the model described by Eq. 1 to both data sets. To fit the psychophysical data, we calculated the model response to the pedestal, and then determined the target modulation depth that was necessary to increase this response by a fixed value, σ40, which was a fifth free parameter in the model (the other four free parameters being p, q, Z and ω; note that all parameters were constrained to be positive, q was constrained to always be greater than 2 to ensure that the nonlinearity was strong enough to produce a dip, and we ensured that p > q). With five free parameters, the data were described extremely well (see Fig. 6a), with a root mean square error (RMSE, calculated as the square root of the mean squared error between model and data points across all conditions displayed in a figure panel) of 1.2 dB, which compares favourably to equivalent model fits in vision experiments35. However, the value of the interaural suppression parameter was much less than 1 (ω = 0.02, see Table 1). This weak interaural suppression changes the behaviour of the canonical model shown in Fig. 1c in two important ways, both of which are consistent with our empirical results. First, the degree of threshold elevation in the dichotic condition is much weaker, as is clear in the data (green diamonds in Figs. 3a and 6a). Second, the thresholds in the monaural condition are consistently higher than those in the binaural condition, even at high pedestal levels (compare blue circles and red squares in Fig. 3a and 6a).
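A sketch of the fitting procedure under these constraints is given below. The reparameterisation enforcing q > 2 and p > q is our own device, and predicted_threshold stands for a hypothetical generalisation of the root-finding routine sketched in the Introduction, extended to cover all four interaural arrangements:

```python
import numpy as np
from scipy.optimize import minimize

def fit_error(theta, data):
    """RMSE in dB between model and observed thresholds (sketch).

    data : list of (condition, pedestal, observed_threshold) tuples.
    """
    d_pq, d_q2, Z, omega, sigma = np.abs(theta)   # keep all parameters positive
    q = 2.0 + d_q2                                # enforce q > 2
    p = q + d_pq                                  # enforce p > q
    pred = np.array([predicted_threshold(cond, ped, p, q, Z, omega, sigma)
                     for cond, ped, _ in data])   # hypothetical helper
    obs = np.array([thr for _, _, thr in data])
    db = lambda x: 20.0 * np.log10(x)
    return np.sqrt(np.mean((db(pred) - db(obs)) ** 2))

# e.g. result = minimize(fit_error, x0=np.ones(5), args=(data,),
#                        method="Nelder-Mead")
```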

Figure 6

Fits of several variants of the signal combination model (curves) to empirical data (symbols). Model fits in panels (a,d) had all parameters free to vary. Model curves in panels (b,d) retained the fitted parameters from (a,d), but altered the weight of interaural suppression to ω = 1. Model fits in panels (c,f) had all parameters free to vary apart from the weight of interaural suppression, which was fixed at ω = 1. The data in the top panels are the averaged dipper functions duplicated from panel 3a, and those in the lower row are collapsed across the three panels of Fig. 5 (omitting the mon-cross condition). Values in the lower right of each plot give the root mean square error (RMSE) of the fit in logarithmic (dB) units in the upper panels, and in units of SNR in the lower panels.

Figure 7

Fits of a linear summation model to the data of both experiments. Plotting conventions are as for Fig. 6, and parameters are given in the final two rows of Table 1.

Table 1 Parameters for the model fits shown in Figs. 6 and 7 with parameter constraints as described in the text.

To illustrate how the model behaves with stronger interaural suppression, we increased the weight to a value of ω = 1, but left the other parameters fixed at the values from the previous fit. This manipulation (shown in Fig. 6b) reversed the changes caused by the weaker suppression – masking became stronger in the dichotic condition, and the monaural and binaural dipper functions converged at the higher pedestal levels. These changes provided a poorer description of human discrimination performance, with the RMSE increasing from 1.2 dB to 5.5 dB. Finally, we held suppression constant (at ω = 1), but permitted the other four parameters to vary in the fit. This somewhat improved the fit (see Fig. 6c), but retained the qualitative shortcomings associated with strong interaural suppression, and only slightly improved the RMSE (from 5.5 dB to 4.3 dB).

To fit the EEG data, we converted the model response to an SNR by adding the noise parameter (σ) to the model response, and then scaling by the noise parameter (i.e. (resp + σ)/σ). Because maximum SNRs varied slightly across the two modulation frequencies (40 and 35 Hz, see Fig. 5), we permitted this noise parameter to take a different value at each frequency (σ40 and σ35). Model predictions for the conditions described in Fig. 2b are shown in Fig. 6d for a version of the model with six free parameters. This produced an excellent fit (comparable to those for visual signals36), which included the main qualitative features of the empirical amplitude response functions, with an RMSE of 0.34. The model captures the increased response to binaural modulations compared with monaural modulations (blue circles vs red squares in Fig. 6d), the relatively modest suppression in the cross-bin (purple triangles) and cross-dichotic (grey triangles) conditions at 40 Hz relative to the monaural condition, and the gentle decline in SNR in the cross-dichotic condition at the masker frequency (black triangles in Fig. 6d). Most parameters took on comparable values to those for the dipper function fits described above (see Table 1). Of particular note, the weight of interaural suppression remained weak (ω = 0.14).
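The SNR conversion is then a one-line transform of the model response (sketch, reusing model_response from above; sigma_f stands for σ40 or σ35 depending on the frequency being predicted):

```python
def model_snr(CL, CR, sigma_f, **params):
    """EEG-style SNR from the model response (sketch):
    add the noise parameter, then divide by it."""
    return (model_response(CL, CR, **params) + sigma_f) / sigma_f
```

Note that an unmodulated condition (resp = 0) maps onto the baseline SNR of 1 (the dashed line in Fig. 5).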

We again explored the effect of increasing the weight of suppression (to ω = 1) whilst keeping the other parameters unchanged. This resulted in a reduction of amplitudes in the binaural and cross-binaural conditions, which worsened the fit (to an RMSE of 0.96). Permitting all other parameters (apart from ω) to vary freely improved the fit (to RMSE = 0.51), but there were still numerous shortcomings. In particular, the monaural and binaural response functions were more similar than in the data, and the reduction in SNR in the cross-binaural and cross-dichotic conditions was more extensive than found empirically.

Our modelling of the data from two experimental paradigms therefore supports the empirical finding that interaural suppression is relatively weak (by around an order of magnitude) compared with analogous interocular suppression phenomena in vision35,36.

Discussion

We have presented converging evidence from two experimental paradigms (psychophysics and steady-state EEG) concerning the functional architecture of the human binaural auditory system. A single computational model, in which signals from the two ears inhibit each other weakly before being combined, provided the best description of data sets from both experiments. This model architecture originates from work on binocular vision, showing a commonality between these two sensory systems. We now discuss these results in the context of related empirical results, previous binaural models, and ecological constraints that differentially affect vision and hearing.

A unified framework for understanding binaural processing

Our psychophysical experiment replicates the classical finding1,6,7,8,9 of approximately 3 dB of binaural summation at detection threshold (here 4.47 dB) for amplitude-modulated stimuli. This is very similar to values previously reported for binocular summation of contrast in the visual system, where summation ratios of 3–6 dB are typical42. Above threshold, this difference persisted, with monaural stimulation producing higher discrimination thresholds (Fig. 3) and weaker EEG responses (Figs. 4, 5) than binaural stimulation. This is consistent with previous EEG work17,22, and also the finding that perceived loudness and modulation depth are higher for binaural than monaural presentation11,12,13,14,28. However, these auditory effects are dramatically different from the visual domain, where both discrimination performance and perceived contrast are largely independent of the number of eyes stimulated34. We discuss possible reasons for this modality difference below.

Suppression between the ears has been measured previously with steady-state EEG and MEG (magnetoencephalography) using amplitude-modulated stimuli with frequencies that are the same25,43 or different23,24,25 in the left and right ears. When the same frequency is used in both ears, suppression can be assessed by comparing binaural responses to the linear sum of two monaural responses. When the measured binaural response is weaker than this prediction, this is taken as evidence of suppression between the ears (though we note that nonlinear transduction might produce similar effects). Tiihonen et al.43 used 500 ms click trains at 40 Hz, and found evidence for strong suppression of the initial evoked N100 amplitudes, but weaker suppression of the 40 Hz response (especially relative to ipsilateral stimuli). If suppression decreased even further for longer presentations (as used here), this might explain why suppression appears so weak in our study. Alternatively, the Tiihonen study used laterally placed MEG sensors to record signals from auditory cortex, whereas we used EEG with a central region of interest, which might also account for the differences. Of course a strong signal at the vertex does not mean that the signals are coming from adjacent cortex – source localisation studies have shown lateralised generators in auditory cortex15,44, but because of cortical folding the signals are strongest at the vertex using EEG (at least when using a whole head or earlobe reference; other studies that reference to Cz show greater activity at parietal-temporal and occipital electrodes22,25).

Other studies23,24,25 used different frequency tags in the two ears in conditions analogous to our cross-binaural and cross-dichotic conditions. For tag frequencies around 20 Hz, there were varying amounts of suppression, between 36% and 72% of the monaural response, depending on whether signals were measured from the left or right hemisphere, and whether they were for ipsilateral or contralateral presentations23. Two other studies24,25 used frequencies around 40 Hz, and again found a range of suppression strengths depending on laterality and hemisphere. The weakest suppressive effects were comparable to those measured here using steady-state EEG (see Fig. 5b,c). It is possible that different stages of processing might involve different amounts of suppression, which would require the use of techniques with better spatial precision to localise responses of interest to specific brain regions.

Another widely-studied phenomenon that might involve suppression between the ears is the binaural masking level difference (BMLD45). In this paradigm, a signal embedded in noise is detected more easily when either the signal or the noise in one ear is inverted in phase46. Contemporary explanations of this effect47,48 invoke cross-correlation of binaural signals, but lack explicit inhibition between masker and test signals. However, more elaborate versions of the model described here include mechanisms tuned to opposite spatial phases of sine-wave grating stimuli49, and a similar approach in the temporal domain might be capable of predicting BMLD effects. Alternatively, since the BMLD phenomenon often involves segmentation of target and masker, it might be more akin to ‘unmasking’ effects that occur in vision when stimuli are presented in different depth planes50,51. Modelling such effects would likely require additional mechanisms representing different spatial locations, far beyond the scope of the architecture proposed here.

It is worth at this point making a brief statement regarding the terminology used throughout the literature. We have referred to a condition in which a modulated carrier is presented to one ear and an unmodulated carrier to the other ear as a ‘monaural’ condition. Our rationale here is that the amplitude modulation is the signal, and we are interested in whether this signal is presented to one or both ears. Presenting the carrier to the other ear keeps the arrangement balanced, and avoids confounding the number of ears receiving the modulator with the number of ears receiving the carrier. However, other studies25 have referred to this condition as ‘dichotic’, on the basis that the unmodulated carrier is a type of signal. In these studies, the monaural condition involves presenting silence to the unstimulated ear. These arrangements do produce different levels of activity, with steady-state responses being larger in the condition where the unmodulated ear receives no carrier25. Because our model does not explicitly represent the carrier, these differences are not predicted. However, they could in principle be modelled by adjusting the saturation constant (the Z parameter in Eq. 1) to reflect a suppressive contribution from the carrier in the opposite ear. In vision, this is analogous to ‘zero-frequency’ dichoptic masking from the mean luminance of the display, which can also have a weak suppressive effect52. Ultimately, we think the precise choice of terminology is less important than ensuring a well-balanced experimental design, with clearly defined conditions (see Fig. 2).

The model shares features with previous binaural models

Previous models of binaural processing53,54,55 have some architectural similarities to the model shown in Fig. 1a. For example, binaural inhibition is a common feature55, often occurring across multiple timescales53. However these models are typically designed with a focus on explaining perception across a range of frequencies (and for inputs of arbitrary frequency content), rather than attempting to understand performance on specific tasks (i.e. AM depth discrimination) or the precise mapping between stimulus and cortical response (i.e. the amplitude response functions measured using steady-state EEG). At threshold, one model54 predicts minimal levels of binaural summation (~1 dB) in line with probabilistic combination of inputs but below that found experimentally. These models would therefore likely require modification (i.e. the inclusion of physiological summation and early nonlinearities) to explain the data here, though it is possible that such modifications could be successful, given the other similarities between the models.

Several previous neural models of binaural processing have focussed on excitatory and inhibitory processes of neurons in subcortical auditory structures such as the lateral superior olive. These models (reviewed in47) are concerned with lateralised processing, in which interaural interactions are purely inhibitory, and so do not typically feature excitatory summation. However, models of inferior colliculus neurons do typically involve binaural summation, and have the same basic structure as the architecture shown in Fig. 1a. In general these models are designed to explain responses to binaurally asynchronous stimuli (where stimuli reach the two ears at different times), and so typically feature asymmetric delays across the excitatory and inhibitory inputs from the two ears56. Since a time delay is not a critical component of the divisive suppression on the denominator of Eq. 1, and because a mechanism with broad temporal tuning is equivalent to the envelope of many mechanisms with different delays, the architecture proposed here can be considered a generalised case of such models.

A linear summation model gives a poorer fit to both data sets

A more straightforward approach than the model we test above is to sum amplitude modulation signals linearly across the ears. This approach has been successful in previous studies on the perception of amplitude-modulated tones28, and has some plausibility given the high levels of binaural summation found at threshold in our psychophysical results (a summation ratio of 1.67). We attempted to fit a model incorporating linear summation, followed by a (binaural) nonlinear transducer as follows:

$$resp=\frac{(C_{L}+C_{R})^{p}}{Z^{q}+(C_{L}+C_{R})^{q}}$$
(3)

with all terms consistent with those in Eq. 1, and fitted parameters given in Table 1 (note this model lacks inhibition between the ears, so the weight parameter ω is omitted). For the discrimination data (see Fig. 7a and Supplementary Fig. S2d), the monaural and dichotic conditions become equivalent in this model (since the ear that pedestal and target are presented to is irrelevant). The linear summation model predicts that the monaural dip (blue curve) is shifted rightwards compared to the binaural and half-binaural dips (red and orange curves), which does not occur in the data. Furthermore, a substantial dipper is predicted for the dichotic condition (green dashed curve), which is also not consistent with our empirical results. Overall, the linear summation model gives a poorer numerical fit (RMS error of 2.4 dB) than the signal combination model (RMS error of 1.2 dB) for the dipper data. The shortcomings for modelling the EEG data are more subtle (Fig. 7b): the model does quite a good job of describing the monaural, binaural and dichotic conditions, and the masking effects in the cross conditions are weak in the data and absent in the model. Nevertheless, the fit was numerically poorer (RMS error of 0.48, vs 0.34 for the signal combination model), leading us to conclude that the signal combination model provides a better description of both data sets.
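For comparison with the sketch of Eq. 1 given in the Introduction, the linear summation model of Eq. 3 is:

```python
def linear_sum_response(CL, CR, p=2.4, q=2.0, Z=10.0):
    """Linear binaural summation then a nonlinear transducer (Eq. 3,
    sketch): the ears' signals add before gain control, and there is
    no interaural suppression term (omega is absent)."""
    S = CL + CR
    return S**p / (Z**q + S**q)
```

Because only the sum CL + CR enters the equation, monaural and dichotic arrangements of the same total modulation are indistinguishable to this model, which is why those conditions' predictions coincide in Fig. 7a.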

Ecological constraints on vision and hearing

This study reveals an important and striking difference between hearing and vision – suppression between the ears is far weaker than suppression between the eyes. Why should this be so? In the visual domain, the brain attempts to construct a unitary percept of the visual environment from two overlapping inputs, termed binocular single vision. For weak signals (at detection threshold) it is beneficial to sum the two inputs to improve the signal-to-noise ratio. But above threshold, there is no advantage for a visual object to appear more intense when viewed with two eyes compared with one. The strong interocular suppression prevents this from occurring by normalising the signals from the left and right eyes to achieve ‘ocularity invariance’ – the constancy of perception through one or both eyes34. The guiding principle here may be that the brain is reducing redundancy in the sensory representation by avoiding multiple representations of a single object.

In the human auditory system, the ears are placed laterally, maximising the disparity between the signals received (and minimising overlap). This confers benefits when determining the location of lateralised sound sources, though localisation of pure-tone sources at the midline (i.e. directly in front or behind) is very poor2. Hearing a sound through both ears at once therefore does not necessarily provide information that it comes from a single object, and so the principle of invariance should not be applied (and interaural suppression should be weak). However, other cues that are consistent with a single auditory object (for example, interaural time and level differences consistent with a common location) should result in strong suppression to reduce redundant representations, and cues that signals come from multiple auditory objects should release that suppression. This is the essence of the BMLD effects discussed above – suppression is strongest when target and masker have the same phase offsets (consistent with a common source), and weakest when their phase offsets differ. The distinct constraints placed on the visual and auditory systems therefore result in different requirements, which are implemented in a common architecture by changing the weight of suppression between channels.

Conclusions

A combination of psychophysical and electrophysiological experiments and computational modelling has converged on an architecture for the binaural summation of amplitude-modulated tones. This architecture is identical to the way that visual signals are combined across the eyes, with the exception that the weight of suppression between the ears is weaker than that between the eyes. This likely reflects the ecological constraints governing suppression, which acts to prevent signals arising from a common source being over-represented. Such a high level of consistency across sensory modalities is unusual, and illustrates how the brain can adapt generic neural circuits to meet the demands of a specific situation.