Visual modulation of auditory evoked potentials in the cat

Visual modulation of the auditory system is not only a neural substrate for multisensory processing, but also serves as a backup input underlying cross-modal plasticity in deaf individuals. Event-related potential (ERP) studies in humans have provided evidence of multiple-stage audiovisual interactions, ranging from tens to hundreds of milliseconds after the presentation of stimuli. However, it is still unknown whether the temporal course of visual modulation in the auditory ERPs can be characterized in animal models. EEG signals were recorded in sedated cats from subdermal needle electrodes. The auditory stimuli (clicks) and visual stimuli (flashes) were timed by two independent Poisson processes and were presented either simultaneously or alone. The visual-only ERPs were subtracted from audiovisual ERPs before being compared to the auditory-only ERPs. N1 amplitude showed a trend of transitioning from suppression to facilitation, with a disruption at ~100-ms flash-to-click delay. We concluded that visual modulation as a function of stimulus onset asynchrony (SOA) over an extended range is more complex than previously characterized with short SOAs, and that its periodic pattern can be interpreted with the “phase resetting” hypothesis.


Xiaohan Bao 1 & Stephen G. Lomber 2*
It has been almost unanimously agreed that the cross-modal timing between two stimuli plays a key role in multisensory processing 1,2 (see Koelewijn 2 for a review). An audiovisual disparity, or stimulus onset asynchrony (SOA), of ~100 ms can substantially impede the perception of simultaneity 3-5 and provide sufficient information for temporal order judgement 6,7. The improvement in perceptual performance (e.g., reaction time or accuracy) gained by adding a stimulus from a second modality also diminishes with increasing audiovisual SOA 8-12. Such time sensitivity indicates that complex neural circuits, which are not yet fully understood, are involved in audiovisual interactions and potentially in cross-modal plasticity after hearing loss.
The range of SOAs up to 100 ms, to which cross-modal temporal processing (simultaneity and temporal order judgement) is sensitive 13-15, has been studied in human ERP and MEG experiments 16-22. We refer to SOAs in this range as "short SOAs"; both types of studies have shown that short SOAs can modulate the multisensory component of ERP activity. However, longer SOAs have not been extensively studied in these human ERP experiments.
Using extracellular recording or behavioral measurements, a few investigations have shed some light on the effect of long SOAs in multisensory processing. In macaque primary auditory cortex, Lakatos et al. 23 showed that neuronal activity evoked by a click was modulated by a preceding tactile stimulus at SOAs of up to about 800 ms. Fiebelkorn et al. 24 measured fluctuations in behavioral performance in detecting a near-threshold Gabor stimulus following a tone beep at SOAs of up to 6 s. The findings of both studies imply that the effect of long SOAs on multisensory interaction is due to oscillations in cortical excitability that are phase-locked to the preceding stimulus. This would contradict the evoked model 25, in which neural activity evoked by the preceding stimulus may have a more limited effective period. Thus, we hypothesized that, in auditory ERPs, cross-modal modulation originating from a visual input should also occur with audiovisual temporal disparity beyond the range sensitive for multisensory temporal processing, where a periodic pattern of fluctuation may be observed.
The existing ERP studies on the temporal disparity of audiovisual integration provide very limited information specific to long SOAs and their spectral patterns 26-32. To fill this research gap, the current study aimed to provide evidence of how cat ERPs in response to auditory (click) and visual (flash) stimuli interact at audiovisual SOAs of up to 1 s (Fig. 1). We found that the amplitude of N1 from cortical auditory evoked potentials (cAEPs) in cats under dexmedetomidine sedation was affected by audiovisual SOA. The change in N1 amplitude as a function of SOA revealed that the temporal dynamics of visual modulation follow an oscillatory pattern.

Results
Cats under dexmedetomidine sedation were presented with 1-min trains of clicks (auditory, A), flashes (visual, V), and unsynchronized clicks and flashes (audiovisual, AV) (Fig. 1a). An offline bandpass filter (1-10 Hz) was then applied to obtain cortical auditory evoked potentials (cAEPs). First, we extracted epochs time-locked to click onsets from all three stimulus conditions. The grand-averaged waveforms derived from both the AV and the A conditions revealed clear cAEPs, whereas the waveform from the V condition did not (Fig. 2a). The flash stimuli did not seem to influence the grand-averaged waveforms of click cAEPs, because flash and click stimuli were out of sync. This, however, may not be the case when specific flash-to-click delays are investigated. Therefore, EEG signals from the V condition were subtracted from the corresponding AV condition in each of the 10 repeats, generating an AV-V condition (Fig. 1b). For further data analysis, epochs were extracted from the derived AV-V and the original A conditions, respectively, for waveform averaging and peak measurements (Fig. 2b).
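The subtraction and epoch-averaging steps described above can be illustrated with a minimal sketch. It assumes that each recording is a plain list of samples, that click onsets are given as sample indices, and that the AV and V recordings of a repeat are aligned sample by sample; the function names, the toy signals, and the epoch window are hypothetical and do not reproduce the actual analysis pipeline.

```python
def subtract_conditions(av, v):
    """Sample-by-sample AV - V for one repeat (two equal-length recordings)."""
    if len(av) != len(v):
        raise ValueError("AV and V recordings must have equal length")
    return [a - b for a, b in zip(av, v)]


def average_epochs(signal, onsets, pre, post):
    """Average epochs spanning [onset - pre, onset + post) samples.

    Epochs that would run past either end of the recording are skipped.
    """
    epochs = [signal[t - pre:t + post] for t in onsets
              if t - pre >= 0 and t + post <= len(signal)]
    if not epochs:
        raise ValueError("no complete epoch available")
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


# Toy data (hypothetical): V holds a "flash response" at sample 3; AV holds
# the same flash response plus a unit "click response" at samples 5 and 15.
v = [0.0] * 20
v[3] = 2.0
av = list(v)
av[5] += 1.0
av[15] += 1.0

av_minus_v = subtract_conditions(av, v)          # flash response cancels
caep = average_epochs(av_minus_v, onsets=[5, 15], pre=2, post=3)
```

In this toy case the flash-evoked deflection is removed by the subtraction, so the averaged epoch contains only the click-locked component.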
Data were collected from 14 cat subjects. Regardless of the flash-to-click delay, each subject was presented with 370 clicks in each of 10 repeats, so that each individual cAEP waveform was an average of 3700 epochs (Supplementary Fig. 1a). The cAEP waveforms from both conditions featured a prominent positive peak component at about 35-ms latency, which we referred to as P1, followed by a slower and wider negative peak component at about 95-ms latency post-click, which we referred to as N1 (Fig. 2b). A second positive peak component, less prominent than P1, was present at about 170-ms latency, which we referred to as P2. From the grand-average waveforms, we observed a near-perfect overlap between the AV-V and the A conditions, especially for the initial 125 ms after click onset, suggesting a well-preserved cAEP morphology when unsynchronized visual stimuli were simultaneously present. There appeared to be an elevation of the traces starting at 150 ms after click onset in the AV-V condition.
The P1-N1-P2 complex was observed in all subjects. The amplitudes and latencies of each of the three peak components were measured (Table 1 and Supplementary Fig. 1a). Only P2 amplitude was significantly larger in the AV-V condition than in the A condition (Δamp P2 = 0.14, p = 0.007 < 0.01). Later analysis showed that recordings from three subjects were noisier than the rest. This became more apparent during peak identification, when cAEPs were analyzed in separate click groups according to the flash-to-click delay (Supplementary Fig. 1b). Excluding these three subjects, however, did not change the result of the comparisons between the AV-V and the A conditions above (Δamp P2 = 0.18, p = 0.005 < 0.01). Although P2 amplitude demonstrated an effect of visual modulation that did not depend on the timing between flash and click stimuli, the P1 and N1 components, as well as P2 latency, did not, which is consistent with the existing knowledge that a visual stimulus that is out of time with the auditory stimulus does not affect auditory processing 33. To investigate how stimulus timing plays a role in the effect of visual modulation, we focused on N1 amplitude as the major measurement in the following data analysis.

The effect of flash-to-click delay on visual modulation of cAEPs
To examine the relationship between audiovisual temporal disparity and visual modulation of auditory processing, we sorted all the click stimuli by their flash-to-click delays (Fig. 1c). At first, we created 8 groups with an equal number of clicks in each group. In this case, the first click group was composed of clicks with a flash-to-click delay between 0 and 79 ms, while the last group was composed of clicks with a flash-to-click delay between 894 and 1731 ms. Detailed descriptive statistics on the flash-to-click delays are listed in Table 2. Next, the cAEP waveforms were derived from each of the 8 click groups (Supplementary Fig. 1b), so that the contrast between the A and the AV-V conditions for each click group represents the cortical processing of click stimuli under the influence of visual modulation within a specific window of audiovisual temporal disparity. We first compared the range of N1 amplitudes across the 8 click groups. The range of N1 amplitude across the 8 click groups appeared larger in the AV-V condition than in the A condition (Supplementary Fig. 1c), although this difference was not statistically significant.
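The sorting and equal-count grouping of clicks can be sketched as follows. This is only an illustration under stated assumptions: the flash-to-click delay is taken here to be the time from the most recent preceding flash to each click, and the function names and toy onset times are hypothetical.

```python
import bisect
import statistics


def flash_to_click_delays(click_times, flash_times):
    """Delay from the most recent flash to each click (same time unit).

    Assumption of this sketch: clicks that occur before the first flash
    have no preceding flash and are returned as None.
    """
    flashes = sorted(flash_times)
    delays = []
    for t in click_times:
        i = bisect.bisect_right(flashes, t)      # flashes[:i] are <= t
        delays.append(t - flashes[i - 1] if i > 0 else None)
    return delays


def equal_count_groups(delays, n_groups):
    """Sort click indices by delay and split them into n_groups bins
    whose sizes differ by at most one."""
    order = sorted((i for i, d in enumerate(delays) if d is not None),
                   key=lambda i: delays[i])
    base, extra = divmod(len(order), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        groups.append(order[start:start + size])
        start += size
    return groups


clicks = [100, 250, 400, 900, 1300, 1500]        # ms, hypothetical onsets
flashes = [90, 200, 350, 600, 1000]              # ms, hypothetical onsets
delays = flash_to_click_delays(clicks, flashes)
groups = equal_count_groups(delays, n_groups=3)
# Median delay of each group, usable as the group's horizontal coordinate.
medians = [statistics.median(delays[i] for i in g) for g in groups]
```

Each entry of `groups` holds the indices of the clicks whose epochs would be averaged together to form the cAEP waveform for that delay window.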
Next, a one-way repeated-measures ANOVA was performed to test the statistical effect of click group on the change in N1 amplitude (Δamp N1) against the variance across subjects. We found a significant main effect of click group (F(10, 70) = 2.72, p = 0.015 < 0.05). Given the small sample size, we also carried out a permutation test, in which the correspondence between the click groups and the Δamp N1 values was randomly scrambled for each subject independently. This allowed us to determine a false discovery rate of 1.0% when accepting 0.015 as the alpha level.
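The within-subject scrambling described above can be illustrated with a small permutation sketch. It assumes the data are arranged as one row of Δamp N1 values per subject and one column per click group; it returns a permutation p-value for the repeated-measures F statistic, which is a stand-in for, and not a reproduction of, the false-discovery-rate calculation reported here. All names and the toy data are hypothetical.

```python
import random


def rm_anova_f(data):
    """One-way repeated-measures ANOVA F statistic.

    data[s][g] is the measurement for subject s in condition (group) g.
    """
    n, k = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n * k)
    cond_means = [sum(row[g] for row in data) / n for g in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_err = sum((data[s][g] - cond_means[g] - subj_means[s] + grand) ** 2
                 for s in range(n) for g in range(k))
    return (ss_cond / (k - 1)) / (ss_err / ((n - 1) * (k - 1)))


def permutation_p(data, n_perm=2000, seed=0):
    """Permutation p-value: group labels are shuffled within each subject
    independently, and shuffles whose F reaches the observed F are counted
    (with the usual +1 correction in numerator and denominator)."""
    rng = random.Random(seed)
    observed = rm_anova_f(data)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(row, len(row)) for row in data]
        if rm_anova_f(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): 6 subjects x 4 groups, with an effect in group 0,
# a per-subject offset, and small Gaussian noise.
noise = random.Random(1)
toy = [[(1.0 if g == 0 else 0.0) + 0.5 * s + noise.gauss(0, 0.2)
        for g in range(4)] for s in range(6)]
p_value = permutation_p(toy)
```

Shuffling within each subject keeps the between-subject variance intact, so only the association between click group and Δamp N1 is broken under the null.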
To further identify the specific click groups that demonstrated delay-dependent visual modulation, we performed Wilcoxon signed-rank tests in each of the 8 click groups, comparing Δamp N1 with either 0 (i.e., assuming no visual modulation at all as the null hypothesis) or the Δamp N1 derived from each subject without click grouping (i.e., assuming no delay dependency as the null hypothesis). In both approaches, a significant suppression of N1 amplitudes, as indicated by a positive Δamp N1, was found for the 34-ms click group and the 198-ms click group (Fig. 3). Again, we used the same permutation procedure described above to confirm that accepting both positive findings (34-ms: p = 0.008 < 0.01; 198-ms: p = 0.013 < 0.05) yielded an accumulated false discovery rate of 0.3% when Δamp N1 values were compared to zero. The other click groups failed to reveal a statistically significant visual modulation, suggesting that visual modulation in those ranges of audiovisual temporal disparity was less consistent across subjects. We also explored other numbers of click groups (from 2 to 12) and found that the dependence of visual modulation of N1 amplitude on audiovisual temporal disparity was consistently observed with 7-, 8-, 9-, 10-, and 11-bin grouping of clicks (Supplementary Figs. 2 and 3).
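For readers unfamiliar with the signed-rank test, the comparison of Δamp N1 against zero within one click group can be sketched with an exact, enumeration-based version of the test. This is an illustrative implementation that assumes a small number of subjects; it is not the statistical software used in the study, and the example values are hypothetical.

```python
from itertools import product


def signed_rank_exact(diffs):
    """Two-sided exact Wilcoxon signed-rank test of diffs against zero.

    Returns (W_plus, p_value). Zero differences are dropped and tied
    magnitudes receive average ranks. All 2**n sign patterns are
    enumerated, so this is only practical for small n.
    """
    d = [x for x in diffs if x != 0]
    n = len(d)
    if n == 0:
        raise ValueError("all differences are zero")
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1               # mean of ranks i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    centre = sum(ranks) / 2                      # expected W+ under the null
    observed_dev = abs(w_plus - centre)
    extreme = sum(
        1 for signs in product((0, 1), repeat=n)
        if abs(sum(r for r, s in zip(ranks, signs) if s) - centre)
        >= observed_dev - 1e-12
    )
    return w_plus, extreme / 2 ** n


# Hypothetical per-subject changes in N1 amplitude for one click group.
delta_amp = [0.31, 0.12, -0.05, 0.40, 0.22, 0.27, 0.15, -0.02]
w_plus, p_value = signed_rank_exact(delta_amp)
```

To test against the ungrouped Δamp N1 instead of zero, the same function could be applied to the per-subject differences between the grouped and ungrouped values.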

Visual modulation of N1 amplitude predicted by audiovisual temporal disparity
Finally, we adapted a kernel regression procedure to weight each of the 370 click epochs in order to predict the cAEP waveform specific to a given audiovisual temporal disparity (audiovisual SOA); we term this a Gaussian-weight averaging approach (Supplementary Fig. 4). For any given SOA, epochs were averaged with weights derived from a Gaussian kernel centered at this SOA. The bandwidth of the Gaussian kernel was controlled by the parameter σ, which was set to 100, 50, 20, 10, or 5 ms (Fig. 4a-e) to explore the trade-off between bias and variance of the prediction. As before, N1 amplitudes were measured and contrasted between the A and the AV-V conditions. The temporal course of visual modulation of N1 amplitude can be characterized by directly plotting Δamp N1 as a function of audiovisual SOA (Fig. 4a-e, Left). The scarcity of clicks with long flash-to-click delays added variance to the prediction near the end of the evaluated SOA range. To reduce this interference, we calculated the proportion of permutation-derived Δamp N1 values that were smaller than the original Δamp N1, using 1000 permutations in which the flash-to-click delays were randomly reassigned to the 370 click epochs (Fig. 4a-e, Right). Additionally, to monitor the quality of peak detection, N1 latency was measured at the same time.

Table 2. Descriptive statistics about the flash-to-click delays in each of the eight click groups.
Using the kernels with a large bandwidth (σ > 20 ms), we observed an overall transition from visual suppression to facilitation of N1 amplitude at ~300-ms SOA (Fig. 4a-c). Using the kernels with a smaller bandwidth, an early and transient facilitation could be identified at ~100-ms SOA (Fig. 4c-e). These temporal dynamics were also partially captured by the earlier analysis in which the clicks were grouped in discrete bins. Furthermore, strong visual modulation of N1 amplitude was also revealed at multiple SOAs, such as 300 and 400 ms, when the kernels with a small bandwidth were used (Fig. 4d), suggesting multiple temporal integration windows for audiovisual interaction.
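The Gaussian-weight averaging used in this section can be written as a weighted mean: for a target SOA $s$ and bandwidth σ, the epoch of click $i$ with flash-to-click delay $d_i$ receives the weight $w_i = \exp(-(d_i - s)^2 / 2\sigma^2)$, and the predicted waveform is $\sum_i w_i x_i(t) / \sum_i w_i$. The sketch below assumes this unnormalized Gaussian kernel form and uses hypothetical names and toy data; it illustrates the idea and is not the study's analysis code.

```python
import math


def gaussian_weighted_average(epochs, delays, soa, sigma):
    """Weighted mean waveform for a target SOA.

    epochs: list of equal-length waveforms, one per click.
    delays: flash-to-click delay of each click (same unit as soa, sigma).
    Each epoch is weighted by a Gaussian kernel centred on `soa`.
    """
    weights = [math.exp(-0.5 * ((d - soa) / sigma) ** 2) for d in delays]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no click lies close enough to the requested SOA")
    return [sum(w * e[t] for w, e in zip(weights, epochs)) / total
            for t in range(len(epochs[0]))]


# Toy data (hypothetical): two two-sample epochs with delays of 100 and 300 ms.
epochs = [[0.0, 1.0], [0.0, 3.0]]
delays = [100.0, 300.0]

midway = gaussian_weighted_average(epochs, delays, soa=200.0, sigma=50.0)
near_first = gaussian_weighted_average(epochs, delays, soa=100.0, sigma=20.0)

# Sweep the target SOA to trace the second sample as a function of SOA.
curve = [gaussian_weighted_average(epochs, delays, float(s), 50.0)[1]
         for s in range(100, 301, 50)]
```

A larger σ pools more epochs around each SOA (lower variance, more smoothing), while a smaller σ follows the delay axis more closely at the cost of noisier estimates, which is the bias-variance trade-off mentioned above.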

Discussion
In this study, we examined and demonstrated the effect of audiovisual temporal disparity, or stimulus onset asynchrony (SOA), on visual modulation of cortical auditory evoked potentials (cAEPs). Audiovisual interaction was investigated using similar approaches in two previous human ERP studies, with SOAs below 100 ms 17 and 70 ms 16, respectively. A few studies using extracellular recordings examined SOAs up to 500 ms in the superior colliculus 1 and 320 ms in auditory cortices 34. These studies identified neural correlates of the behaviorally measured "temporal window of integration", providing strong evidence for a "coincidence detector" as a neurophysiological mechanism 35-37. Long SOAs, although unlikely to be involved in temporal integration or temporal processing (perception of multisensory simultaneity and temporal order), may still allow effective cross-modal modulation of sensory processing. This idea has been supported by both behavioral data 24,38,39 and some neurophysiological evidence 23,40. Lakatos et al. 23 pointed out that the optimal SOAs for tactile modulation of sound-evoked neuronal activity in their data were associated with the periods of several EEG oscillations. According to the "phase reset" hypothesis they proposed, a preceding tactile stimulus resets the phase of ongoing neural oscillations in the primary auditory cortex, which in turn determines the state of fluctuating auditory excitability. When the SOA between the preceding tactile stimulus and the following auditory stimulus is aligned to the high-excitability up-phase of the neural oscillation, the auditory stimulus evokes a larger response than when the tactile-auditory SOA is aligned to the low-excitability low-phase of the neural oscillation. Excitability fluctuation has been further evidenced by various behavioral and electrophysiological measurements, including extracellular recording 34, human ERP 16, phosphenes induced by transcranial magnetic stimulation 40,41, and reaction time 24,38,39. Although our analysis was mainly focused on the prediction of visual modulation by audiovisual temporal disparity, the result did exhibit a pattern of fluctuating suppression/facilitation as SOA increased from 0 to 1000 ms. It is worth noting that neither the auditory nor the visual stimulus in this study was designed as a periodic input. Therefore, the oscillation in visual modulation we observed may reflect an intrinsic property of neural networks.
One of the many missions of future multisensory research is to bring together the knowledge established from extracellular recordings in animal models and from whole-brain imaging in humans. While intracranial recordings in humans are still rare and challenging to obtain, scalp-EEG recordings from large animal models, such as marmoset 42,43 and cat 44-48, are quickly developing as a uniquely useful neurophysiological approach.
Electrical and magnetic mappings of whole-brain activity during audiovisual perception have provided valuable insights into its neural mechanism, involving intra-cortical functional connectivity 49 and topographic re-distribution 26. Human auditory evoked potentials have been well characterized for a variety of components as neural correlates of sound processing at different stages of the ascending auditory pathway 50,51. The current study is the first scalp-recorded EEG multisensory study in animal models and, unusually in the literature, focuses on auditory evoked potentials under visual modulation. We compared ERPs from the auditory-only condition with a condition derived by subtracting the signal of the visual-only condition from the audiovisual condition, rather than comparing the audiovisual condition with a condition derived as the "sum of the auditory and the visual conditions". This allowed us to select peak components time-locked to auditory stimuli, which are expected to be more interpretable in terms of auditory processing.
To summarize, in this study we mainly characterized N1 amplitude in scalp-recorded auditory evoked potentials (AEPs) from cats under dexmedetomidine sedation as a measurement of visual modulation of auditory processing. We found that the delay function, sampled with both a sparse grouping approach and a fine-resolution weighted-average approach, revealed a short-SOA effect peaking at ~100 ms, followed by a long-SOA effect characterizing the time course of visual modulation over a period of ~1 s. With the advantages of our animal models and experimental paradigms, future studies are expected to characterize the spectrotemporal features in normal and sensory-deprived subjects and to identify the neural mechanism underlying cross-modal interactions.

Methods
All procedures were conducted in compliance with the National Research Council's Guide for the Care and Use of Laboratory Animals (8th edition; 2011), the Canadian Council on Animal Care's Guide to the Care and Use of Experimental Animals (1993), and the ARRIVE guidelines. Furthermore, the following procedures were also approved by the Animal Care Committee (DOWB) for the Faculty of Medicine and Health Sciences at McGill University.

Animal preparation and anesthesia protocol
Cats (Felis catus) were obtained from a commercial breeder of animals for biomedical research (Marshall Bioresources). We recorded from 14 cats with an average age of 4.7 ± 1.5 years, two of which were male. After subjects were sedated using dexmedetomidine (0.04 mg/kg, Dexdomitor, Zoetis) injected intramuscularly, the left eye was occluded using a black contact lens so that visual stimuli were presented unilaterally. Phenylephrine (Mydfrin, Alcon) was applied to the right eye to dilate the pupil, and saline drops were used as lubrication. Subjects were placed on a water-circulated heating pad (TP-400, Gaymar). Once vital signs (heart rate and SpO2) were stable, two 15-min recording sessions were carried out while the subject was breathing pure oxygen (Dispomed). At the end of the two recording sessions, data collection terminated in nine subjects and continued in the other five under isoflurane anesthesia for a separate study. The subject's vital signs and electrode impedance were checked between the two sessions. At the end of data collection, the electrodes and contact lens were removed before atipamezole (Antisedan, Zoetis) was administered intramuscularly to facilitate recovery from the dexmedetomidine sedation.

Visual and auditory stimuli
The visual stimuli consisted of flashes that were presented to subjects from a 5-mm-diameter light-emitting diode (LED; ~11 degrees of visual field, DigiKey). The intensity of flash stimuli was calibrated to 10 cd/m² by adjusting

Figure 1. Stimulus paradigm and click grouping based on flash-to-click delays. (a) Three stimulus conditions, audiovisual (AV), auditory only (A) and visual only (V), were presented 10 times to each subject while the EEG signal was continuously recorded. The same click train and flash train were used repeatedly in all three conditions. (b) The EEG signal from the V condition was subtracted from the AV condition in each repeat to generate an AV-V condition. For both the AV-V and A conditions, epochs time-locked to click onsets were extracted and averaged to derive cortical auditory evoked potentials (cAEPs). (c) To investigate the effect of audiovisual temporal disparity, clicks were sorted by flash-to-click delay and grouped into different bins. In this way, cAEP waveforms can be obtained separately from different click groups. Flash-to-click delays overall spanned from 0 to about 1000 ms.

Figure 2. Cortical auditory evoked potentials (cAEPs) from all stimulus conditions. (a) Grand-averaged waveforms of cAEPs in the three stimulus conditions. The epochs were averaged with respect to click onsets. Note that in the visual-only (V) condition, the click onsets were the same as in the auditory-only (A) and the audiovisual (AV) conditions, even though no click was presented. (b) Contrast of cAEP waveforms between the A and the AV-V conditions. Inset, an enlarged view of the waveform near the click onset and the baseline between the two vertical lines (from 5 ms before to 5 ms after click onsets).

Figure 4. Visual modulation of N1 amplitude depends on audiovisual temporal disparity. (a-e) For kernels with different bandwidths (σ), change in N1 amplitude as predicted by audiovisual SOA, derived from Gaussian-weight averaging of cAEPs. Left, the original Δamp N1. Right, proportion of permutation-derived Δamp N1 smaller than the original Δamp N1. Dotted line, peak detection with large variance, indicated by latency beyond 150 ms or less than 55 ms.

Table 1. Amplitudes and peak times of P1-N1-P2 complex in individual subjects.

Figure 3. Effect of audiovisual temporal disparity on visual modulation of N1 amplitude. Median of change in N1 amplitude for each of the 8 click groups. The medians of flash-to-click delays were used as horizontal coordinates. Error bar, half of the inter-quartile range across subjects. Red dashed line, the null hypothesis with no visual modulation. Blue error bar, the inter-quartile range of Δamp N1 across subjects without click grouping. Axis annotations: flash-to-click delay (ms); suppression; facilitation.

Scientific Reports | (2024) 14:7177 | https://doi.org/10.1038/s41598-024-57075-1