Introduction

Multisensory signals seamlessly enrich our knowledge of the world1. For example, while driving a car or attending to a speaker against a noisy background, we cannot rely on only one sensory modality, be it visual or auditory; rather, a harmonious interaction of the visual and somatosensory or the visual and auditory systems is required. In recent years, the multisensory nature of perception has been the focus of much behavioral and neuroscientific research2. Merging information from different senses confers distinct behavioral advantages: for example, identification of audio-visual (AV) objects is more rapid3 than identification from unimodal stimuli4,5, especially when the signals are ambiguous6,7. To realize these advantages, the brain continually coordinates sensory inputs across the audiovisual8,9, visual-tactile10,11 and audio-somatic12 domains and combines them into coherent perceptual objects. However, the operational principles by which environmental and neural variables modulate the multisensory processing underlying perception are poorly understood13. The goal of this paper is to propose a theoretical framework that explains empirical observations from a paradigm widely acknowledged to be an entry point for studying multisensory processes.

McGurk and MacDonald14 demonstrated that the sound of an articulated /pa, when superimposed on a video of lip movements articulating /ka, results in the illusory experience of /ta for the perceiver, a phenomenon now known as the McGurk illusion (or effect). Similar triads were also reported; for example, auditory /ba paired with visual /ga resulted in the perception of /da. Several researchers have used this paradigm to study the key behavioral and neural variables that can modulate perception15,16,17. Ambiguity in one of the sensory streams arising from noisy stimuli affects the neuronal processing of multisensory stimuli15. For example, adding white noise to auditory stimuli resulted in enhanced functional connectivity between the visual cortex and the posterior superior temporal sulcus (pSTS), a brain area populated with multisensory neurons15. On the other hand, an increase in the ambiguity of visual stimuli (e.g., a blurry video) resulted in enhanced functional connectivity between the auditory cortex and pSTS. The relative timing of the different sensory input signals is another important factor in multisensory information processing16. Temporal proximity is a critical determinant of cross-modal integration by multisensory neurons2,18. For instance, the audiovisual integration of speech breaks down if the asynchrony between the visual lip movements and the auditory speech sounds becomes too long16,19,20. However, a large temporal window exists over which successful integration may occur16,21,22,23. The McGurk illusion, for example, persists even when the visual information leads the auditory input by up to 240 ms or lags it by up to 60 ms16. In fact, Munhall and colleagues reported that the illusory response peaks when the visual stimulus leads the auditory stimulus by 180–200 ms16.

One component not much explored in the literature is an integrative framework or model that can elucidate the dynamic interactions among the environmental and neural variables underlying the multisensory processing that shapes perception. The existing computational modeling literature addresses the spatial and temporal integration of incoming multisensory stimuli either at the behavioral level or at the neural level24,25,26,27,28,29,30. Typically, models have three components, two unisensory modules and one multisensory module, and address the integration mechanisms at the level of single units and neuronal populations (inspired by the presence of multisensory neurons31). Nonetheless, the modeling approaches can be broadly classified into two classes. A Bayesian framework was used by a few researchers to explain how the unisensory streams of audition and vision can integrate to facilitate perception by solving the spatial localization problem24,25. The second approach uses dynamic models of the underlying neural systems and can be further subdivided into two classes. The first is biologically inspired modelling of neural dynamics to understand the role of predictive coding26 as well as the spatial and temporal aspects of multisensory processing27,30 (for a review of neurocomputational approaches to modelling multisensory integration in the brain, see Ursino et al.32). The second uses minimal models, with the fewest parameters and variables, to identify the key variables affecting dynamic changes in behavior, e.g., how acoustic parameters such as gap duration facilitate phonemic categorization33. For non-speech sounds and multisensory processing, a dynamic model was introduced by Dhamala and colleagues23 that explained the phenomenon of drift when slightly asynchronous audio-visual stimuli are presented. The current study incorporates the key environmental variables affecting the McGurk paradigm in a minimal model, to understand their relationships with neurally relevant parameters such as the connectivity between the unisensory and multisensory systems and their potential role in oscillatory brain dynamics.

Electrophysiological signals can be conceptualized as patterned oscillations and decomposed into five physiologically meaningful frequency bands: delta (0.5–3.5 Hz), theta (4–7 Hz), alpha (8–12 Hz), beta (13–30 Hz) and gamma (>30 Hz)34. Beta band power is enhanced in fronto-parietal areas in trials where subjects perceive the McGurk illusion, accompanied by a reduction in theta power17. In general, beta band synchronization has been associated with audio-visual stimulus perception35. Phase synchrony and phase modulation of oscillations across the different frequency bands have been suggested to play a key role in the organization of cortical networks engaged in complex cognitive functions such as speech processing36 and constitute a critical component of auditory-articulatory alignment37. There is mounting evidence that coherence of oscillatory neural signals across cortical areas might be a crucial mechanism in multisensory processing34,38,39,40. Theta rhythms have been particularly associated with auditory perception of syllabic speech41. A recent article emphasizes the role of coherence between lip movements and low-frequency brain oscillations for the intelligibility of speech stimuli42. Nonetheless, how perceptual categorization varies parametrically with the window of temporal integration, and how the oscillatory cortical activity observed in EEG/MEG studies using the McGurk paradigm3,17,43 relates to perception, remain unclear.

Developed in the 1970s, the Kuramoto model44 of coupled phase oscillators is commonly used for constructing theoretical models of neurobiological networks with oscillatory dynamics45,46. In this model, the phase of the oscillation exhibited by each node of a network becomes the key variable of interest, affected by free parameters such as coupling/connectivity terms. Thus, each Kuramoto oscillator can depict the oscillatory state of a sub-network (e.g., sections of cortical columns), captured by its circular phase alone, and the overall synchronization states can capture the collective dynamics of the network. A network of Kuramoto phase oscillators provides a dynamic framework for explaining functional connectivity changes in brain electromagnetic data47. The original Kuramoto model and its extensions have been used to explore mechanisms underlying oscillations in the human cortex45. In this article we propose a dynamical model comprising three Kuramoto oscillators, coupled via electric coupling and time delay. The electric coupling captures physiological constraints of the audio-visual system, and the time delay captures the environmental factor of temporal asynchrony (see also ref. 48). Empirically, such networks can be imaged non-invasively in EEG/MEG studies49. Computationally, the presence of time delays enriches the attractor space of a dynamical system50. Our proposed model predicts the behavior of McGurk perceivers reported under ambiguous stimulus scenarios as well as during variation of AV onset lags. The rest of the paper is organized as follows. In the Methods section, we present the behavioral experimental paradigm using McGurk-like stimuli, with which one can study the relationship between perceptual experience and psychophysical parameters such as audio-visual onset lags; we then propose a theoretical model of multisensory perception using symmetry arguments. In the Results section we perform statistical analysis of the behavior and report the model performance under various parameter set-ups. Finally, in the Discussion section, we discuss the theoretical results in the context of the experimental paradigm and argue that this modelling framework captures the key features of complex multisensory integration processes and can potentially help explain other experimental paradigms as well.

Methods

Experimental paradigm

Subjects and stimuli

Fifty-two healthy right-handed adult participants (25 female, 27 male) in the age range 20−35 years (mean age = 24.5 years, SD = 3.12) took part in a behavioral study lasting about 45 minutes. The study design was approved by the Institutional Human Ethics Committee (IHEC), National Brain Research Centre (NBRC), and the study was carried out in accordance with the guidelines set by IHEC, NBRC and in strict adherence to the Declaration of Helsinki. All participants provided written informed consent in a format approved by IHEC, NBRC, and reported normal vision and hearing and no history of neurological disorders. Seven video stimuli, each of 2 seconds (s) duration, were prepared: six were incongruent audio-visual objects in which an audio recording of a human speaker vocalizing /pa was dubbed onto the lip movement of the vocalization /ka (/pa-/ka), and one was a congruent audio-visual object /ta (/ta-/ta). The gap between the onsets of the auditory and visual streams was varied from −300 to 450 ms in steps of 150 ms across the six incongruent videos. A negative sign implies that the auditory stimulus onset preceded the lip movement onset, and a positive sign implies that the lip movement started before the sound. An asymmetric range of AV lags was chosen for the incongruent trials because a previous study by Munhall and colleagues16 reported that the dominance of illusory perception was skewed towards positive lags, where the start of lip movement precedes sound onset. The congruent /ta-/ta video had synchronous onset of the AV stimuli. The male speaker’s lips were in a neutral, closed position when not engaged in uttering the syllables /ka or /ta, and the articulation always started from this neutral position. Videos were created and edited using VideoPad video-editing software (NCH Software, CO) at a frame rate of 25 frames/second and a resolution of 1280×720 pixels. The auditory /pa and /ta syllables were of 0.549 s and 0.531 s duration, respectively, and were edited in Audacity software (Sourceforge.net) to minimize background noise. The audio had a sampling rate of 44 kHz and a bit rate of 128 kbps. The study was done inside a 3T MRI scanner as part of a brain imaging investigation, the results of which will be presented elsewhere.

Task

The task design is illustrated in Fig. 1. Inside the MR scanner, the stimuli were presented in a block design with 20 s activation blocks, each consisting of 10 videos of one AV lag. In total there were 28 activation blocks in the whole experiment, four for each stimulus category (AV lag). The activation blocks alternated with 28 resting blocks, each of 20 s duration. The order of presentation of the activation blocks was randomized, and the same kind of block never appeared consecutively. Within a block the trial videos comprised only one lag; this is a limitation of the fMRI block design. AV stimuli were presented through an INVIVO MR-compatible CRT screen attached to the head coil and MRI-compatible headphones (Philips, The Netherlands). Presentation software (Neurobehavioral Systems, CA) was used to display the stimuli. Participants were presented with the stimuli (Fig. 1) and asked to indicate their perception via three buttons designated for /ta, /pa and “any other” perceptual categories. They were instructed to attend to the audio-visual stimuli and watch the speaker at all times. A fibre-optic button pad by Curdes (Current Designs, PA, USA) was used to record the responses of the participants.

Figure 1

(a) Stimulus videos were created using the visual lip movement of /ka superimposed on auditory /pa. Participants reported the auditory object they heard while watching the video using a button-press response box. (b) Stimulus videos were created at different audio-visual lags τ, the timing difference between the onset of sound and lip movement, with values ranging over [−300, 450] ms.

Theoretical framework

Multisensory systems research has been built on analogies drawn from the constituent unisensory processing modules. For example, multisensory processing of audio-visual inputs has been studied in comparison to standalone auditory and visual processing3,31,51. However, increasing evidence suggests that a network of unisensory and multisensory systems is involved in multisensory task processing15,17,23. Thus, it is reasonable to assume that the most elemental model of multisensory perception will involve at least two unisensory systems and one multisensory system11,24,26,27, even though a recent hypothesis suggests that two unisensory streams can in principle give rise to multisensory effects13. Furthermore, changes in oscillatory brain rhythms, such as pre-stimulus enhancement of beta power and post-stimulus reduction of theta band power, have been identified as hallmarks of illusory perception in the pre-stimulus and post-stimulus regimes17,52, respectively.

As discussed in the Introduction, Kuramoto oscillators44,53 are increasingly becoming a handy tool for modelling oscillatory brain dynamics. The Kuramoto model was originally developed as a system of phase oscillators, each rotating at a different intrinsic frequency and coupled through the sine of their phase differences. Since then, the model and its modifications have been used to study a number of biological and other phenomena, such as chorusing crickets, flashing fireflies and pacemaker cells in the heart. In the Kuramoto model, the phases of the individual oscillators evolve over time according to

θ̇n = ωn + Σm κnm sin(θm − θn)      (1)

where θn is the phase and ωn is the intrinsic frequency of the nth oscillator, and κnm is the strength of the coupling between the nth and mth oscillators. Here the over-dot over θn represents the time derivative.
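As a concrete illustration, the following MATLAB sketch (our own example with arbitrary parameter values, not code from the original study) integrates a small network of this kind:

    % Minimal Kuramoto network, Eq. 1: N oscillators, all-to-all coupling
    N     = 3;
    omega = [3; 4; 5];                              % intrinsic frequencies
    kappa = 5 * ones(N);                            % coupling matrix kappa_nm
    rhs   = @(t, th) omega + sum(kappa .* sin(th.' - th), 2);
    [t, th] = ode45(rhs, [0 20], 2*pi*rand(N, 1));  % random initial phases
    plot(t, mod(th, 2*pi));                         % wrapped phase trajectories

For strong enough coupling the phase differences lock, which is the behavior exploited throughout the rest of the paper.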

We consider a system of three coupled Kuramoto phase oscillators configured in the manner shown in Fig. 2. The phase of oscillator A, representing the auditory system, is θ1; the phase of oscillator V, representing the visual system, is θ2; and the phase of oscillator AV, representing the multisensory system, is θ3. Each oscillator has a distinct natural frequency of oscillation, denoted by ω1, ω2 and ω3, respectively. Via the coupling parameters κ1 and κ2, our model thus captures the functional connectivity between the unisensory systems (the auditory and visual cortices) and the multisensory system (e.g., the posterior superior temporal sulcus, pSTS) that has been observed in functional imaging studies15.

Figure 2

The dynamical model for multisensory speech perception, consisting of three Kuramoto oscillators, for positive AV lags, when the visual stimulus leads the auditory stimulus.

Oscillators A, V and AV represent the audio, visual and multisensory systems respectively. Their corresponding intrinsic frequencies are ω1, ω2 and ω3 and the respective phases are θ1, θ2 and θ3. The coupling between A and AV is instantaneous and its strength is κ1 whereas the coupling between V and AV is time delayed by the parameter τ and its strength is κ2. There is no direct coupling between oscillators A and V.

Furthermore, our experimental results, as well as those of Munhall and colleagues16, indicate that a crucial parameter in the creation of perceptual states is the AV lag, which we capture through the time delay τ. Under normal hearing and viewing circumstances one can expect the coupling between the individual sensory systems and the multisensory system to be balanced. However, perturbation of one of the sensory streams, such as an unreliable visual or auditory signal, can lead to a situation of unbalanced coupling, as shown by Nath and Beauchamp15. There is no direct coupling between the A and V oscillators in our model, although there is evidence of connectivity between A1 and V1 in the neural system54. This is primarily because we are interested in the key symmetries of the simplest multisensory dynamical model geared towards understanding the oscillatory states of the brain. Nonetheless, A and V do interact functionally in our model, since they are both coupled to AV, the functional unit of the multisensory system. To model the experimental situation of AV lags (positive or negative) between the audio and visual stimuli, the coupling between oscillators V and AV is time delayed by the parameter τ. When the visual stimulus precedes the auditory stimulus (positive lag), the dynamics of the phase oscillators are expressed as

θ̇1(t) = ω1 + κ1 sin(θ3(t) − θ1(t))
θ̇2(t) = ω2 + κ2 sin(θ3(t − τ) − θ2(t))      (2)
θ̇3(t) = ω3 + κ1 sin(θ1(t) − θ3(t)) + κ2 sin(θ2(t − τ) − θ3(t))

where κ1 is the strength of the interaction between the auditory (A) and AV oscillators and κ2 is the coupling strength between the visual (V) and AV oscillators. When the auditory stimulus precedes the visual stimulus, one could consider the possibility of τ being negative. However, it does not make physical sense to let the present dynamics depend on the future. Alternatively, one can relate the situation of negative τ to one where oscillator A has a time-delayed coupling with AV while the visual oscillator V is instantaneously coupled. Hence, when the auditory stimulus precedes the visual stimulus, the phase oscillators can be represented as

θ̇1(t) = ω1 + κ1 sin(θ3(t − τ) − θ1(t))
θ̇2(t) = ω2 + κ2 sin(θ3(t) − θ2(t))      (3)
θ̇3(t) = ω3 + κ1 sin(θ1(t − τ) − θ3(t)) + κ2 sin(θ2(t) − θ3(t))
Equations 2 and 3 are similar in form but are not mirror images of each other: the differences among the intrinsic frequencies of the oscillators give rise to an asymmetry. When the oscillators are allowed to interact, the frequencies of all three units synchronize with various phase relationships (derived analytically in the Results section for a simple case).

States of synchronized oscillations across a chain of phase oscillators can be identified mathematically by defining a complex order parameter, Z, which gives a quantitative measure of the synchronization among the oscillators and can be expressed as

Z(t) = R(t) exp(iΦ(t)) = (1/3) Σj exp(iθj(t)),  j = 1, 2, 3      (4)

Here the amplitude R(t) is a measure of both the synchronization of the frequencies of the three oscillators and the phase coherence of the oscillators, while Φ(t) measures the average phase of the system. The system can become frequency synchronized as the coupling strengths κ1 and κ2 are increased, yet the oscillators can still be separated in their mutual phases while oscillating at the same frequency. For a phase-synchronized state, a value of R = 1 represents maximum synchronization (zero phase difference) among the oscillators, and R = 0 means maximally separated phases of the individual oscillators.

Earlier studies have proposed that perceptual experience can be qualitatively conceptualized in terms of attractor states of a dynamical system23,33. Stable synchronization patterns at certain relative phase relationships are attractors of a dynamical system, and instabilities signify switches in the perceptual state. In this framework, the state of the order parameter, or collective variable, R can represent the overall perceptual categorization and can be studied as a function of the interactions among the constituent psychophysical variables: the coupling strengths, the spread of the intrinsic frequencies of the oscillators, and the time delay parameter τ. The time delay τ can be interpreted as the temporal window over which integration of AV information takes place.

Results

Behavior

The button-press responses collected from each participant during the experiment were analyzed offline using customized MATLAB codes. A maximum of 40 responses were expected for each AV lag condition. Tasks with fewer than 35 responses were to be rejected, since estimates of perceptual categorization may be biased when computing percentage responses; no subject had to be rejected based on this criterion. The behavioral responses of each participant's attempted trials were converted into percentage measures for each perceptual category, /pa, /ta or “other”, for AV lags over the range [−300, 450] ms. Subsequently, the 34 participants who reported perceiving /ta for at least 60% of the total responses at any AV lag were defined as “McGurk perceivers”, referred to simply as perceivers.
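This bookkeeping step is straightforward; a hypothetical sketch (not the original analysis code; `counts` is an assumed subjects × lags × categories array of button-press counts):

    % Percentage responses per lag and perceiver classification
    pct  = 100 * counts ./ sum(counts, 3);   % subjects x lags x {/ta, /pa, other}
    pcTa = pct(:, :, 1);                     % percent /ta, subjects x lags
    isPerceiver = any(pcTa >= 60, 2);        % >= 60% /ta at any AV lag
    nPerceivers = nnz(isPerceiver)           % 34 in the present data set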

We observed that perceivers report /ta maximally at an AV lag of 0 ms, when the lip movement of the speaker is synchronous with the onset of the auditory stimulus (Fig. 3). Conversely, /pa was reported least often at an AV lag of 0 ms. We performed an analysis of variance (ANOVA) on the percentage responses (without repetitions) in each category, /pa, /ta and “other”, with lags and subjects treated as factors. With the threshold of statistical significance set at p = 0.001, there was a significant change in /ta responses across lags (F(5, 165) = 24.46, p < 0.001), but not across subjects (F(33, 165) = 1.72, p = 0.0141). /pa responses varied significantly across lags (F(5, 165) = 12.57, p < 0.001) and across subjects (F(33, 165) = 5.43, p < 0.001). The “other” responses also varied significantly across lags (F(5, 165) = 7.11, p < 0.001) and subjects (F(33, 165) = 4.25, p < 0.001).
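In MATLAB, a two-way ANOVA without repetitions of this kind can be sketched as follows (our illustration, using the hypothetical `pcTa` matrix from above, 34 subjects × 6 lags):

    % Two-way ANOVA, one observation per cell: columns = lags, rows = subjects
    [p, tbl] = anova2(pcTa, 1);    % reps = 1, so no interaction term
    % p(1): lag (column) effect; p(2): subject (row) effect

The same call applies to the /pa and “other” percentage matrices.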

Figure 3

Normalized behavioral responses from 34 subjects.

Mean response for each perceptual category is presented as a function of AV lag τ. The error bars reflect 95% significance thresholds.

Dynamical model of multisensory processing

We investigated the system of equations 2 and 3 numerically and analytically to gain insight into how time delays may influence the stability of perceptual states. We studied the model under two scenarios, one with balanced coupling and the other with unbalanced coupling. For the sake of simplicity, the analytical derivations are carried out for the balanced coupling condition, whereas we investigate the unbalanced coupling scenarios numerically. All numerical simulations were performed using customized MATLAB (www.mathworks.com) codes and the delay-differential equation solver dde23. We were interested in the collective behavior of the system of oscillators (Fig. 2, equation 2), in particular the dynamics of the order parameter, which quantifies the synchronized states as a function of the time delay and coupling parameters.
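For a single (κ, τ) pair, such a simulation can be sketched as follows (our illustration of equation 2 as written above, not the original code):

    % Three delay-coupled phase oscillators (Eq. 2) solved with dde23
    w   = [3; 4; 5];  k1 = 5;  k2 = 5;  tau = 0.95;
    rhs = @(t, th, Z) [ w(1) + k1*sin(th(3) - th(1));       % A: instantaneous
                        w(2) + k2*sin(Z(3)  - th(2));       % V: delayed input from AV
                        w(3) + k1*sin(th(1) - th(3)) ...
                             + k2*sin(Z(2)  - th(3)) ];     % AV: A and delayed V
    sol = dde23(rhs, tau, 2*pi*rand(3, 1), [0 50]);         % Z = th(t - tau)
    R   = abs(mean(exp(1i * sol.y), 1));                    % order parameter, Eq. 4
    Rss = mean(R(sol.x > 40))                               % steady-state estimate

Here the constant history 2π·rand(3, 1) randomizes the initial phases, which matters below because the system turns out to be multistable.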

Balanced coupling

We solved the system of equations 2 and 3 for a range of κ = κ1 = κ2 values from 3 to 20 and for different values of the time delay τ. In Fig. 4 we plot the time series of the phases θ1, θ2 and θ3 at two delay values for coupling strength κ = 5. At delay τ = 0.07, all three oscillators are almost synchronized in-phase and hence the order parameter R = 0.9995. However, at delay τ = 0.95, the auditory A and multisensory AV oscillators oscillate almost in-phase whereas the visual oscillator V is out of phase, and the order parameter R = 0.3326. Hence, the transition from almost complete synchronization to a lower degree of synchronization occurs at a critical value of τ. To explore the parameter space further, we computed the order parameter for different time delays (τ) and coupling strengths (κ) (Fig. 5). The right halves of the curves correspond to the “positive” audio-visual lags (where the visual stimulus precedes the auditory stimulus) and are obtained from numerical integration of equation 2. The left halves of the curves correspond to “negative” audio-visual lags (where the auditory stimulus precedes the visual stimulus) and are obtained numerically from equation 3. In Fig. 5a, the intrinsic frequencies were set at ω1 = 3, ω2 = 4, ω3 = 5. This is roughly in the theta frequency range (3–5 Hz), which was earlier shown to be relevant for the existence of perceptual states17. The initial conditions of the simulations were randomized to span all possible relative phase states. For weak coupling, κ < 1, synchronization states do not exist consistently and R oscillates with time. For κ ≥ 1, a clear synchronization state appears at least for a range of τ values. Two stable states of the order parameter emerge, one synchronization state around R ≈ 1 and the other around R ≈ 0.33. Furthermore, we observed that the critical value of τ increases with increasing κ; in other words, the island of synchronization expands with increasing κ. For example, for κ = 10 the transition happens at a value of τ so high that it is not behaviourally relevant (beyond the range shown in Fig. 5). On the other hand, we also observed that higher intrinsic frequencies of the oscillators shrink the island of synchronization. For example, if the intrinsic frequencies are set to ω1 = 15, ω2 = 16, ω3 = 17 (beta band frequencies), the transition from the synchronization state occurs at lower values of τ and the extent of the synchronization regime decreases (Fig. 5b). An important point to note here is that the critical transition value of τ may not be the same for “positive” and “negative” audio-visual lags. This stems from the disparity in the intrinsic frequencies of the oscillators.
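Curves like those in Fig. 5 can be sketched by sweeping τ and recording the steady-state order parameter; a hypothetical helper function (ours, with arbitrary grid values) could look like:

    % sweepR.m: steady-state order parameter versus delay, balanced coupling
    function R = sweepR(w, k, taus)
        R = zeros(size(taus));
        for j = 1:numel(taus)
            tau = taus(j);
            rhs = @(t, th, Z) [ w(1) + k*sin(th(3) - th(1));
                                w(2) + k*sin(Z(3)  - th(2));
                                w(3) + k*sin(th(1) - th(3)) + k*sin(Z(2) - th(3)) ];
            sol  = dde23(rhs, tau, 2*pi*rand(3, 1), [0 100]);
            r    = abs(mean(exp(1i * sol.y), 1));
            R(j) = mean(r(sol.x > 80));          % discard the transient
        end
    end
    % e.g., plot(0.02:0.02:1.2, sweepR([3 4 5], 5, 0.02:0.02:1.2))

Fixing the random seed, or reusing the final state of one run as the history of the next, selects a single branch of the multistable solution set.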

Figure 4

Time series of the phases θ1(t), θ2(t) and θ3(t) of the three Kuramoto oscillators with intrinsic frequencies ω1 = 3, ω2 = 4 and ω3 = 5, respectively. (a) At delay τ = 0.07 and coupling strength κ1 = κ2 = 5: all three oscillators are almost synchronized in-phase, and hence the order parameter R = 0.9995. (b) At delay τ = 0.95 and coupling strength κ1 = κ2 = 5: the auditory A and multisensory AV oscillators oscillate almost in-phase whereas the visual oscillator V is out of phase with the other two, and hence the order parameter R = 0.3326.

Figure 5

Variation of the order parameter (R) of the system of oscillators with delay (τ) for the case of balanced coupling κ1 = κ2 = κ when the intrinsic frequencies are (a) ω1 = 3, ω2 = 4 and ω3 = 5 and (b) ω1 = 15, ω2 = 16 and ω3 = 17. The curves were obtained for various κ values with the same set of initial phases.

Multistability and analytical solutions

Multistable states of a dynamical system can be relevant for describing perception and action, in particular inter-trial and inter-subject variability, so we investigated the existence of such states in our dynamical model. The presence of a time-delayed coupling provides for rich collective dynamics, including the existence of multiple collective frequencies and hence the possible coexistence of some of these states. We examined this possibility by theoretical and numerical investigation of the model equations over a wide range of parameter values and randomized initial conditions (Figs 5 and 6). The system is multistable, meaning that the order parameter can take different values at the same τ depending on the initial conditions (Fig. 6). The multistability seen in this model system may be related to the multistability seen in speech perception, in terms of the responses registered as /pa and “others”, which occur 50% and approximately 10% of the time, respectively (Fig. 3).

Figure 6

Order parameter (R) as a function of time delay (τ) for ω1 = 3, ω2 = 4, ω3 = 5 and coupling strengths κ1 = κ2 = κ = 5 computed using the analytical expression Eq. 15 (blue circles).

The red curve is obtained by numerically integrating the model equations (Eqs 2 and 3) to get the individual phases and then using the order parameter expression (Eq. 15) for one set of initial conditions, IC1. The green and black dashed curves are obtained numerically for different sets of initial phases, IC2 and IC3, respectively. The system is multistable, i.e., the order parameter can take different values for the same τ on changing the initial conditions. The analytical results are in excellent agreement with the numerical values of the order parameter.

The synchronized state (where all the oscillators oscillate with the same frequency Ω) is given by θi(t) = Ωt + ψi. Let the phase shift between the auditory and AV oscillators be θ3 − θ1 = ψ3 − ψ1 = ϕ1 and the phase shift between the visual and AV oscillators be θ3 − θ2 = ψ3 − ψ2 = ϕ2. This gives θ2(t − τ) − θ3(t) = −ϕ2 − Ωτ and θ3(t − τ) − θ2(t) = ϕ2 − Ωτ. Using these expressions, the system of equations 2 (with balanced coupling κ1 = κ2 = κ) can then be written as

Ω = ω1 + κ sin(ϕ1)
Ω = ω2 + κ sin(ϕ2 − Ωτ)      (5)
Ω = ω3 − κ sin(ϕ1) − κ sin(ϕ2 + Ωτ)

Summing across the sub-equations of Eq. 5, we obtain

Ω = ω̄ − (2κ/3) cos(ϕ2) sin(Ωτ)      (6)

where ω̄ = (ω1 + ω2 + ω3)/3. Substituting Ω from Eq. 6 in Eq. 5, we obtain ϕ1 = sin⁻¹((Ω − ω1)/κ) and ϕ2 = Ωτ + sin⁻¹((Ω − ω2)/κ). Eq. 6 gives the collective frequency Ω as a function of the delay τ when we have “positive” audio-visual lags. Since it is a transcendental equation, we get multiple values of the synchronization frequency Ω as solutions of Eq. 6 for a given value of the delay. To determine which of these corresponds to a stable synchronization state, one has to perform a stability analysis.

To obtain the expression for the collective frequency for the “negative” audio-visual lags, one carries out the same analysis using Eq. 3. For “negative” audio-visual lags, we have

Ω = ω̄ − (2κ/3) cos(ϕ1) sin(Ωτ)      (7)

where ω̄ is the same as above but ϕ1 = Ωτ + sin⁻¹((Ω − ω1)/κ) and ϕ2 = sin⁻¹((Ω − ω2)/κ).

Derivation of stability condition

We perform a linear stability analysis to determine the local stability of the synchronization state by adding a small perturbation

θi(t) = Ωt + ψi + εξi(t),  i = 1, 2, 3      (8)

where 0 < ε ≪ 1. Taking ψ3 − ψ1 = ϕ1 and ψ3 − ψ2 = ϕ2 gives

θ3(t) − θ1(t) = ϕ1 + ε(ξ3(t) − ξ1(t))
θ3(t − τ) − θ2(t) = ϕ2 − Ωτ + ε(ξ3(t − τ) − ξ2(t))      (9)
θ2(t − τ) − θ3(t) = −ϕ2 − Ωτ + ε(ξ2(t − τ) − ξ3(t))

Substituting these in equations 2 we obtain

Ω + εξ̇1(t) = ω1 + κ sin(ϕ1 + ε(ξ3(t) − ξ1(t)))
Ω + εξ̇2(t) = ω2 + κ sin(ϕ2 − Ωτ + ε(ξ3(t − τ) − ξ2(t)))      (10)
Ω + εξ̇3(t) = ω3 − κ sin(ϕ1 − ε(ξ1(t) − ξ3(t))) − κ sin(ϕ2 + Ωτ − ε(ξ2(t − τ) − ξ3(t)))

Linearizing the set of equations 10 and using 5, we obtain

ξ̇1(t) = κ cos(ϕ1)[ξ3(t) − ξ1(t)]
ξ̇2(t) = κ cos(ϕ2 − Ωτ)[ξ3(t − τ) − ξ2(t)]      (11)
ξ̇3(t) = κ cos(ϕ1)[ξ1(t) − ξ3(t)] + κ cos(ϕ2 + Ωτ)[ξ2(t − τ) − ξ3(t)]
Taking ξi(t) = νi exp(λt) and substituting in the system of linear equations 11, where λ denotes an eigenvalue and νi the components of the corresponding eigenvector, we obtain the characteristic equation

(λ + A)(λ + B)(λ + A + C) − BC exp(−2λτ)(λ + A) − A²(λ + B) = 0      (12)

where A = κ cos(ϕ1), B = κ cos(ϕ2 − Ωτ) and C = κ cos(ϕ2 + Ωτ). (The root λ = 0, corresponding to a uniform phase shift of all three oscillators, is always present and is excluded.) The stability of the synchronization frequency Ω for given values of κ and τ can be checked using this equation by examining the signs of the eigenvalues λ: the frequency-synchronized state with frequency Ω is stable if none of the eigenvalues has a positive real part.

Analytical expression for order parameter

The order parameter of our system is defined as

R = |(1/3) Σj exp(iθj)|,  j = 1, 2, 3      (13)

After a little algebra, we can express R in the form

R = (1/3) [3 + 2 cos(θ1 − θ2) + 2 cos(θ2 − θ3) + 2 cos(θ3 − θ1)]^(1/2)      (14)

For a frequency synchronized state, θi(t) = Ωt + ψi, and we have taken θ3 − θ1 = ψ3 − ψ1 = ϕ1 and θ3 − θ2 = ψ3 − ψ2 = ϕ2. Therefore the analytical expression for the variation of the order parameter R is given by

R = (1/3) [3 + 2 cos(ϕ1 − ϕ2) + 2 cos(ϕ1) + 2 cos(ϕ2)]^(1/2)      (15)

For an in-phase synchronized state, ψ1 = ψ2 = ψ3; therefore ϕ1 = ψ3 − ψ1 = 0 and ϕ2 = ψ3 − ψ2 = 0, and hence R = 1. We already have the analytical expressions for ϕ1 and ϕ2: ϕ1 = sin⁻¹((Ω − ω1)/κ) and ϕ2 = Ωτ + sin⁻¹((Ω − ω2)/κ). For a given set of κ, τ and intrinsic frequencies, the synchronization frequency Ω is obtained from the transcendental equation Eq. 6 by a numerical root-finding method, and the stability of the corresponding state is then checked using Eq. 12. The Ω values corresponding to the stable states are substituted in the expressions for ϕ1 and ϕ2, and the values of the order parameter R are then obtained from Eq. 15. For “negative” audio-visual lags, R follows the same Eq. 15, but here Ω is given by Eq. 7, ϕ1 = Ωτ + sin⁻¹((Ω − ω1)/κ) and ϕ2 = sin⁻¹((Ω − ω2)/κ).
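Under this reconstruction, the analytical pipeline fits in a few lines of MATLAB (our hypothetical sketch, tracking a single solution branch near ω̄; the full multistable picture requires scanning several starting points of fzero):

    % Collective frequency from the transcendental Eq. 6, then R from Eq. 15
    w    = [3 4 5];  k = 5;  tau = 0.5;  wbar = mean(w);
    phi2 = @(Om) Om*tau + asin((Om - w(2))/k);                       % from Eq. 5
    F    = @(Om) Om - wbar + (2*k/3) * cos(phi2(Om)) * sin(Om*tau);  % Eq. 6
    Om   = fzero(F, wbar);                                           % one root near wbar
    p1   = asin((Om - w(1))/k);  p2 = phi2(Om);
    R    = sqrt(3 + 2*cos(p1 - p2) + 2*cos(p1) + 2*cos(p2)) / 3      % Eq. 15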

In Fig. 6, we plot the analytical as well as the numerical values of the order parameter (R) at corresponding values of the time delay (τ) for the system of oscillators with intrinsic frequencies ω1 = 3, ω2 = 4, ω3 = 5 and coupling strength κ1 = κ2 = κ = 5, for different initial conditions. The order parameter R can take more than one value at a given value of the delay, giving rise to multistability and hysteresis. Figure 6 shows that the analytical results are in excellent agreement with the numerical values of the order parameter. The analytical expression for R in Eq. 15 holds when all the oscillators are frequency synchronized and oscillating at the synchronization frequency Ω.

Unbalanced coupling

We investigated the unbalanced coupling scenarios along two routes. First, we kept κ1 fixed at the value 3 and varied κ2 within [0.5, 5] (Fig. 7a). The case κ2 < κ1 is equivalent to the situation where ambiguity is introduced in the visual stream, weakening the corresponding coupling, as in the Nath and Beauchamp study15; κ2 > κ1 is the situation where the visual coupling is increased, the auditory input being unreliable. In both cases the existence of a synchronized and a non-synchronized state is possible as long as κ1 ≥ 1. When κ1 is fixed, the island of synchronization increases with increasing values of κ2, indicating that time-delayed interaction with the visual-multimodal system is crucial for perception. For negative AV lags, i.e., when the auditory stimulus precedes the visual one, no drastic change in the critical delay value occurs when κ1 is unchanged.

Figure 7

Variation of the order parameter (R) with delay (τ) of the system of oscillators with intrinsic frequencies ω1 = 3, ω2 = 4 and ω3 = 5 when (a) the coupling strength κ1 between A and AV is kept constant at 3 and the coupling κ2 between V and AV is varied from 0.5 to 5; (b) the coupling strength κ2 between V and AV is kept constant at 3 and the coupling κ1 between A and AV is varied from 0.5 to 5.

Second, we varied κ1 while keeping κ2 constant. Here the critical delay value for the transition from phase-synchronous to non-synchronous states shifts such that the regime of synchronization increases for negative AV lags, corresponding to the scenario where the auditory stimulus precedes the visual stimulus (Fig. 7b). Multistable states are possible, and the coexistence of two states, one synchronous and one non-synchronous, depends critically on κ1 (Fig. 7b).

Overall, we find that time delays and coupling strengths facilitate the existence of synchronous states, whereas higher intrinsic oscillation frequencies reduce the temporal window over which AV signals can integrate. Thus, the synchronization window can extend over the entire range of AV lags. This is in qualitative agreement with the trend seen for the McGurk stimuli (Fig. 3). The existence of multistability is demonstrated explicitly in Fig. 6 but is also observed in the coupling scenarios discussed in Figs 5 and 7. Small perturbations from numerical instabilities were sufficient to generate transitions between the stable states of the order parameter.

Discussion

The main purpose of this work is to conceptualize, with a simple dynamical model, the observations from an experimental paradigm that has over the years been a bedrock for studying multisensory information processing, and to illustrate the role of environmental variables and of the connectivity topologies between neural subsystems in the shaping of perceptual states. The Kuramoto oscillator framework provides the added advantage of linking the perceptual states to simultaneously observed neuronal oscillations in EEG/MEG data. The perceptual dynamics observed for the /ta response by Munhall and colleagues16 and in our behavioral recordings (Fig. 3) are in complete agreement with the order parameter dynamics of our model (Figs 5, 6 and 7). We see that illusory perception is reported maximally within a range of lags [−150, 300] ms; beyond this range the auditory /pa response dominates, for both positive and negative AV lags. We have considered a system of three phase oscillators representing the auditory, visual and multisensory systems. The superior colliculus (SC) and the posterior superior temporal sulcus (pSTS) are two structures, at the sub-cortical and cortical levels respectively, that can be represented by the multisensory system. The SC is known to have bimodal neurons that receive inputs from both the auditory and visual systems55. Similarly, the pSTS has been established by several researchers as the locus of cortical processing for multisensory integration15,56,57,58,59. Thus a network comprising auditory and visual areas and the pSTS (with contributions from the SC) can form the most elemental network for multisensory processing. Experimental validation of the presence of such networks has been provided by earlier studies, such as Nath and Beauchamp15 using an fMRI paradigm. An important ingredient of our model is the coupling between the AV–A and AV–V oscillators. Careful design of such coupling parameters, motivated by experimental studies, can capture how the biophysically realistic symmetry that exists in the underlying neural system can influence behavior48. Broadly, the multisensory functional unit can qualitatively represent the composite contributions of all structures, such as the superior colliculus, pSTS and other areas. Detailed modelling of each region, along with the couplings observed in experimental fMRI data, is possible, but the results from such descriptions are hard to generalize and fail to capture the overarching symmetry. For similar reasons we chose to ignore in our model the possible coupling between A and V resulting from the direct anatomical connections between the A1 and V1 areas54,60,61. In our model the signals from A and V are conveyed to the multisensory oscillator (AV), where the information from the two modalities is integrated. The feedback connections from the pSTS to the unimodal areas are considered directly in the form of a bidirectional coupling between the pSTS and the auditory and visual areas. Nevertheless, the weighted contributions of direct A1–V1 coupling, and their interplay with the coupling between multisensory and unisensory areas, can be studied within the dynamical framework. In future, a corresponding behavioral paradigm needs to be developed to test the predictions of such a model.

The coupling parameters in our model help in two ways: first, they explain the existence and destabilization of perceptual states upon variation of the time delays; second, they provide a link between the observed perceptual dynamics and neuronal oscillations. It is well known that environmental demands modulate the effective connectivity dynamics among the individual sensory systems and the multisensory system15, and the information from the more reliable modality is given a stronger weight62. For the McGurk paradigm, Nath and colleagues15 found that the STS was connected more strongly to a sensory cortex when the corresponding sensory modality was reliable (less noisy). In our model, κ1 > κ2 may correspond to a scenario of increased functional connectivity between the STS and the auditory cortex when the auditory modality is more reliable, and κ2 > κ1 can be realized as increased functional connectivity between the STS and the visual cortex when the visual modality is more reliable. Since there are both top-down and bottom-up connections throughout the cortical processing hierarchy63,64, we have incorporated bidirectional connections into our dynamical model. Importantly, synchronized states exist for the unbalanced coupling scenarios as well, indicating that illusory experience is possible in the presence of noisy stimuli. Electrophysiological studies indicate that pre-stimulus theta band activity primes the network for multisensory information processing towards illusory perceptual categorization17. Our results indicate how intrinsic oscillator frequencies set in the theta regime can support the perceptual dynamics observed when varying the AV lag of the multisensory stimulus (compare Fig. 5 with Fig. 3). We predict that tuning of the oscillators to a higher frequency band may shorten the window of temporal integration; in other words, a narrower window of temporal integration will require phase synchrony at higher frequencies such as beta and gamma. Interestingly, our results suggest that if temporal integration is facilitated by beta band synchrony, the coupling strength has to be relatively high to produce cross-modal percepts over a wide range of AV lags (Fig. 5b). This is in line with the proposition of Luo and Poeppel41, who proposed a low- and high-frequency segregation of auditory processing. From our study we can predict that low-frequency processing (e.g., syllabic speech) may be carried out by relatively weakly coupled unisensory and multisensory systems, whereas high-frequency processing, e.g., of diphonic speech/tones, requires stronger connectivity among the sensory systems. In summary, the minimal model of multisensory integration is geared towards linking the symmetries in neuronal connectivity and dynamics with environmental demands and behavior.

The existing models of multisensory integration are mostly based on Bayesian frameworks and neural networks, and they consider different kinds of interactions among multisensory and unisensory areas. One category of models considers that the unisensory stimuli are processed separately in the primary cortices, without significant cross-modal interaction, and that multisensory integration takes place in higher associative structures such as the SC via feedforward convergence from multiple unimodal areas65. A second category of models assumes only direct lateral connections between the two unimodal areas29,30 and excludes the involvement of multisensory regions, arguing that the direct connections among early, modality-specific processing areas (such as the visual and auditory cortices) play the pivotal role in multisensory integration. These models are based on anatomical tracing studies in monkeys (macaques) and human subjects that have shown direct connections between auditory and visual areas, including the primary cortices (V1 and A1)54,60. Similar studies have also reported projections between the somatosensory cortex and auditory and visual areas66. Heteromodal connections between the primary somatosensory cortex and the primary auditory cortex have also been reported in gerbils67. In marmosets, projections from the retroinsular area of the somatosensory cortex to the caudiomedial belt auditory area have been observed63, in line with a similar observation in the Old World monkey68. A third category of models excludes direct connections between the unimodal areas but considers both a feedforward connection to a multisensory area and feedback from the multisensory area to the unisensory ones28. Finally, a fourth category of models incorporates all of the previous connections together, i.e., the feedback and feedforward connections between unimodal and multimodal systems along with the direct connections between the unimodal systems11,69. Most of these studies do not consider the processing of temporal relationships among the unimodal stimulus components, e.g., AV lags16 or gap duration in speech perceptual categorization33. Our dynamical model (which falls under the third category), comprising three Kuramoto oscillators, attempts to do so in a minimalist manner. An earlier attempt in this direction was made by Dhamala and colleagues23, who modelled a rhythmic multisensory paradigm in the audio-visual domain by considering the phase dynamics of two interacting periodic oscillators and investigated the behavioral effects of the relative timing of the different sensory signals. In their rhythmic paradigm, temporally congruent multisensory stimuli were expected to cause a percept of synchrony, while incongruent stimuli could cause a percept of asynchrony or another possible state, the non-phase-locked state (drift, or a neutral percept), representing failure of multisensory integration. The non-phase-locked state was qualitatively different from the percept of asynchrony. Dhamala and colleagues then used fMRI and simultaneous behavioral recordings to confirm the involvement of a distributed brain network in the multisensory processing of periodic auditory-visual stimuli. The study proposed that the solution space of a hypothetical system of two coupled oscillators corresponded to the perceptual solution space of human multisensory integration and to the selective activation of brain regions.
Though the authors propose that their results indicate definite involvement of the superior colliculus in the perception of synchrony, they did not capture the dynamics of the multisensory system as a functional unit directly in their theoretical model, emphasizing instead the periodicity as the source of categorization. Moreover, the rhythmic paradigm is somewhat unrealistic in the context of multisensory speech stimuli; hence a broader dynamical framework for multisensory perception is warranted. A key difference between our model and that of Dhamala et al.23 is that they did not incorporate the time lag, or asynchrony, between the audio and visual stimuli explicitly in their theoretical model. In our model we capture the AV lag by introducing a time delay τ between the multisensory (AV) and unisensory (V/A) oscillators and study the variation in perceptual stability as a function of this parameter. Subsequently, we propose that the order parameter, or collective variable, R of the oscillator system can represent the overall perceptual categorization.

Multistability is a key aspect of perceptual behavior and of biological systems in general70. In the McGurk paradigm, the presence of responses registered as /ka (the visual stimulus) or fused percepts such as /pa-ka, in addition to the illusory /ta or auditory /pa, can be conceptualized within the framework of multistability. We propose that the different responses registered as /pa, /ta and “others” in our behavioral data can be related to the multistability in the final dynamical state of our coupled oscillator system. An advantage of the dynamical systems approach is that the presence of multistability can be explained very elegantly71. Detailed analysis of multistability can in future be extended to psychophysical studies of speech perception33 as well as to other paradigms. As shown in the Results section, the presence of time delay makes our dynamical system highly multistable: for different choices of initial conditions, the system can go to different stable synchronization states. We propose that the multistability in the final state of the dynamical system can be related to the variation observed in the responses of the participants to the same stimuli.

The significant achievement of our model is that it captures the key features of complex multisensory integration processes at the level of behavior and links the structural and functional constraints of the underlying neural systems to ongoing behavioral states. Our results show that the model successfully simulates the temporal constraints on the McGurk effect and the variation in responses to the same stimuli. Any multisensory behavior can be broken into combinations of individual unisensory and multisensory systems; one can then map the constraints posed by the environmental variables onto the parameter space of a dynamical system and study the emerging perceptual states. For example, recent studies question the idea of multisensory awareness stemming from a mere integration of unisensory systems13,72, challenging the traditional way of looking at a multisensory task from the perspective of serial or parallel unisensory processes. Dynamical systems offer an attractive approach for testing the environmental constraints on a hypothesized multisensory pathway using modelling, behavior and brain-mapping tools. Here, integration is not identified as a separate process but rather as an outcome of the dynamical interactions among unisensory and multisensory neuronal populations. In fact, the role of factors like attention can be incorporated in the dynamical model by introducing more mathematical complexity, either through the coupling parameters or by introducing a separate module, and this presents scope for future work. Experimentally, this will require recording and analyzing eye-tracking data to monitor and parametrize attentional factors. Nonetheless, new experimental paradigms will have to be developed to eventually test the predicted outcomes of modelling approaches such as ours.

Additional Information

How to cite this article: Thakur, B. et al. A dynamical framework to relate perceptual variability with multisensory information processing. Sci. Rep. 6, 31280; doi: 10.1038/srep31280 (2016).