The additive nature of the human multisensory evoked pupil response

Van der Stoep, Nathan; Van der Smagt, M. J.; Notaro, C.; Spock, Z.; Naber, M.

doi:10.1038/s41598-020-80286-1

Article
Open access
Published: 12 January 2021

Scientific Reports volume 11, Article number: 707 (2021)

Abstract

Pupillometry has received increased interest for its usefulness in measuring various sensory processes as an alternative to behavioural assessments. This is also apparent for multisensory investigations. Studies of the multisensory pupil response, however, have produced conflicting results. Some studies observed super-additive multisensory pupil responses, indicative of multisensory integration (MSI). Others observed additive multisensory pupil responses even though reaction time (RT) measures were indicative of MSI. Therefore, in the present study, we investigated the nature of the multisensory pupil response by combining methodological approaches of previous studies while using supra-threshold stimuli only. In two experiments we presented auditory and visual stimuli to observers that evoked a(n) (onset) response (be it constriction or dilation) in a simple detection task and a change detection task. In both experiments, the RT data indicated MSI as shown by race model inequality violation. Still, the multisensory pupil response in both experiments could best be explained by linear summation of the unisensory pupil responses. We conclude that the multisensory pupil response for supra-threshold stimuli is additive in nature and cannot be used as a measure of MSI, as only a departure from additivity can unequivocally demonstrate an interaction between the senses.

Introduction

Spatial orienting is an important function of the human brain that allows us to efficiently perceive and act upon the world around us. For example, using our senses, we can quickly determine the location of an approaching car when crossing the street and adjust our actions accordingly. Our senses provide both unique and redundant information about the environment. For example, whereas both vision and hearing provide spatial information about events in the region of space in front of the body, hearing provides information about events outside of the field of view^1,2,3 and vision generally allows for more accurate and precise localization of information than audition^4,5. It is well established that the human brain can integrate information obtained via different senses, resulting in faster, more accurate, and precise orienting behaviour^{5,6,7,8,9,10,11,12}. For example, audiovisual events can be detected more quickly and attract gaze more rapidly than purely visual or auditory events due to multisensory integration (MSI, e.g.⁵).

Various studies have demonstrated that, among others, the superior colliculus (SC, a subcortical structure) is important for integrating sensory input and generating eye-movements to unisensory and multisensory events^13,14,15,16. The SC contains multisensory neurons that respond to input from different sensory modalities and contribute to the multisensory enhancement of, among others, orienting behaviour. The underlying neural computation of these multisensory neural responses has been characterized as linear (additive: equal to the sum of the unisensory responses), or non-linear (i.e. sub-additive: less than the sum, or super-additive: larger than the sum;¹⁷). Additionally, the SC is involved in transient changes in pupil size¹⁸. It has been suggested that the pupil’s response to sensory events plays an important role in orienting responses as it is modulated by saliency, focused spatial attention, and motor coordination^19,20,21.

Given the role of the SC in MSI, spatial attention, and pupil responses, it may come as no surprise that researchers have started investigating the nature of the multisensory pupil response. A central question in many multisensory studies, which is no different in the case of the multisensory pupil response, is whether the observed multisensory behaviour is different from the response to unisensory stimuli (e.g. sound or light alone). In addition, a comparison between the multisensory response and the sum of the unisensory responses often provides insights into the particular computation driving the multisensory behaviour, as deviations from the sum of the unisensory responses can be used as a strict criterion for multisensory integration²².

Whether the multisensory pupil response is linear or non-linear is not trivial, as the interpretation regarding the occurrence of MSI when measuring behaviour typically depends on this outcome. In some cases, knowing whether MSI caused a certain behavioural outcome is not only relevant from a fundamental perspective, it could also be relevant for clinical applications. For example, it has been argued that being able to measure MSI using pupil response measures is especially advantageous in patient populations in which, for example, classic response time measures of MSI cannot be used (e.g.²³). If the multisensory (pupil) response is larger (or smaller) than the linear sum of the unisensory (pupil) responses, then one can generally draw the conclusion that the multisensory response is driven by integrated sensory input and that patients can integrate sensory input. However, when the multisensory pupil response is additive, there is no behavioural evidence that indicates that MSI is the driving factor behind the multisensory pupil response. In most cases, the most parsimonious explanation would then be that the observed multisensory behaviour is the result of the independent processing of sensory input (see²⁴).

So far, however, conflicting results have been reported with regard to the nature of the multisensory pupil response, which casts some doubt on whether and how the multisensory pupil response is driven or modulated by MSI. For example, previous research in monkeys has shown that the multisensory pupil response to audiovisual events is similar to the sum of the unisensory pupil responses (²⁵; also see Fig. 3B showing a mixture of sub- and super-additive multisensory responses and Fig. S6B indicating (sub-)additivity in²⁶). However, a more recent study in humans suggests the multisensory pupil response to be super-additive²³. Thus, evidence regarding the nature of multisensory pupil response points in different directions. Whereas there is support for the notion that the multisensory pupil response is larger than the sum of the unisensory pupil responses²³, other studies have shown that the multisensory pupil response is additive or sub-additive^25,26.

Given these conflicting findings in previous studies, we investigated the nature of the human multisensory pupil response in more detail using (1) two different behavioural paradigms, (2) multiple types of visual stimuli that evoke opposite pupillary responses, and (3) only supra-threshold stimuli that evoke robust behavioural and physiological responses. In previous studies, stimulus events were either characterized by sudden onsets (²⁵; our Experiment 1) or a visual stimulus changing in form (²³; our Experiment 2). We therefore tested for MSI at a behavioural level by measuring both RTs and pupil responses in a simple detection paradigm (Experiment 1) and a visual change-detection paradigm (Experiment 2). We checked for MSI in RTs by testing for race model inequality violations^10,27,28,29 and for super/sub-additivity by comparing the multisensory pupil response to the sum of the unisensory pupil responses. Second, the multisensory stimuli used in previous studies consisted of the combination of visual stimuli that either evoked pupil constriction or dilation (cf.^23,30). Therefore, in Experiment 1, all participants were presented with visual stimuli that evoked either a pupil dilation or constriction, auditory stimuli that evoked a pupil dilation, and the combination of these stimuli. If the multisensory pupil response is super-additive or sub-additive, then one would expect that the multisensory pupil response is larger or smaller than the sum of the unisensory pupil responses, independent of the direction of the pupil response and type of visual event. However, if the multisensory pupil response is additive, then we cannot conclude whether the multisensory pupil response to supra-threshold stimuli is driven by MSI as it can also be explained by independent processing of sensory input. The impact of the type of task was investigated by using a change detection task similar to²³ in Experiment 2.

Results

Experiment 1

Twelve participants were tested in Experiment 1 (see Fig. 1, left panel). Participants took part in a response and a no response block. They were instructed to respond as fast as possible to the onset of a sound or light in the response block and to only passively observe the stimuli in the no response block. In both types of blocks pupil responses were recorded.

Hits

The proportion of hits was very high in all conditions (range across all conditions = 0.92–1). Therefore, we did not analyse the proportion of hits further.

Response time data

Response times were only collected in the response block.

Response times

It was first determined whether participants responded faster to multisensory stimuli than unisensory stimuli. This was indeed the case, both for bright and dark stimuli (Fig. 2A). A Bayesian repeated measures Analysis of Variance (ANOVA) for RTs in the dark target condition indicated very strong evidence for an effect of Target Modality compared to a null model assuming no effect (A, V_Dark, AV_Dark; BF₁₀ = 625,233; Note that BF₁₀ indicates evidence in favour of the alternative hypothesis and BF₀₁ evidence in favour of the null-hypothesis). Post-hoc tests corrected for multiple testing indicated that responses in the AV_Dark condition (M = 249 ms, SD = 40) were faster than in the V_Dark (M = 285 ms, SD = 47, BF₁₀ = 2356) and A condition (M = 319 ms, SD = 57, BF₁₀ = 9908). Responses in the V_Dark condition were also faster than responses in the A condition (BF₁₀ = 22.545).

Another Bayesian repeated measures ANOVA was conducted using the bright target conditions (A, V_Bright and AV_Bright). Again, there was very strong evidence for a main effect of Target Modality (BF₁₀ = 153,882). Responses in the AV_Bright condition (M = 254 ms, SD = 37) were faster than in the V_Bright condition (M = 279 ms, SD = 45, BF₁₀ = 1978) and the A condition (M = 319 ms, SD = 57, BF₁₀ = 1625). RTs in the V_Bright condition were shorter than in the A condition (BF₁₀ = 62.468).

Multisensory response enhancement

To determine the amount of speed-up in the multisensory condition relative to the fastest unisensory condition, the amount of multisensory response enhancement (MRE) was analysed (grey area Fig. 2B, see “Methods” section for more information). In line with the RT analysis, Bayesian one-sample t-tests provided very strong evidence of MRE being larger than zero in the bright (M = 30 ms, SD = 13, BF₁₀ = 3164, δ = 2.08) and dark AV target condition (M = 39 ms, SD = 17, BF₁₀ = 3285, median δ = 2.09). There was only anecdotal evidence for a difference in the amount of MRE between the dark and bright condition (BF₁₀ = 2.469, δ = 0.602, see Fig. 2D).
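As a rough illustration of the MRE measure described above, the sketch below computes the speed-up for a single participant. It assumes that MRE is the median RT of the fastest unisensory condition minus the median RT of the multisensory condition; the exact definition is given in the “Methods” section, which is not reproduced here, so the use of medians, the function name and the RT values are all illustrative assumptions.

```python
import numpy as np

def multisensory_response_enhancement(rt_a, rt_v, rt_av):
    """MRE in ms for one participant (assumed definition):
    fastest unisensory median RT minus the audiovisual median RT."""
    fastest_unisensory = min(np.median(rt_a), np.median(rt_v))
    return float(fastest_unisensory - np.median(rt_av))

# Made-up response times in ms, for illustration only.
rt_a = [320, 310, 335, 300, 325]   # auditory targets
rt_v = [285, 290, 275, 280, 295]   # visual targets
rt_av = [250, 245, 255, 240, 260]  # audiovisual targets

mre = multisensory_response_enhancement(rt_a, rt_v, rt_av)
print(f"MRE = {mre:.1f} ms")
```

A positive value means that the audiovisual condition was faster than the fastest unisensory condition. On its own, it does not distinguish statistical facilitation from integration.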

Race model inequality violation

Responses to multisensory stimuli can become faster than responses to unisensory stimuli due to statistical facilitation (i.e. independent processing of sensory input). That is, the probability of fast responses simply increases when a participant can respond to sound or light when they are presented together and the observed unisensory response time distributions overlap. To investigate whether the observed MRE could be explained by statistical facilitation, the cumulative response time distribution in the multisensory condition (the blue line in Fig. 2C) was compared to the sum of the unisensory cumulative RT distributions (the race model, the black line in Fig. 2C, see “Methods” section for more information). If responses in the multisensory condition are faster than the upper limit of the race model, the race model inequality (RMI, see “Methods” section) is violated. This means that multisensory response enhancement cannot be explained by independent processing of sensory input, which is indicative of MSI (though see^11,12,29 for considerations). Bayesian one-sided one-sample t-tests indicated there was strong evidence for the amount of RMI violation (i.e. the grey violation area in Fig. 2C) being larger than zero in both the Dark and Bright condition (see Fig. 2E). RMI violation was observed both for dark (M = 11.25 ms, SD = 6.384, BF₊₀ = 725, median δ = 1.565) and bright AV targets (M = 8.25 ms, SD = 5.512, BF₊₀ = 223, median δ = 1.316). There was no evidence for or against a difference in violation area between the Dark and Bright condition (BF₁₀ = 0.651, median δ = 0.337).
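The race model comparison described above can be sketched as follows. This is a minimal illustration, not the procedure from the “Methods” section: it assumes empirical cumulative distributions evaluated on a regular time grid, takes the race model bound as min(1, F_A(t) + F_V(t)), and approximates the violation area as the summed positive difference between the audiovisual distribution and that bound. The function names, the grid and the RT values are hypothetical.

```python
import numpy as np

def ecdf(rts, t_grid):
    """Empirical cumulative RT distribution evaluated at each time in t_grid."""
    rts = np.sort(np.asarray(rts, dtype=float))
    return np.searchsorted(rts, t_grid, side="right") / rts.size

def race_model_violation(rt_a, rt_v, rt_av, t_grid):
    """Return the race model bound, the AV distribution and the violation area.

    The bound is the sum of the unisensory CDFs, capped at 1. The area is
    the part of the AV CDF that lies above the bound, summed over the grid
    (a Riemann sum, so its unit is probability x ms).
    """
    bound = np.minimum(1.0, ecdf(rt_a, t_grid) + ecdf(rt_v, t_grid))
    f_av = ecdf(rt_av, t_grid)
    excess = np.clip(f_av - bound, 0.0, None)  # keep only violations
    step = t_grid[1] - t_grid[0]               # assumes a regular grid
    return bound, f_av, float(excess.sum() * step)

# Made-up response times in ms, for illustration only.
t_grid = np.arange(200.0, 401.0, 1.0)
rt_a = [300, 320, 340, 360]
rt_v = [280, 300, 320, 340]
rt_av = [240, 250, 260, 270]

bound, f_av, area = race_model_violation(rt_a, rt_v, rt_av, t_grid)
print("violation area:", area)
```

An area of zero means the audiovisual responses never exceeded the race model bound, so statistical facilitation alone could account for the speed-up. A positive area is the kind of violation tested against zero in the text.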

Overall, these results are indicative of MSI in the Bright and Dark target condition as the observed multisensory response enhancement cannot simply be explained by independent processing of sensory input.

Pupillometry data

To investigate the nature of the multisensory pupil response, the pupil response to AV targets was compared to the sum, and in the Response block also to the corrected sum, of the unisensory pupil responses (see Fig. 3). The corrected sum is the sum of the pupil responses to auditory (A) and visual (V) stimuli, from which the pupil response related to a button press was subtracted. This was done to compare the AV and summed pupil responses more fairly, as the uncorrected sum contains the button-press component twice whereas the AV response contains it only once. The amount of subtraction (i.e. correction) was calculated by subtracting the pupil response to auditory stimuli in the no response block from that in the response block (see “Methods” section and Supplementary Fig. 1).

Bayesian paired samples t-tests were conducted to test for differences in the area under the curve (AUC) of the pupil response from 500 to 1200 ms after stimulus onset between the AV condition, the sum (A + V), and the corrected sum (see Fig. 3).
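The corrected sum and the AUC window described above can be illustrated with synthetic traces. This is a minimal sketch, not the authors' analysis pipeline: the traces are flat, invented signals in arbitrary units, built so that the AV response is exactly additive and contains a single button-press component. The trapezoidal integration and all names are assumptions made for illustration.

```python
import numpy as np

def auc(trace, times_ms, start_ms=500, end_ms=1200):
    """Area under the pupil trace between start_ms and end_ms (trapezoid rule)."""
    mask = (times_ms >= start_ms) & (times_ms <= end_ms)
    y, x = trace[mask], times_ms[mask]
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

def corrected_sum(a_resp, v_resp, a_noresp):
    """A + V - (A_Response - A_NoResponse): remove one button-press component."""
    motor = a_resp - a_noresp
    return a_resp + v_resp - motor

# Synthetic traces (arbitrary units), for illustration only.
times = np.arange(0.0, 2001.0, 10.0)
a_noresp = np.full(times.shape, 0.1)    # auditory-evoked dilation
v_noresp = np.full(times.shape, -0.2)   # visually evoked constriction
motor = np.full(times.shape, 0.3)       # button-press-related dilation

a_resp = a_noresp + motor
v_resp = v_noresp + motor
av_resp = a_noresp + v_noresp + motor   # additive AV response, one button press

plain = auc(a_resp + v_resp, times)
corrected = auc(corrected_sum(a_resp, v_resp, a_noresp), times)
observed = auc(av_resp, times)
print(plain, corrected, observed)
```

In this constructed case the uncorrected sum overshoots the AV response because it counts the button-press component twice, while the corrected sum matches it. This is the pattern one would expect from an additive pupil response in the response block.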

In the Response block, the pupil response to AV stimuli showed subadditivity when compared to the sum of A and V both for dark targets (BF₁₀ = 7688, median δ = − 2.317) and bright targets (BF₁₀ = 88, median δ = − 1.268). This could be interpreted as a deviation from additivity (i.e. sub- or super-additivity, see Fig. 3A,B). However, when the pupil response to AV targets was compared to the corrected sum (A + V – (A_Response – A_{No response})), there was no difference between the AV and the corrected sum of the unisensory pupil responses in both the dark (BF₀₁ = 3.092, median δ = − 0.122) and bright condition (BF₀₁ = 3.377, median δ = − 0.061, compare the purple dashed line and the blue line in Fig. 3A,B, and see the AV versus cSum comparisons in Fig. 3C). This demonstrates additivity of the AV pupil response for both dark and bright AV targets in the response block. Similarly, the results from the no response block indicate that the pupil response to AV targets was more likely to be similar to than different from the (in this case uncorrected) sum of the unisensory pupil responses for both dark (BF₀₁ = 2.971, median δ = − 0.141) and bright targets (BF₀₁ = 3.463, median δ = − 0.024, compare the black dashed line and the blue line in Fig. 3D,E, and see the AV versus Sum comparisons in Fig. 3F; see Supplementary Figures S1–S3, for the same figures but corrected for the motor component in the visual rather than auditory pupil response; see Supplementary Figure S4 and S5 for the sequential analyses).

Discussion Experiment 1

In this first Experiment, the response time analysis in the response block suggests that MSI had occurred with these stimuli as shown by significant RMI violation. We also observed that the multisensory pupil response was different from the unisensory pupil responses but always equal to the (corrected) sum of the unisensory pupil responses. This indicates that the multisensory pupil response is additive in nature, both when targets had to be responded to and when passively viewed. This is true both for pupil dilation and constriction responses. This means that although the response time behaviour seems to be driven by MSI, the multisensory pupil response can be explained by summation of the independent effects of unisensory signals. Based upon these results one could argue that, given that the multisensory pupil response cannot be distinguished from the sum of the unisensory responses, the multisensory pupil response may not be a good candidate for measuring MSI in populations for which RT data collection is not feasible. However, although the simple detection task used in Experiment 1 is a classic paradigm for measuring MSI, it is different from the paradigm used in a recent study that did show super-additivity of the multisensory pupil response²³. Thus, to further investigate the nature of the multisensory pupil response, we used a change-detection paradigm in Experiment 2 that was similar to that of Rigato et al.²³. Yet, in their study, pupil responses and response times were measured in different blocks and never together in the same block, while in Experiment 2, we again used a response and a no response block and measured pupil changes and response times simultaneously in the response block.

Experiment 2

In this second experiment, twelve participants were tested in a response and a no response block while their pupil size was recorded (see Fig. 1, right panel). In the response block, they were instructed to respond as fast as possible to a change in the shape of a visual stimulus or the onset of an auditory stimulus, and to withhold their response when neither a visual change nor a sound was presented. In the no response block they were instructed to passively view and listen to the stimuli.

Hits

The proportion of hits was very high in all conditions (range across all conditions = 0.875–1). The average proportion of correctly withheld responses during catch trials was 1 (SD = 0).

Response time data

Response times

The average of the median RTs in each condition is shown in Fig. 4A. A Bayesian repeated measures ANOVA for RTs indicated there was very strong evidence for a main effect of Target Modality (BF₁₀ = 4.107E+6) compared to a null model assuming no effect. Bayesian post-hoc tests corrected for multiple testing indicated that responses in the AV condition (M = 301 ms, SD = 32) were faster than in the V (M = 389 ms, SD = 49, BF₁₀ = 28,558) and A condition (M = 352 ms, SD = 47, BF₁₀ = 1202). Responses in the A condition were faster than responses in the V condition (BF₁₀ = 25).

Multisensory response enhancement

There was strong evidence that the average amount of MRE was significantly different from zero (M = 57 ms, SD = 20, BF₁₀ = 17,035, median δ = 2.546, see Fig. 4B).

Race model inequality violation

To test whether the observed MRE could be explained by statistical facilitation, the amount of RMI violation was compared to zero using a one-sided Bayesian one-sample t-test. As expected, there was strong evidence of violation of the RMI (M violation = 25 ms, SD = 17, BF₊₀ = 200, median δ = 1.294), indicative of an interaction between the senses (see Fig. 4C).

Pupillometry data

Bayesian paired-samples t-tests on the AUC between 1255 and 1948 ms of the pupil trace post stimulus onset (the significant time window in²³) indicated that there was strong evidence that the audiovisual pupil response in the response block was smaller than the sum of the unisensory pupil responses (BF₁₀ = 15, median δ = − 0.927, compare the black dashed line and the blue line in Fig. 5A). The pupil response in the AV condition was more likely to be similar to the corrected sum of the unisensory responses than different (BF₀₁ = 2.836, median δ = − 0.161, compare the purple dashed line and the blue line in Fig. 5A and see Fig. 5B). In the no response block, the audiovisual pupil response was more likely to be similar to, rather than different from, the sum of the unisensory responses (Bayesian Wilcoxon Signed-Rank test due to violation of normality, BF₀₁ = 1.64, median δ = − 0.260, compare the black dashed line and the blue line in Fig. 5C and see Fig. 5D; see Supplementary Figs. S6 and S7, for the same figures but corrected for the motor component in the visual rather than auditory pupil response; see Supplementary Fig. S8 for the sequential analyses).

Overall, the results from Experiment 2 also indicate that the multisensory pupil response was additive in nature.

General discussion

The integration of sensory information from multiple senses is known to lead to a plethora of behavioural benefits such as faster detection and more accurate perception of stimuli^3,32,33. In addition to changes in overt behaviour in response to multisensory stimuli, many studies have shown differences in the (neuro)physiological responses to multisensory and unisensory stimulation (e.g.³⁴). Given the important role of the superior colliculus in multisensory integration, spatial orienting, and pupil dilation^{13,14,15,16,18}, we set out to investigate the nature of the multisensory pupil response. In previous studies, different effects of multisensory stimulation on the pupil response have been observed. Whereas in some studies it was observed that the multisensory pupil response is equal to the sum of the unisensory pupil responses²⁵, in another study super-additive multisensory pupil responses were observed²³. There were several differences between these studies that may have contributed to the contrasting findings. Here, we not only used two paradigms that were previously used in these studies, but also visual stimuli that evoked pupil constriction or dilation to further investigate the nature of the multisensory pupil response. More specifically, we investigated whether the multisensory pupil response was similar to or different from the sum of the unisensory pupil responses. Our findings show that the multisensory pupil response is different from either of the unisensory pupil responses, but equal to the linear sum of the unisensory pupil responses in all experiments and conditions when supra-threshold stimuli are used.

Although multisensory neurons in the superior colliculus may potentially have contributed to the observed multisensory pupil response, Occam’s razor would lead us to conclude that the observed multisensory pupil responses reflect the sum of independent sensory processes. Unfortunately, the link between multisensory neuronal activity and multisensory behaviour is not straightforward, which complicates this argument. Multisensory neurons vary in their response properties, and the response of a single multisensory neuron can vary from sub-additive, to additive, and super-additive depending on the stimulus effectiveness³⁵. It is unclear how the activity of all multisensory neurons in the SC together drives multisensory behaviour. If the average response of multisensory neurons in the SC is additive this might still lead to multisensory facilitation of motor responses and to an additive multisensory pupil response. However, from a behavioural point of view, we can only check whether behaviour can be described by the linear sum of unisensory pupil responses or not. If the multisensory pupil response is indeed the same as the sum of the unisensory pupil responses, then this most likely reflects independent effects of the senses. Knowing whether the multisensory pupil response is driven by MSI is not trivial: For behavioural measurements, only a deviation from additivity allows a conclusive argument that multisensory integration has occurred (and thus whether the pupil response can be used as a measure of MSI). We did find support for MSI in terms of the speed of responding to the same multisensory stimuli that evoked the pupil responses as evidenced by violations of the race model inequality. This was true both for audiovisual targets containing a bright and a dark visual stimulus relative to the stimulus background.
Our observation that the multisensory pupil response to supra-threshold stimuli is additive in nature is in line with previous observations in monkeys^25,26, but in contrast with Rigato et al.²³.

The difference between our results and Rigato et al.’s²³, in which a super-additive multisensory pupil response was observed, could be explained by the paradigm that was used (this only applies to our Experiment 1), the intensity of the stimuli (high vs. low; the exact intensities were not reported by Rigato et al. though), and the type of sounds (white noise vs. pure tones). The most likely cause of the different results is the intensity of the stimuli used. It has been shown that the relative increase in a multisensory neuron’s response to multisensory input, as compared to the same neuron’s response to unisensory stimulation, is greater the weaker those unisensory responses are (see e.g., Fig. S8 in³⁵). Using near-threshold stimuli may result in super-additive multisensory pupil responses. It might be important to use near-threshold stimuli for all stimuli involved (e.g. sound and light) as Wang et al.²⁶ used relatively low intensity auditory stimuli and did not find super-additive multisensory pupil responses (see²⁶, Fig. 3A; though additivity was not formally assessed). Using “intermediate” visual contrasts, Wang et al.²⁵ also did not observe super-additivity. How stimulus contrast (and thus effectiveness, and stimulus processing time) influences the nature of the multisensory pupil response requires a more thorough investigation using multiple stimulus intensities. Additionally, given the importance of spatial and temporal proximity of audiovisual stimuli for MSI, manipulating spatial and temporal alignment (i.e., a difference in auditory and visual stimulus location and timing) could be helpful in furthering our understanding of the processes that drive the multisensory pupil response (see for example²⁵).

An alternative explanation for the absence of super-additivity of the multisensory pupil response to supra-threshold stimuli in the current study could be the presence of a ceiling effect. The pupil response to a unisensory stimulus could already be close to or at the maximum dilation, not allowing a stronger dilation for multisensory stimuli. In that case, if the multisensory pupil response is super-additive in nature, observing a multisensory pupil response that is smaller than the sum of the unisensory responses could lead to the incorrect conclusion that the multisensory pupil response is sub-additive or additive in nature. However, we are not concerned about such a ceiling effect in our study for two reasons: (1) We observed additivity both in the response blocks (which contained larger pupil dilations than the no response blocks), and in the no response blocks (which contained much smaller pupil responses). Even if a ceiling effect were a problem in the response block, it certainly was not in the no response block in Experiment 1; (2) We observed additivity both when audiovisual targets contained a visual component stimulus that evoked pupil dilation and when the visual component stimulus evoked pupil constriction. When the auditory stimulus triggers pupil dilation and the visual stimulus pupil constriction, the multisensory pupil response does not have to exceed the visual or auditory evoked unisensory pupil response to be non-linear in nature. Therefore, we do not think that the observed additivity is due to a ceiling or floor effect.

The observed multisensory response enhancement and race model inequality violation in Experiments 1 and 2 are in line with previous studies of multisensory integration using response times as a dependent variable^{10,36,37,38,39,40}. Although multisensory response enhancement can be the result of various processes (response preparation; cross-modal spatial attention; multisensory integration; switch costs, see^{12,37,41,42,43,44}), the paradigm used in the current study makes it very likely that the observed RT effects are due to an interaction between the senses. Whether or not the multisensory response enhancement effects observed in the lab reflect sensory processing in more complex situations in daily life remains to be seen. For example, Corneil et al.⁴⁵ investigated the influence of scene complexity on multisensory response enhancement of saccades and showed that the spatial properties of saccades are mainly driven by visual input and the temporal properties of the saccades by auditory input. These results may suggest that multisensory response enhancement is not present in all circumstances. With regard to manual responses, however, it has been shown that race model inequality violation is quite robust and not affected by stimulus and environment complexity in virtual reality⁴⁶. It could be interesting to see how the multisensory pupil response is affected by scene complexity and whether this would change the nature of the multisensory pupil response.

In sum, the current study sheds more light on the nature of the multisensory pupil response. Based on the current findings from two experiments using two different paradigms, we conclude that the multisensory pupil response is additive in nature for supra-threshold stimuli. The multisensory pupil response, although clearly different from the unisensory responses, is equal to the sum of the unisensory pupil responses in two different paradigms with response time analyses indicating an interaction between the senses. Pupil responses have been related to many phenomena ranging from changes in low-level stimulus properties such as brightness to more cognitive influences such as visual attention. Given that the multisensory pupil response can most easily be explained by linear summation of independent sensory input, while the reaction time measures are indicative of multisensory integration, it appears unwise to use pupillometry as a measure for MSI in populations (such as infants) that are unable to give a behavioural response. The intermediate layers of the Superior Colliculus are known to be involved in MSI^16,47 and have been linked to transient changes in pupil dilation³⁰, making this brainstem area a likely candidate for the origin of these effects. Regardless of whether or not the multisensory pupil response is the result of multisensory integration in the superior colliculus, the finding that the pupil seems to reflect a summation of independent processes²¹ highlights pupillometry as an ideal tool to investigate the independent contributions of different sensory processes to the pupil’s response.