Intra-auditory integration between pitch and loudness in humans: Evidence of super-optimal integration at moderate uncertainty in auditory signals

When a person plays a musical instrument, sound is produced and the integrated frequency and intensity produced are perceived aurally. The central nervous system (CNS) receives defective afferent signals from auditory systems and delivers imperfect efferent signals to the motor system due to the noise in both systems. However, little is known about the auditory-motor interactions that underlie successful performance. Here, we investigated auditory-motor interactions as a multi-sensory input and multi-motor output system. Subjects performed a constant force production task using four fingers in three different auditory feedback conditions, where either the frequency (F), intensity (I), or both frequency and intensity (FI) of an auditory tone changed with the sum of finger forces. Four levels of uncertainty (high, moderate-high, moderate-low, and low) were conditioned by manipulating the feedback gain of the produced force. We observed performance enhancement under the FI condition compared to either F or I alone at moderate-high uncertainty. Interestingly, the performance enhancement was greater than the prediction of the Bayesian model, suggesting super-optimality. We also observed deteriorated synergistic multi-finger interactions as the level of uncertainty increased, suggesting that the CNS responded to increased uncertainty by changing the control strategy of multi-finger actions.

been suggested that the CNS is capable of integrating different sources of sensory information to improve overall perception and improve motor outcomes 4,14,15 or decisions 16 . This is known as optimal integration or Bayesian integration, and the phenomenon has been observed in the integration of multiple sources not only between different sensory systems (i.e. inter-sensory integration) [17][18][19] , but also between different physical properties within the same sensory system (i.e. intra-sensory integration) 14,[20][21][22][23][24] .
In our previous work, we reported that the CNS could optimally integrate feedback on two physical properties of sound (i.e. frequency and intensity), consistent with the Bayesian model 4 . This intra-auditory integration seems to influence the multi-finger actions in a hierarchical manner. Previous studies have suggested that multi-finger actions are controlled in a hierarchical manner with two levels: individual finger (IF) actions at the lower level and virtual finger (VF) actions at the higher level 4,10,[25][26][27] . VF here is an imagined finger producing the same net mechanical effect as all fingers together. In our previous work, we demonstrated that enhancement of VF control (i.e. motor performance of the VF) could be achieved by improving synergistic IF actions (i.e. motor synergy) through integration of frequency and intensity of sound following the Bayesian model 4 . However, it remains unknown whether the Bayesian model can predict intra-auditory integration for different levels of uncertainty in auditory feedback. In addition to this, it is also unknown how the uncertainty affects multi-finger actions in a hierarchical organization.
Therefore, the current study investigated how the CNS deals with uncertainty manifested by the auditory feedback gains during constant multi-finger force production. Subjects were asked to produce a constant force using four fingers in three different auditory feedback conditions, where either the frequency (F), intensity (I), or frequency and intensity (FI) of an auditory tone changed depending on the deviation of the VF force from a target reference force. We hypothesized that 1) performance would be enhanced in the FI condition compared to the F condition or I condition alone for all uncertainty levels, following the Bayesian model, and 2) synergistic multi-finger action would deteriorate as the level of uncertainty increases, leading to reduced performance consistent with the findings from our previous work 4 .

Methods
Participants. Ten healthy right-handed male volunteers (mean age 24.5 years ± 1 year) participated in the study. The sample size was determined by a power analysis conducted in G*Power with an alpha of 0.05, power of 0.95, an effect size of 0.4 and 11 degrees of freedom 28 . Participants were free of neurological disorders, psychiatric disorders, speech-language disorders, hearing impairments, and motor impairments. In order to avoid the potential confounding effect of musical training on auditory-motor integration, participants who had musical training within the past 5 years were excluded. Participants provided written informed consent. All procedures were approved by the University of Maryland College Park Institutional Review Board. Experiments were carried out in accordance with approved guidelines.
Experimental setup. Four finger pressing forces were collected using load cells (ATI Nano 17, ATI Industrial Automation, Apex, NC, US) at a sampling frequency of 1,000 Hz with data acquisition hardware (6024E, National Instruments Corporation, Austin, TX, US) using a custom program written with LabVIEW (LabVIEW 8.2, National Instruments Corporation, Austin, TX, US). This program interfaced with a function generator (Agilent 33522 A, Keysight Technologies, Inc., Santa Rosa, CA, US) to register the IF forces and calculate the VF force as the sum of IF forces. The program also generated auditory signals played through the left and right ears of headphones worn by the subjects (AE2, Bose Corporation, Framingham, MA, US).
In order to minimize distortion of sound due to headphone frequency response characteristics 29 , the auditory signal was calibrated to produce a constant intensity across all frequencies. Calibration was performed in a soundproof room by manipulating frequency from 20 to 20,000 Hz in 1 Hz increments and normalizing intensity at each increment 4 .
Task procedures. Subjects sat on a chair, wore the headphones, and placed the tips of their right-hand fingers (index, middle, ring, and little) on the load cells (Fig. 1). The subjects were asked to use these fingers to produce a constant VF force of 20 N (~20% of a typical healthy participant's maximum voluntary force 4,30 ) over 20 s while they received auditory feedback tones of the reference force through the left ear and the VF force through the right ear. The tone for the reference force (i.e. the reference tone) had a frequency of 1000 Hz and intensity of 70 dB 4 . The tone for the VF force (i.e. the tracking tone) played through the right ear varied in three different experimental conditions: 1) Frequency condition (F): the frequency of the tracking tone was modulated with deviation of the subject's VF force from 20 N, while the intensity of the tracking tone was kept constant at 70 dB 31 . 2) Intensity condition (I): the intensity of the tracking tone was modulated with deviation of the subject's VF force from 20 N, while the frequency of the tracking tone was kept constant at 1000 Hz 32 . 3) Frequency and Intensity condition (FI): both frequency and intensity of the tracking tone were modulated with the subject's VF force.
In order to present different levels of uncertainty in the auditory feedback of the VF force, we manipulated the auditory feedback gain for each of the feedback conditions (F, I, and FI). For the baseline condition, the gains for the frequency and intensity conditions were set to 7 Hz/N and 0.7 dB/N, respectively, according to previous studies on Just Noticeable Differences 4,31,32 . Four gains were used for frequency modulation (300, 86, 24, and 7 Hz/N) and four gains for the intensity (7.5, 3, 1. From each 20-s trial, the 9-s window from 6 to 15 s, typically capturing the steadiest VF force, was extracted for analysis to avoid the initial force stabilization at the beginning and the premature cessation of force production at the end of each trial 8 . The order of conditions was balanced across subjects.
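The feedback mapping and the analysis window described above can be sketched as follows. This is a minimal illustration and not the authors' LabVIEW program: it assumes a linear mapping around the reference tone (1000 Hz, 70 dB), assumes that a force above 20 N raises the frequency or intensity, and uses hypothetical function names (`tracking_tone`, `analysis_window`). The default gains are the baseline values stated in the text.

```python
import numpy as np

F_REF_HZ = 1000.0   # reference tone frequency (from the text)
I_REF_DB = 70.0     # reference tone intensity (from the text)
TARGET_N = 20.0     # target VF force (from the text)
FS_HZ = 1000        # sampling frequency (from the text)


def tracking_tone(vf_force, condition, gain_hz_per_n=7.0, gain_db_per_n=0.7):
    """Map VF force to tracking-tone frequency and intensity.

    Assumption: linear modulation with the deviation from the target,
    positive deviation increasing the modulated quantity.
    """
    if condition not in ("F", "I", "FI"):
        raise ValueError("condition must be 'F', 'I', or 'FI'")
    dev = np.asarray(vf_force, dtype=float) - TARGET_N
    freq = np.full_like(dev, F_REF_HZ)
    inten = np.full_like(dev, I_REF_DB)
    if condition in ("F", "FI"):
        freq = F_REF_HZ + gain_hz_per_n * dev
    if condition in ("I", "FI"):
        inten = I_REF_DB + gain_db_per_n * dev
    return freq, inten


def analysis_window(trial, start_s=6.0, stop_s=15.0, fs=FS_HZ):
    """Return the 6-15 s segment of a 20-s trial sampled at fs."""
    trial = np.asarray(trial)
    return trial[int(start_s * fs):int(stop_s * fs)]
```

With the highest frequency gain listed in the text (300 Hz/N), a 0.1 N deviation would shift the tone by about 30 Hz under this assumed mapping, whereas the baseline gain (7 Hz/N) would shift it by less than 1 Hz, which is how a lower gain makes the feedback more uncertain.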
Bayesian model. The Bayesian model has been successful in interpreting mechanisms of multi-sensory integration both within a sensory system 4,[17][18][19] and between different sensory systems 14,[20][21][22][23][24] . This model is useful for investigating performance enhancement in a task where each source of sensory information reports the same physical state, because the model predicts the size of the enhancement. Using the framework of the Bayesian model, the bimodal estimate, $\hat{S}_{FI}$, of a finger force from FI can be expressed as a weighted sum of the uni-modal estimates from F and I, $\hat{S}_F$ and $\hat{S}_I$, respectively:

$$\hat{S}_{FI} = w_F \hat{S}_F + w_I \hat{S}_I, \qquad w_F = \frac{\sigma_I^2}{\sigma_F^2 + \sigma_I^2}, \qquad w_I = \frac{\sigma_F^2}{\sigma_F^2 + \sigma_I^2}$$

If the estimates are considered Gaussian random variables with mean $\mu$ and variance $\sigma^2$, the optimal estimate is more precise (lower variance) than the uni-modal estimates as follows:

$$\sigma_{FI}^2 = \frac{\sigma_F^2 \sigma_I^2}{\sigma_F^2 + \sigma_I^2}$$

The variance of the combined estimate ($\sigma_{FI}^2$) is lower than the variance from F ($\sigma_F^2$) as well as the variance from I ($\sigma_I^2$). The bias of the combined estimate, $b_{FI}$ ($b_{FI} = \mu_{FI} - f_T$, where $f_T$ is the reference force (20 N here)), is expressed by a weighted average of the F bias ($b_F$) and the I bias ($b_I$):

$$b_{FI} = w_F b_F + w_I b_I$$

To test whether auditory modalities are optimally integrated according to the Bayesian model, we quantified motor performance in the form of the overall mean-squared error (OMSE), the averaged squared deviation of the VF force from the reference force:

$$\mathrm{OMSE} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\tau} \int_0^{\tau} \left( y_i(t) - f_T \right)^2 dt$$

where $N$ is the number of trials, $y_i(t)$ is the VF force at trial $i$, and $\tau$ is the duration of $y_i(t)$. Then, we compared the experimentally obtained OMSE to the OMSE predicted by the Bayesian model, which is divided into variable error ($\sigma_{FI}^2$) and systematic error ($b_{FI}^2$) as follows:

$$\mathrm{OMSE}_{FI} = \sigma_{FI}^2 + b_{FI}^2$$

Hierarchical variability decomposition model. In our previous work, we developed a hierarchical variability decomposition (HVD) model to quantify the hierarchical organization of multi-finger actions in terms of the VF and IF forces (Fig. 2). The VF force for trial $i$, $y_i(t)$, was modeled as the sum of three components 8 :

$$y_i(t) = X_i(t) + E_i + m$$

where $X_i(t)$ is the demeaned VF force for trial $i$, $m$ is the mean VF force after averaging over all timesteps of all 5 trials, and $E_i$ is the difference between the $i$-th trial mean VF force and $m$. In this model, OMSE, the index of motor performance, was partitioned into three error components as different performance variables 8 :
1) The "online intra-trial variable error (VE ON )", $\sigma_X^2$, calculated as the averaged variance of $X_i(t)$.
2) The "offline inter-trial variable error (VE OFF )", $\sigma_E^2$, calculated as the variance of $E_i$.
3) The "systematic error (SE) or bias", $b^2$, calculated as $(m - 20)^2$.
Note that the sum of online and offline variable errors (VE) is the variance of VF force ($\sigma^2 = \sigma_X^2 + \sigma_E^2$), and the systematic error is the squared bias of VF force ($b^2 = (m - 20)^2$). The online and offline variable errors can be further defined as the sum of IF variances plus between-finger covariances:

$$\sigma_X^2 = \underbrace{\sum_{j=1}^{n} \overline{\mathrm{Var}(x_j)}}_{\mathrm{Var_{ON}}} + \underbrace{\sum_{j \neq k} \overline{\mathrm{Cov}(x_j, x_k)}}_{\mathrm{Cov_{ON}}}$$

$$\sigma_E^2 = \underbrace{\sum_{j=1}^{n} \mathrm{Var}(e_j)}_{\mathrm{Var_{OFF}}} + \underbrace{\sum_{j \neq k} \mathrm{Cov}(e_j, e_k)}_{\mathrm{Cov_{OFF}}}$$

where $x_j$ is the demeaned IF force of the $j$-th finger, $e_j$ is the difference between the trial-mean IF force of the $j$-th finger and its mean over all time steps in 5 trials, $n$ is the number of task fingers ($n = 4$), and the overhead bars indicate means over trials. The sum of between-finger covariances (Cov ON or Cov OFF ) quantifies synergistic actions between finger forces to attenuate or amplify the VF force error 4,8 .
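The decomposition above can be illustrated with a short numerical sketch. It is not the authors' analysis code: the function name `hvd` and the array layout (trials × time samples × fingers) are assumptions, and population variances (dividing by the number of samples) are assumed so that the three error components sum exactly to OMSE.

```python
import numpy as np


def hvd(forces, target=20.0):
    """Hierarchical variability decomposition of multi-finger forces.

    forces: array of shape (n_trials, n_time, n_fingers) holding IF forces.
    Assumption: population (ddof=0) variances and covariances are used.
    """
    forces = np.asarray(forces, dtype=float)
    n_trials, n_time, _ = forces.shape

    # VF level: y_i(t) = X_i(t) + E_i + m
    y = forces.sum(axis=2)                    # VF force, (n_trials, n_time)
    m = y.mean()                              # grand mean VF force
    trial_mean = y.mean(axis=1)               # mean VF force per trial
    X = y - trial_mean[:, None]               # demeaned VF force per trial
    E = trial_mean - m                        # trial-mean deviation from m
    ve_on = np.mean(X ** 2)                   # averaged within-trial variance
    ve_off = np.mean(E ** 2)                  # variance of E_i across trials
    se = (m - target) ** 2                    # squared bias
    omse = np.mean((y - target) ** 2)

    # IF level: split each variable error into variances and covariances
    if_trial_mean = forces.mean(axis=1)               # (n_trials, n_fingers)
    x = forces - if_trial_mean[:, None, :]            # demeaned IF forces
    e = if_trial_mean - if_trial_mean.mean(axis=0)    # IF trial-mean deviations
    c_on = np.einsum("itj,itk->jk", x, x) / (n_trials * n_time)
    c_off = e.T @ e / n_trials
    var_on, var_off = np.trace(c_on), np.trace(c_off)
    cov_on, cov_off = c_on.sum() - var_on, c_off.sum() - var_off

    return {"OMSE": omse, "VE_ON": ve_on, "VE_OFF": ve_off, "SE": se,
            "Var_ON": var_on, "Cov_ON": cov_on,
            "Var_OFF": var_off, "Cov_OFF": cov_off}
```

Under these assumptions, `OMSE` equals `VE_ON + VE_OFF + SE`, `VE_ON` equals `Var_ON + Cov_ON`, and `VE_OFF` equals `Var_OFF + Cov_OFF` up to floating-point error. A negative `Cov_ON` or `Cov_OFF` means the finger forces compensate for one another and reduce the VF error.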
The indices of synergy quantified above are mathematically equivalent to the index of motor synergy calculated between effectors in the previous studies as the normalized variance difference between task-relevant space and task-irrelevant space initially introduced as the uncontrolled manifold (UCM) analysis 9,13,35 .
SCiENTifiC REPoRts | (2018) 8:13708 | DOI:10.1038/s41598-018-31792-w

Statistical analysis. All dependent variables were transformed to correct for a non-normal distribution using the log transformation for OMSE, VE, SE, VE ON , VE OFF , Var ON , and Var OFF and the log-modulus transformation 36 for Cov ON and Cov OFF , which allowed us to transform both positive and negative values as follows:

$$L(x) = \mathrm{sign}(x) \cdot \log\left( |x| + 1 \right)$$

A two-way repeated measures ANOVA with factors Feedback (3 levels: F, I, and FI) and Uncertainty (4 levels: L, ML, MH, and H) was used to test the differences between conditions. The level of statistical significance was set at p < 0.05. A post-hoc test with Bonferroni correction was performed where necessary. We used the Greenhouse-Geisser correction for violations of sphericity. A paired t-test with bootstrapping was performed to compare the experimentally obtained OMSE to the OMSE predicted by the Bayesian model.
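The log-modulus transformation is commonly defined as sign(x)·log(|x| + 1), which keeps the sign of each value and maps zero to zero. A minimal sketch follows; the function name `log_modulus` is hypothetical, and the natural logarithm is an assumption because the text does not state the base.

```python
import numpy as np


def log_modulus(x):
    """Sign-preserving log transform: sign(x) * log(|x| + 1).

    Assumption: natural logarithm; the source does not state the base.
    """
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log1p(np.abs(x))
```

Because the transform is odd and monotonic, negative covariances (synergy that attenuates the VF error) remain negative and keep their rank order after transformation.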

Results
Comparison with the Bayesian model. The Bayesian model predicted OMSE well at all levels of uncertainty. OMSE for FI did not differ from the OMSE estimated from the Bayesian model at any uncertainty level (L; p = 0.213, ML; p = 0.625, MH; p = 0.204, and H; p = 0.418) (Fig. 3a), and there were no significant differences in SE (L; p = 0.484, ML; p = 0.579, MH; p = 0.256, and H; p = 0.405) (Fig. 3c). However, VE for FI at MH uncertainty was significantly lower than the VE estimated by the Bayesian model (p = 0.037), while there was no significant difference at L (p = 0.212), ML (p = 0.921), or H uncertainty (p = 0.424) (Fig. 3b). This result indicates that the CNS can improve motor performance beyond the Bayesian prediction (i.e. super-optimality) when the uncertainty level is moderate.
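The predicted values used in this comparison can be sketched as follows, assuming the standard variance-weighted (maximum-likelihood) combination of two Gaussian estimates. The function name `bayes_prediction` is hypothetical, and the numbers in the usage example are illustrative only, not data from this study.

```python
def bayes_prediction(var_f, var_i, bias_f, bias_i):
    """Predict FI performance from the uni-modal F and I conditions.

    var_f, var_i: variable errors (variances) measured under F and I.
    bias_f, bias_i: biases (mean VF force minus the 20 N target).
    Assumption: inverse-variance weighting of two Gaussian estimates.
    """
    total = var_f + var_i
    w_f = var_i / total                 # the less variable cue gets more weight
    w_i = var_f / total
    var_fi = var_f * var_i / total      # predicted variable error (VE)
    bias_fi = w_f * bias_f + w_i * bias_i
    return {"w_F": w_f, "w_I": w_i,
            "VE": var_fi, "SE": bias_fi ** 2,
            "OMSE": var_fi + bias_fi ** 2}


# Illustrative values only (not taken from the study)
pred = bayes_prediction(var_f=0.04, var_i=0.09, bias_f=0.1, bias_i=-0.2)
```

In this scheme the predicted VE is always below the smaller of the two uni-modal variances, so an observed FI variable error that is significantly lower than `pred["VE"]` is what the text refers to as super-optimality.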

Effects of feedback on multi-finger actions in the hierarchical organization. Multi-finger actions were analyzed at the VF and IF levels using the HVD model. Overall performance, quantified as OMSE, was enhanced through intra-auditory integration. The enhancement arose mainly from a reduction of the variability in VF force. These results were supported by significant Feedback effects on OMSE (F 1.193,18 = 7.832; p = 0.015), VE (F 1.197,18 = 6.809; p = 0.021), and SE (F 2,18 = 3.596; p = 0.049). The pair-wise comparisons showed that both OMSE and VE from FI were significantly lower than those for both F and I conditions (OMSE: FI vs F; p = 0.001; FI vs I; p = 0.006, and VE: FI vs F; p = 0.001; FI vs I; p = 0.006), while SE from FI significantly differed from only that of the F condition (FI vs F; p = 0.003, and FI vs I; p = 0.607) (Fig. 4b and c). There were significant Feedback × Uncertainty interaction effects in OMSE (F 6,54 = 2.412; p = 0.039) and VE (F 6,54 = 2.375; p = 0.041).
At the VF level, performance enhancement through intra-auditory integration was observed for both online and offline controls. This indicates that the CNS combines frequency and intensity of an auditory signal in order to provide more consistent actions within a single trial as well as over multiple trials. This result was supported by a significant Feedback effect (VE ON : F 2,18 = 6.172; p = 0.009, and VE OFF : F 2,18 = 4.353; p = 0.029). The pair-wise comparisons showed that both VE ON and VE OFF from FI were significantly lower than those of either F or I alone (VE ON : FI vs. F; p = 0.002, and FI vs. I; p = 0.04, and VE OFF : FI vs. F; p = 0.009, and FI vs. I; p = 0.013) (Fig. 4d and e). There was no significant Feedback × Uncertainty interaction effect on either VE ON (F 6,54 = 0.866; p = 0.526) or VE OFF (F 6,54 = 1.886; p = 0.100).
At the IF level, intra-auditory integration positively affected synergistic actions only in offline control, but not in online control. This indicates that the CNS combines frequency and intensity of an auditory signal in order to provide more consistent actions over multiple trials, not within a single trial. These results were supported by a significant Feedback effect on Cov ON (F 2,18 = 5.158; p = 0.017), but no significant effect on Var ON (F 2,18 = 1.553; p = 0.239), Var OFF (F 2,18 = 1.070; p = 0.364), or Cov OFF (F 2,18 = 3.333; p = 0.059). There was a significant Feedback × Uncertainty interaction effect on Cov OFF (F 6,54 = 2.601; p = 0.027) but no significant interaction effect on Var ON (F 6,54 = 0.797; p = 0.576), Var OFF (F 6,54 = 0.739; p = 0.621), or Cov ON (F 6,54 = 0.902; p = 0.501). The pair-wise comparisons showed significant reduction of Cov OFF from FI compared to that of either F or I alone (FI vs. F; p = 0.048, and FI vs. I; p = 0.035) in the MH condition.
Effects of uncertainty on multi-finger actions in the hierarchical organization. As we expected, motor performance deteriorated (i.e. OMSE increased) as the uncertainty level increased, which was supported by the significant Uncertainty effect (F 3,27 = 64.074; p < 0.001). The pair-wise comparisons showed significant differences between uncertainty levels (L vs. MH: p < 0.001, ML vs. MH: p = 0.002, MH vs. H: p = 0.013) (Fig. 4a). Both VE and SE increased as uncertainty increased, which was supported by the significant Uncertainty effect (VE:  (Fig. 4b and c). In the HVD model, VE was further partitioned into online variable error (VE ON ) and offline variable error (VE OFF ) in the VF actions. Both VE ON and VE OFF increased as uncertainty increased, which was supported by the significant Uncertainty effect (VE ON :  (Fig. 3e) (Fig. 3f-i) The pair-wise comparisons showed

Discussion
The aim of this study was to investigate the role of auditory feedback uncertainty manifested by auditory feedback gains during a constant multi-finger force production task in three different sound feedback conditions (F, I, and FI). Using the HVD model, multi-finger actions were hierarchically analyzed at the VF and IF levels. First, we expected that intra-auditory integration would occur as evidenced by decreased variable error (VE), according to the Bayesian model. However, at MH uncertainty, the reduction in VE in the FI condition was greater than the Bayesian prediction. Second, we expected that synergistic actions would deteriorate as uncertainty increased. Indeed, we found that synergistic actions, indexed by Cov ON and Cov OFF , deteriorated as uncertainty increased, but there were no changes in total variability (i.e. Var ON and Var OFF ).
The role of auditory uncertainty in intra-auditory integration. The Bayesian model has been used to investigate how the brain integrates multiple sources of sensory information 14,[17][18][19][20][21][22][23][24] . These previous studies have suggested that the CNS combines multiple sensory modalities to enhance the state estimate and minimize variability in performance of goal-directed motor tasks to generate "optimal" outcomes. According to the Bayesian model, it might not be possible to produce better outcomes than what is predicted by the model (i.e. "super-optimality"). However, super-optimal inter-sensory integration has been observed in previous studies on humans and animals 37,38 . Our study also found that the performance improvement through intra-sensory integration was similar to or better than the statistically optimal performance predicted by the Bayesian model. The enhancement of motor performance in our study exceeded the Bayesian prediction when uncertainty was moderately high (Fig. 3b). Although this finding warrants further investigation, one can speculate that uncertainty plays a critical role in intra-sensory integration.
The inverse effectiveness rule has also been used to interpret the effects of uncertainty on integration of multiple sensory sources [39][40][41][42] . This rule supports the idea that multi-modal feedback is effectively integrated when the uni-modal responses are relatively weak 39 . Greater neuronal responses have been found for multi-modal (visual + auditory) stimuli than for uni-modal stimuli of a smaller intensity, suggesting that multi-modal integration is inversely related to the intensity of its uni-modal stimuli [40][41][42] . However, the super-optimality observed in our study deviates from the inverse effectiveness rule because intra-auditory integration was most effective at the intermediate level of uncertainty in our study.
Two physical quantities (frequency and intensity) are the most salient features of sound that contribute to its perception as pitch and loudness. According to auditory perception theories 43,44 , these two quantities can be independently perceived by the CNS. We perceive and identify different sound frequencies through the neural responses of hair cells at different locations along the basilar membrane 45 . On the other hand, the loudness of sound is perceived through changes in the firing rate in the auditory nerve (i.e. firing-rate theory) 45 . For example, when the sound is weak (low intensity), only a small region of the basilar membrane moves sufficiently to evoke spikes. For strong sound intensity, on the other hand, the membrane is displaced by a larger amount, evoking spikes even in neighboring nerve fibers. In our experimental design, we manipulated the rate of change in frequency and intensity with finger force to provide different levels of uncertainty in the task. Our main finding of super-optimality at the intermediate level of uncertainty implies neural responses in the auditory nerve that are better than predicted by Bayesian integration.
Online vs. offline controls. Online and offline motor behaviors reflect distinct CNS control mechanisms in redundant motor systems 4,8,46 since the former is controlled continuously, and the latter is controlled discretely. We noted enhancement of the repeatability (i.e. offline control) of VF actions through intra-sensory integration, but no enhancement of consistency (i.e. online control). In our previous study 4 , we showed that intra-auditory integration had a greater influence on offline control than online control, consistent with other previous studies 14,47 . These previous studies investigated the integration between different senses (e.g. visual and auditory, visual and tactile) and showed that subjects enhanced repeatability when using both senses during repetitive tasks such as estimating the position or size of a target. The results of the current study support the theory that the benefits of multi-sensory integration extend to intra-sensory integration as well as to state estimation and repetitive motor performance.
The role of auditory uncertainty in hierarchically organized multi-finger actions. Previous studies have shown that multi-finger actions are controlled in a hierarchical manner with at least two levels: individual finger actions at the lower level and virtual finger actions at the higher level 10,[25][26][27] . In the current study, we investigated the hierarchical organization of multi-finger actions using the HVD model that quantifies several aspects of motor performance at the VF level such as estimability (i.e. inverse of SE), consistency (i.e. inverse of VE ON ), and repeatability (i.e. inverse of VE OFF ) 4 . In a constant force production task, the estimability reflects the CNS's ability to estimate the target force and consistency reflects the CNS's ability to perform the task on a moment-to-moment basis (i.e. online control), while repeatability reflects the ability to repeat the same task goal on trial-to-trial basis (i.e. offline control). At the IF level, the consistency and repeatability at the VF level can be explained by the sum of variability in IF forces and the co-variability (i.e. motor synergy) among the IF forces. Variability (i.e. Var ON and Var OFF ) and co-variability (i.e. Cov ON and Cov OFF ) reflect the CNS's "work space" and "control strategy" to perform the task, respectively. Note that positive covariance indicates that the IF forces are co-varied to amplify the VF force and increase the performance error, while negative covariance attenuates the VF force and decreases performance error 8 . Thus, increasing and decreasing covariance reflects deterioration and enhancement of multi-finger synergy, respectively, in constant force production tasks by the VF.
As expected, we found that, at the VF level, estimability, consistency, and repeatability decreased (i.e. error variables increased) as auditory uncertainty increased. The result is consistent with previous studies showing that uncertainty in visual feedback leads to performance errors during a constant finger force control task 46,48 . However, interestingly, at the IF level, the current study found that the total variance in IF forces remained unchanged across all levels of uncertainty, while covariance between the IF forces increased from negative values as auditory uncertainty increased. According to the principle of non-individualized control 49 , multiple motor effectors (e.g. muscles, joints, or fingers) are not controlled individually, but are rather united as a task-specific organization. Indeed, in support of the principle, our results indicate that the CNS does not reduce the variability of individual finger forces, but rather changes synergistic patterns between finger forces to coordinate the IF actions in order to enhance performance of the VF actions (i.e. task-specific organization, commonly addressed as "synergies" in contemporary literature) 5,6 .
Limitations. There are some limitations in the current study. First, the feedback gain for the baseline condition was set according to previously reported "just noticeable differences" (JND), assuming a change in force of 1 N 4,31,32 . Setting the feedback gains according to each subject's own auditory sensitivity might have provided more accurate subject-specific conditions for the study. Second, although the frequency and intensity of sound might be its most evident physical features, these two quantities might not be independent of each other; previous studies on the anatomy of the auditory system suggest that one can influence the other by showing that the psychophysical transformation from frequency-intensity space to pitch-loudness space was not a homeomorphism 50,51 .
The accurate quantifications of both the subject-specific gains and the homeomorphism demand new methods achieved through careful experimental design and modeling.