Tinnitus impairs segregation of competing speech in normal-hearing listeners

Liu, Yang Wenyi; Wang, Bing; Chen, Bing; Galvin, John J.; Fu, Qian-Jie

doi:10.1038/s41598-020-76942-1

Article
Open access
Published: 16 November 2020

Scientific Reports volume 10, Article number: 19851 (2020)

Abstract

Many tinnitus patients report difficulties understanding speech in noise or competing talkers, despite having “normal” hearing in terms of audiometric thresholds. The interference caused by tinnitus is more likely central in origin. Release from informational masking (more central in origin) produced by competing speech may further illuminate central interference due to tinnitus. In the present study, masked speech understanding was measured in normal hearing listeners with or without tinnitus. Speech recognition thresholds were measured for target speech in the presence of multi-talker babble or competing speech. For competing speech, speech recognition thresholds were measured for different cue conditions (i.e., with and without target-masker sex differences and/or with and without spatial cues). The present data suggest that tinnitus negatively affected masked speech recognition even in individuals with no measurable hearing loss. Tinnitus severity appeared to especially limit listeners’ ability to segregate competing speech using talker sex differences. The data suggest that increased informational masking via lexical interference may tax tinnitus patients’ central auditory processing resources.

Introduction

Tinnitus can be described as perception of sound (typically noise or ringing) that is unrelated to an external stimulus. The disrupted neural activity within the auditory system has been argued to be a primary cause for tinnitus sensation^1,2. Tinnitus of cochlear origin may affect cochlear amplification mechanisms due to deformities in cochlear function³. While tinnitus is often associated with hearing loss, tinnitus may also be present in individuals who exhibit normal audiometric thresholds⁴. The perceptual consequences of tinnitus in basic psychophysical measures and speech perception have been well documented in tinnitus patients with normal hearing (NH)^5,6,7,8.

Tinnitus may affect bottom-up processing at the periphery, resulting in impaired basic auditory discrimination abilities^{7,8,9,10,11,12,13,14,15}. Compared to NH listeners without tinnitus, some previous studies show perceptual deficits in NH listeners with tinnitus for a variety of basic auditory discrimination measures (e.g., gap detection, duration discrimination, frequency discrimination, low-frequency amplitude modulation (AM) detection, intensity discrimination)^{7,8,9,10,11,12,13,14}. However, other studies have shown no differences between tinnitus and non-tinnitus listeners for other auditory measures (e.g., gap detection, intensity discrimination, high-frequency AM detection, spectral ripple discrimination, Schroeder-phase discrimination)^15,16,17,18.

Recognition of masked speech not only requires sufficient auditory resolution for bottom-up processing, but also involves top-down (linguistic and contextual) processes. Most previous studies have shown that, compared to NH listeners without tinnitus, NH listeners with tinnitus exhibit poorer speech understanding in noise, regardless of the heterogeneity of the tinnitus population or the complexity of listening tasks^{5,6,7,18,19,20,21,22}. However, the deficit in speech performance may depend on the severity of tinnitus⁷, the ear with tinnitus¹⁸, and task difficulty^5,22. To varying degrees, steady noise, modulated maskers, and competing speech produce energetic, envelope, and/or informational masking^{23,24,25,26,27}. Steady noise largely produces energetic masking at the periphery (i.e., spectral overlap between the target and masker). Modulated noise and multi-talker babble may produce energetic as well as envelope masking in cochlear regions remote from the target, and may not depend on the degree of spectral overlap between the target and the masker²⁸. Competing speech may produce energetic, envelope, and informational masking (due to lexical interference, talker characteristics, etc.). Note that informational masking may occur even when there is no energetic masking, as with dichotic presentation of target and masker speech.

Segregation of competing speech may require more attention²⁹ as listeners may confuse competing speech signals with the signal of interest. If top-down processes are affected by tinnitus, the ability to use different cues to segregate the target speech from intelligible competing speech would be expected to be poorer in listeners with than without tinnitus. Faraji et al.³⁰ compared co-modulation release from masking between individuals with or without chronic tinnitus. While there was no significant difference in thresholds with the unmodulated masker between the tinnitus and non-tinnitus group, thresholds with the co-modulated masker were significantly higher for the tinnitus group, and co-modulation release from masking was significantly poorer for the tinnitus group. These results suggest that, compared to NH listeners without tinnitus, NH listeners with tinnitus may experience greater envelope interference and informational masking beyond the periphery.

With competing speech, the amount of informational masking has been shown to increase with the number of competing talkers, up to a certain threshold^25,31. Previous studies have shown that informational masking may occur with two competing talkers^32,33,34,35. For multi-talker babble, masking is reduced as the number of talkers increases beyond 2^33,34,35. Previous speech perception studies in tinnitus listeners have involved steady noise¹⁷, multi-talker babble^7,22,36 and 1 competing talker¹⁷. Kidd et al.³⁷ found large masking release (MR) due to differences in talker sex cues and/or spatial cues, which is primarily driven by a reduction in informational masking, especially with 2-talker competing maskers. However, it is unclear whether tinnitus affects the use of these segregation cues (talker sex and/or spatial cues) to obtain MR. The effects of tinnitus may also differ between maskers that produce primarily energetic or envelope masking (e.g., steady noise or multi-talker speech babble) and maskers that produce primarily informational masking (e.g., 2-talker competing speech).

In the present study, speech recognition thresholds (SRTs) were measured in NH listeners with and without tinnitus. The “tinnitus” group was comprised of individuals with normal audiometric thresholds (< 20 dB HL) and tinnitus, and the “non-tinnitus” group was comprised of individuals with normal audiometric thresholds and no tinnitus. SRTs were measured in multi-talker babble (energetic and envelope masking) and in competing speech (largely informational masking). For SRTs in competing speech, the target and masker sex were the same or different, and target and maskers were co-located or spatially separated. Similar to previous studies, SRTs were expected to be poorer in the tinnitus group than in the non-tinnitus group. Due to central processing deficits associated with tinnitus, the tinnitus group was expected to exhibit less masking release with talker sex and/or spatial segregation cues. In the tinnitus group, tinnitus severity was measured using a visual analog scale³⁸ (VAS) and the Tinnitus Handicap Inventory³⁹ (THI); linear regression analyses were performed between tinnitus severity and SRTs.

Results

Figure 1 shows tinnitus VAS scores as a function of THI scores for the tinnitus group. For both measures, there was wide variability in tinnitus severity, ranging from extremely mild to severe, with most participants exhibiting mild-to-moderate severity. Across all tinnitus participants, mean VAS scores were 3.7 ± 2.0 (range 1–7) and mean THI scores were 36.8 ± 19.6 (range 4–76). A Student’s t-test showed no significant difference in tinnitus severity between participants who reported tinnitus in one ear (open circles; mean VAS = 3.7 ± 2.0; mean THI = 34.0 ± 18.9) or in both ears (filled circles; mean VAS = 4.0 ± 2.2; mean THI = 41.0 ± 24.1). Note that the lack of difference might also be due to the small number of participants. Linear regression analysis showed that VAS and THI scores were highly correlated (r² = 0.933; p = 0.000006). Self-reported duration of tinnitus was also significantly correlated with VAS (r² = 0.714; p = 0.002) and THI scores (r² = 0.539; p = 0.008).

Table 1 lists mean SRTs, median SRTs and other analytics for MSP sentences in babble with the normal and fast speaking rates in the tinnitus and non-tinnitus groups. Figure 2 shows boxplots of SRTs for MSP sentences in babble with the normal and fast speaking rates in the tinnitus and non-tinnitus groups. With the normal-speaking rate, mean SRTs were − 9.47 ± 0.51 dB and − 10.32 ± 1.19 dB in the tinnitus and non-tinnitus group, respectively. With the fast speaking rate, mean SRTs were − 8.25 ± 0.52 dB and − 9.29 ± 0.63 dB in the tinnitus and non-tinnitus group, respectively. The mean difference in SRTs between the normal and fast speaking rate was comparable across groups (1.2 and 1.0 dB for the tinnitus and non-tinnitus group, respectively). The mean difference in SRTs between the tinnitus and non-tinnitus group was also comparable across speaking rates (0.9 and 1.0 dB for the normal and fast speaking rate, respectively). A Mann–Whitney non-parametric test was used to compare SRTs between subject groups (across speaking rates). Results showed that SRTs were significantly lower (better) in the non-tinnitus group than in the tinnitus group (U = 89.0, p = 0.003). A non-parametric Kruskal–Wallis one-way analysis of variance (ANOVA) was performed on ranked data, and post-hoc Tukey pairwise comparisons were performed between the normal and fast speaking rates within and across subject groups. Within the tinnitus group, SRTs were significantly lower with the normal rate than with the fast rate (p = 0.020). Within the non-tinnitus group, there was no significant difference between the normal and fast rates (p = 0.408). Within the normal rate, there was no significant difference between the non-tinnitus and tinnitus groups (p = 0.466). Within the fast rate, SRTs were significantly lower in the non-tinnitus group than in the tinnitus group (p = 0.027).

Table 1 Mean SRTs, standard deviation (Std. Dev.), confidence interval (C.I.) of the mean, maximum (max) SRT, minimum (min) SRT, median SRT, 25th and 75th percentiles for MSP sentence recognition in multi-talker babble with the normal and fast speaking rates, and CRM keyword recognition with different cue conditions, in the tinnitus and non-tinnitus groups.

Table 1 lists mean SRTs, median SRTs and other analytics for CMS sentences in competing speech for 4 cue conditions (No talker sex/no spatial, Talker sex, Spatial, Talker sex + spatial) in the tinnitus and non-tinnitus groups. Figure 3 shows boxplots of SRTs in competing speech for the 4 cue conditions in the non-tinnitus and tinnitus groups; note that participants in the non-tinnitus group were different from those in Fig. 2. In the non-tinnitus group, mean SRTs were − 0.73 ± 0.72, − 11.51 ± 1.95, − 13.41 ± 1.39, and − 17.30 ± 1.44 dB for the No talker sex/no spatial, Talker sex, Spatial, Talker sex + spatial cue conditions, respectively. In the tinnitus group, mean SRTs were 0.96 ± 0.37, − 5.09 ± 1.52, − 5.44 ± 0.82, and − 12.91 ± 1.82 dB for the No talker sex/no spatial, Talker sex, Spatial, Talker sex + spatial cue conditions, respectively. A Mann–Whitney non-parametric test was used to compare SRTs between subject groups (across cue conditions). Results showed that SRTs were significantly lower (better) in the non-tinnitus group than in the tinnitus group (U = 414.5, p < 0.001). A non-parametric Kruskal–Wallis one-way ANOVA was performed on ranked data, and post-hoc Tukey pairwise comparisons were performed among the cue conditions within subject groups. Within the tinnitus group, SRTs were significantly higher (poorer) in the No Sex/no spatial cue condition than in the Talker sex (p = 0.032), Spatial (p = 0.014), or Talker sex + spatial conditions (p < 0.001). SRTs were significantly lower (better) in the Talker sex + spatial condition than in the Talker sex (p = 0.014) or Spatial conditions (p = 0.032); there was no significant difference between the Talker sex and Spatial conditions (p = 0.993). Within the non-tinnitus group, SRTs were significantly higher (poorer) in the No Sex/no spatial cue condition than in the Talker sex (p = 0.002), Spatial (p < 0.001), or Talker sex + spatial conditions (p < 0.001). 
SRTs were significantly lower (better) in the Talker sex + spatial condition than in the Talker sex condition (p = 0.004), but not significantly lower than in the Spatial condition (p = 0.137); there was no significant difference between the Talker sex and Spatial conditions (p = 0.605).

Masking release (MR) was calculated for the Talker sex, Spatial, and Talker sex + spatial cue conditions, relative to the No talker sex/no spatial condition. Figure 4 shows boxplots of MR for the Talker sex, Spatial, and Talker sex + spatial cue conditions. In the non-tinnitus group, mean MR was 10.78 ± 1.61, 12.68 ± 0.83, and 16.57 ± 1.29 dB for the Talker sex, Spatial, and Talker sex + spatial cue conditions, respectively. In the tinnitus group, mean MR was 6.04 ± 1.37, 6.39 ± 0.78, and 13.86 ± 1.59 dB for the Talker sex, Spatial, and Talker sex + spatial cue conditions, respectively. A Mann–Whitney non-parametric test was used to compare MR between groups (across cue conditions). Results showed that MR was significantly larger in the non-tinnitus group than in the tinnitus group (U = 178.0, p < 0.001). A non-parametric Kruskal–Wallis one-way ANOVA was performed on ranked data, and post-hoc Tukey pairwise comparisons were performed among the cue conditions within subject groups. Within the tinnitus group, MR was significantly larger for the Talker Sex + spatial cue condition than for the Talker Sex (p = 0.001) or Spatial conditions (p = 0.005); there was no significant difference between the Talker Sex and Spatial conditions (p = 0.896). Within the non-tinnitus group, MR was significantly larger for the Talker Sex + spatial cue condition than for the Talker Sex (p < 0.001) or Spatial conditions (p = 0.037); there was no significant difference between the Talker Sex and Spatial conditions (p = 0.173).
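
The MR calculation described above can be illustrated with a minimal Python sketch. It uses only the group mean SRTs reported in the text; the dictionary keys and the function name `masking_release` are hypothetical labels chosen for this example. Because the study presumably computed MR per participant before averaging, values derived from rounded group means may differ from the reported MR by about 0.01 dB.

```python
# Group mean SRTs (dB TMR) for competing speech, as reported in the text.
# The dictionary keys are illustrative labels, not names used by the authors.
MEAN_SRT_DB = {
    "non-tinnitus": {
        "none": -0.73,
        "talker_sex": -11.51,
        "spatial": -13.41,
        "talker_sex_spatial": -17.30,
    },
    "tinnitus": {
        "none": 0.96,
        "talker_sex": -5.09,
        "spatial": -5.44,
        "talker_sex_spatial": -12.91,
    },
}


def masking_release(srts, baseline="none"):
    """Return MR per cue condition: baseline SRT minus SRT with the cue.

    A positive value means the cue lowered (improved) the SRT.
    """
    return {
        cue: srts[baseline] - srt
        for cue, srt in srts.items()
        if cue != baseline
    }


if __name__ == "__main__":
    for group, srts in MEAN_SRT_DB.items():
        for cue, mr in masking_release(srts).items():
            print(f"{group:13s} {cue:20s} MR = {mr:5.2f} dB")
```

Subtracting in this direction keeps MR positive whenever a cue helps, which matches the sign convention of the values reported above.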

Linear regression analyses were performed between tinnitus severity and SRTs in the tinnitus group. Because VAS and THI scores were highly correlated (see Fig. 1), VAS and THI data were reduced to a single “tinnitus severity” factor using dimensionality reduction. For the MSP sentences in babble, there was no significant correlation between tinnitus severity and SRTs with the normal (r² = 0.07; p = 0.446) or fast speaking rate (r² = 0.13; p = 0.316). Figure 5 shows SRTs for competing speech as a function of tinnitus severity for the different cue conditions. No significant correlations were observed between tinnitus severity and SRTs in the No talker sex/no spatial (panel A; r² = 0.35; p = 0.073) or Spatial cue conditions (panel C; r² = 0.12; p = 0.338). Significant correlations were observed between tinnitus severity and SRTs in the Talker sex (panel B; r² = 0.80; p < 0.001) and Talker sex + spatial cue conditions (panel D; r² = 0.65; p = 0.005); these correlations remained significant after Bonferroni correction for multiple comparisons (adjusted p = 0.0125). Tinnitus severity was significantly correlated with MR for the Talker sex (r² = 0.69, p = 0.003) and Talker sex + spatial cue conditions (r² = 0.62, p = 0.007), but not for the Spatial cue condition (r² = 0.01, p = 0.823).
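
The Bonferroni criterion quoted above (0.0125) is what results from dividing a nominal alpha of 0.05 by four comparisons, one per cue condition. The sketch below applies that criterion to the p-values reported for SRTs versus tinnitus severity. The nominal alpha of 0.05 and the count of four comparisons are assumptions inferred from the reported value, and the p-value reported as "< 0.001" is entered as its upper bound.

```python
ALPHA = 0.05  # assumed nominal alpha; the text reports only the corrected value

# p-values reported for the regression of SRT on tinnitus severity.
# "< 0.001" is entered as 0.001, its upper bound (an assumption for illustration).
P_VALUES = {
    "none": 0.073,
    "talker_sex": 0.001,
    "spatial": 0.338,
    "talker_sex_spatial": 0.005,
}


def bonferroni_significant(p_values, alpha=ALPHA):
    """Return the corrected criterion and a per-test significance flag."""
    threshold = alpha / len(p_values)
    flags = {name: p < threshold for name, p in p_values.items()}
    return threshold, flags


if __name__ == "__main__":
    threshold, flags = bonferroni_significant(P_VALUES)
    print(f"corrected criterion: {threshold:.4f}")
    for name, significant in flags.items():
        print(f"{name:20s} significant: {significant}")
```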

Discussion

For all speech measures, performance was significantly poorer in the tinnitus group than in the non-tinnitus group, suggesting a general deficit regardless of test materials, masker type, or listening task, consistent with previous studies^{5,6,7,18,19,20,21,22,36}. The tinnitus group was also less able to use talker sex differences and/or spatial cues to segregate competing speech, and obtained significantly less MR than did the non-tinnitus group. For the tinnitus group, tinnitus severity was highly correlated with listeners’ ability to segregate speech according to talker sex differences.

With multi-talker babble, SRTs were significantly poorer in the tinnitus group than in the non-tinnitus group; note that the mean difference in SRTs between groups was small (approximately 1.0 dB across both speaking rates). While there was no significant difference in SRTs across speaking rates in the non-tinnitus group, SRTs in the tinnitus group were significantly higher (poorer) with the fast rate than with the normal rate (mean difference = 1.2 dB). This suggests that tinnitus may have negatively affected performance as the listening demands increased. This finding is consistent with Huang et al.¹⁹, who measured SRTs in steady noise in NH listeners with or without tinnitus using the Mandarin Speech in Noise (MSPIN) test with high- or low-predictability sentences. They found that, for high-predictability sentences, the mean performance deficit in the tinnitus group was 3.4 percentage points, relative to the non-tinnitus group. For low-predictability sentences, the mean performance deficit in the tinnitus group was 12.4 percentage points, relative to the non-tinnitus group. Taken together, increased listening difficulty, whether due to lexical information (as in Huang et al.¹⁹) or to speaking rate (as in the present study), may better illuminate differences between tinnitus and non-tinnitus groups.

Significant and substantial deficits in SRTs in competing speech were observed in the tinnitus group, relative to the non-tinnitus group. For the No talker sex/no spatial cue condition, the mean deficit in SRTs was 1.7 dB for the tinnitus group, relative to the non-tinnitus group. However, the mean deficit in SRTs greatly increased in the tinnitus group for the Talker sex (6.4 dB), Spatial (8.0 dB), and Talker sex + spatial cue conditions (4.4 dB), relative to the non-tinnitus group. It is possible that the reduced deficit in the tinnitus group for the Talker sex + spatial condition may have been due to ceiling performance effects in the non-tinnitus group, where the mean SRT was − 17.3 dB. Different from the pattern of results for SRTs in babble, SRTs in competing speech were most similar between the tinnitus and non-tinnitus groups for the most challenging listening condition (No talker sex/no spatial), and diverged across groups as segregation cues became available. As such, the tinnitus group was less able to utilize segregation cues than was the non-tinnitus group. For competing speech with no talker sex or spatial cues, the degree of masking may have been sufficiently high to obscure differences between the tinnitus and non-tinnitus groups.

Although SRTs with competing speech were significantly lower (better) for the non-tinnitus group than for the tinnitus group, utilization of segregation cues was similar across groups (Fig. 3). In both groups, SRTs were significantly lower for the Talker sex, Spatial, and Talker sex + spatial cue conditions, relative to the No talker sex/no spatial condition. In both groups, there was no significant difference between the very different Talker sex and Spatial cue conditions. While the tinnitus and non-tinnitus groups may have similarly utilized segregation cues, utilization efficiency was significantly poorer in the tinnitus group than in the non-tinnitus group. This is reflected by the significantly greater MR in the non-tinnitus group than in the tinnitus group (Fig. 4).

The present findings are not consistent with Zeng et al.¹⁷, who found no significant difference in utilization of talker sex cues to segregate competing speech between tinnitus and non-tinnitus listeners. Differences in test materials and methods may partly explain discrepancies between studies. A closed-set CRM task using Mandarin matrix-styled sentences was used in the present study, while an open-set sentence recognition task using low-predictability English sentences was used in Zeng et al.¹⁷. Two competing speech maskers were used in the present study, while three different masker conditions were tested in Zeng et al.¹⁷ (i.e., steady noise, 1 competing female talker, or 1 competing male talker). Previous studies have shown that the amount of informational masking increases as the number of competing talkers increases from 1 to 2^24,25. Masker sentences were randomly generated for each test trial in the present study; a single masker sentence appears to have been used for the competing female or male masker in Zeng et al.¹⁷, which might have allowed some entrainment to the masker sentence across test trials and runs, effectively reducing informational masking. Taken together, the present study methods and materials may have generally increased informational masking, allowing for better differentiation between tinnitus and non-tinnitus listeners than observed in Zeng et al.¹⁷.

The present data are in general agreement with most previous studies that show masked speech perception is poorer in individuals with tinnitus than in those without tinnitus^{5,6,7,18,19,20,21,22,36}. Relatively few studies have reported correlations between tinnitus severity and masked speech perception. In the present study, tinnitus severity was not significantly correlated with MSP sentence recognition in multi-talker babble, with the normal or fast speaking rate. This is consistent with Jain and Sahoo⁷, who showed no significant correlation between tinnitus severity and speech understanding in 4-talker speech babble. The present data also showed no significant correlation between tinnitus severity and SRTs with two co-located male talkers (Fig. 5A) or two spatially separated male talkers (Fig. 5C). However, there was a significant correlation between tinnitus severity and SRTs with two co-located female talkers (Fig. 5B) or two spatially separated female talkers (Fig. 5D). According to Kidd et al.³⁷, talker sex differences allow for a release from informational masking. The addition of spatial cues may enhance talker sex cues, allowing for greater release from informational masking^40,41. It is unclear why tinnitus severity was unrelated to utilization of spatial cues to segregate competing speech in the present study. It is possible that head shadow effects and improved TMR at each ear (relative to co-located speech and maskers) may not be influenced by tinnitus, as these more closely reflect the physical aspects of the target and maskers.

Besides the present correlation between tinnitus severity and segregation of competing speech according to talker sex differences, tinnitus severity has been negatively correlated with other measures, such as the ability to control the emotional response to tinnitus⁴² and quality of life⁴³. Thompson et al.⁴³ also found that higher levels of physical activity (e.g., exercise) were correlated with lower tinnitus severity scores, possibly due to stress reduction. Using functional magnetic resonance imaging (fMRI), Carpenter-Thompson et al.⁴⁴ found that tinnitus severity was associated with activation of frontal areas, with lower severity associated with greater frontal activation. The authors suggested that individuals with lower tinnitus severity may better utilize frontal regions to better control their emotional response to the affective sounds. Voice pitch is the most important cue for voice emotion recognition and voice gender discrimination⁴⁵. It is possible that individuals with lower tinnitus severity may better utilize frontal regions to segregate competing speech according to talker sex differences.

Deficits in central auditory processing and/or cognitive function may significantly affect masked speech understanding^32,46,47. Ivansic et al.²¹ suggested that difficulties in understanding speech in noise in tinnitus patients may be due to deficits in central processing and/or attention, and further suggested that more complex listening tasks may better reveal central difficulties for tinnitus patients. Tegg-Quinn et al.⁴⁸ suggested that tinnitus patients may have difficulties in allocating attention resources, which may affect cognitive processes needed to segregate speech from noise or from competing speech. Overall, the present data highlight the potential role of central processing deficits on speech performance in tinnitus patients, compared to non-tinnitus listeners. Tinnitus severity was not correlated with SRTs with speech babble, but was significantly correlated with SRTs for competing speech when the target and masker sex was different, and when talker sex differences were combined with spatial cues. However, MR due to talker sex differences (i.e., release from informational masking, which is more central in origin) was smaller in the tinnitus group than in the non-tinnitus group (Fig. 4). The present data suggest that segregation of target and multiple masker sentences (largely, release from informational masking) may better reveal central processing difficulties associated with tinnitus than when measuring speech understanding in multi-talker babble (which may more represent release from envelope masking or from some combination of energetic and informational masking).

One potential limitation of the present study is the relatively small number of participants in the tinnitus group. The mean performance difference was often quite large between the tinnitus and non-tinnitus groups, especially for segregation of competing speech (Fig. 3). Age effects on speech performance were somewhat controlled by including individuals with similar age ranges across the listener groups; also, the maximum age was 45 years, which may be sufficiently low to avoid large age effects. However, the characteristics and distribution of tinnitus may be heterogeneous. Testing with a large group of participants with various characteristics of tinnitus may result in a different pattern of results. The limited data from the present study showed that there was no significant difference in performance between participants with bilateral or unilateral tinnitus. However, etiology, duration, and sound quality of tinnitus may also affect performance in tinnitus patients. The present number of tinnitus participants was too small to explore the potential roles of these factors on understanding of masked speech. Further studies with more participants may provide additional information about underlying mechanisms of tinnitus that may limit understanding of masked speech.

Another potential limitation was the use of a non-individualized head-related transfer function (HRTF) to measure segregation of competing speech. Using an HRTF that is not listener-specific may result in an unrealistic perception of space or insufficient externalization of sound, which may especially affect SRTs with spatial cues. In previous related studies, segregation of competing speech was measured using a non-individualized HRTF³² or loudspeakers⁴⁹ using similar methods and the same cue conditions used in the present study. In Zhang et al.³², mean SRTs with the non-individualized HRTF were 0.36 ± 1.95, − 8.31 ± 1.81, − 11.87 ± 2.79, and − 12.62 ± 2.69 dB for the No talker sex/no spatial, Talker sex, Spatial, and Talker sex + spatial cue conditions, respectively. In Willis et al.⁴⁹, mean SRTs with loudspeakers were 1.23 ± 1.28, − 7.77 ± 2.45, − 11.65 ± 3.10, and − 12.01 ± 2.54 dB for the No talker sex/no spatial, Talker sex, Spatial, and Talker sex + spatial cue conditions, respectively. SRT data were comparable across these studies, despite differences in sound presentation. These data suggest that while using a non-individualized HRTF in headphones may not be ideal, the pattern of results was similar to that obtained with real loudspeakers. As such, using a non-individualized HRTF was not likely to be a limiting factor for perception of competing speech in the present study.

Materials and methods

In compliance with ethical standards for human subjects, written informed consent was obtained from all participants or their legal guardians before proceeding with any of the study procedures. The study and its consent procedure were approved by the local ethics committee (Ethics Committee of Eye and Ear, Nose, Throat Hospital of Fudan University; approval number: KY2012-009). All research was performed in accordance with relevant guidelines and regulations.

Participants

Thirty Mandarin-speaking Chinese NH listeners were recruited for this study. Ten were clinically diagnosed as having tinnitus (6 males and 4 females; mean age at testing = 27.9 ± 9.1 yrs, range = 15–43 yrs), and 20 had no diagnosis or self-report of tinnitus (10 males and 10 females; mean age at testing = 27.8 ± 6.5 yrs, range = 22–45 yrs). All participants had pure tone thresholds < 25 dB HL at all audiometric frequencies between 250 and 8000 Hz measured using pure-tone audiometry administered in a sound treated room. Within the tinnitus group, tinnitus originated from the right ear in 1 participant, from the left ear in 3 participants, and from both ears in 6 participants. Table 2 shows the demographic information for the 10 tinnitus participants.

Table 2 Demographic information for the tinnitus group.

Tinnitus severity measures

Tinnitus severity was measured using a visual analog scale³⁸ (VAS) and the Tinnitus Handicap Inventory³⁹ (THI). For VAS measures, participants were asked to indicate their current tinnitus severity by marking a number (0–10) on a 10-cm line (which included 1-cm ticks and the corresponding numbers from 0 to 10) that was anchored with the extreme labels “No tinnitus at all” and “Worst tinnitus imaginable.” The THI is a validated subjective, self-reported rating of the impact of tinnitus on patients’ everyday life. The THI contains 25 questions; listeners respond with Yes (4 points), Sometimes (2 points), or No (0 points), resulting in a possible maximum score of 100. A THI score of 0–16 indicates “no or slight” handicap, 18–36 indicates “mild” handicap, 38–56 indicates “moderate” handicap, 58–76 indicates “severe” handicap, and a score of 78–100 indicates “catastrophic” handicap.
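
The THI scoring rule and handicap grades described above can be sketched in a few lines of Python. The function names `thi_score` and `thi_grade` and the response strings are hypothetical; only the point values (4, 2, 0), the 25-item length, and the grade boundaries come from the text.

```python
# Points per response option, as described in the text.
POINTS = {"yes": 4, "sometimes": 2, "no": 0}

# Upper bound of each grade (inclusive) and its label, as described in the text.
GRADES = [
    (16, "no or slight"),
    (36, "mild"),
    (56, "moderate"),
    (76, "severe"),
    (100, "catastrophic"),
]


def thi_score(responses):
    """Sum the points for 25 responses ('yes', 'sometimes', or 'no')."""
    if len(responses) != 25:
        raise ValueError("the THI has exactly 25 items")
    return sum(POINTS[r.strip().lower()] for r in responses)


def thi_grade(score):
    """Map a total THI score (an even number from 0 to 100) to its grade."""
    if score < 0 or score > 100 or score % 2 != 0:
        raise ValueError("THI totals are even numbers between 0 and 100")
    for upper, label in GRADES:
        if score <= upper:
            return label


if __name__ == "__main__":
    example = ["yes"] * 5 + ["sometimes"] * 8 + ["no"] * 12
    total = thi_score(example)
    print(total, thi_grade(total))
```

Because every item contributes 0, 2, or 4 points, totals are always even, which is why the grade ranges skip odd values (16 to 18, 36 to 38, and so on).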

Test stimuli and procedure

SRTs for sentence recognition in six-talker babble were measured using the Mandarin Speech Perception (MSP) test materials⁵⁰, which consist of 5 lists of 20 sentences each. Each sentence contains 7 monosyllabic words, resulting in a total of 140 monosyllabic words for each list. The MSP materials consist of high-quality digital recordings of speech produced by a female Mandarin talker at two speaking rates (normal: 3.7 words per second; fast: 5.7 words per second). SRTs were measured using an open-set test paradigm and an adaptive procedure that converged on the target-to-masker ratio (TMR) that produced 50% correct word-in-sentence recognition. One of five MSP lists was randomly selected for testing at each speaking rate. Stimuli were delivered via Sennheiser HDA300 headphones connected to the headphone output of a clinical audiometer, which was connected to an external audio device (Edirol UA-25EX). The audio device was connected to a Windows 10 computer via USB. Participants were seated in a sound-treated audio booth during testing. The target and masker were presented diotically (i.e., from both the right and left channels). The target sentence was always presented at 65 dBA (calibrated via clinical audiometer), and the masker level was adjusted according to the correctness of response. Custom software (iSTAR™; http://istar.emilyfufoundation.org) was used to administer the test, calculate the TMR during testing, and calculate the SRT at the end of the test run. During each test trial, the TMR was calculated according to the long-term root mean square (RMS) amplitude of the target sentence and the masker. During testing, a sentence was randomly selected from the list. Participants were instructed to repeat the sentence as accurately as possible, and to guess if they were unsure. The experimenter clicked on the correctly identified words, and the software calculated the percent of words correctly identified. 
If the listener repeated 50% or more words correctly, the TMR was reduced by 2 dB; if not, the TMR was increased by 2 dB. A reversal occurred when the change in the TMR switched from decreasing to increasing or vice versa. Each test run (20 trials) typically had 6 to 10 reversals in TMR. The SRT for each test run was calculated by averaging the TMR across the last 6 reversals.
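The adaptive rule above (2 dB down after a correct trial, 2 dB up otherwise, SRT taken as the mean of the last 6 reversals) can be sketched as follows. This is not the iSTAR implementation: the function `run_staircase`, the starting TMR of 10 dB, the choice to record the TMR of the trial on which the direction changed, and the deterministic simulated listener are all assumptions made for illustration.

```python
from statistics import mean


def run_staircase(is_correct, start_tmr=10.0, step=2.0, n_trials=20, n_last=6):
    """Run a 1-down/1-up track and return (srt, reversal_tmrs).

    is_correct(tmr) -> bool reports whether the trial presented at `tmr`
    met the criterion (here, 50% or more of the words repeated correctly).
    """
    tmr = start_tmr
    last_direction = 0  # 0 = no previous trial, -1 = TMR went down, +1 = up
    reversals = []
    for _ in range(n_trials):
        direction = -1 if is_correct(tmr) else +1
        # A reversal is a switch between decreasing and increasing TMR.
        if last_direction != 0 and direction != last_direction:
            reversals.append(tmr)
        tmr += direction * step
        last_direction = direction
    if len(reversals) < n_last:
        raise ValueError("too few reversals to estimate the SRT")
    return mean(reversals[-n_last:]), reversals


def simulated_listener(tmr):
    """Hypothetical listener who meets the criterion whenever TMR >= -4 dB."""
    return tmr >= -4.0


srt, reversals = run_staircase(simulated_listener)
print(srt)
```

For this idealized listener the track descends to −6 dB and then alternates between −6 and −4 dB, so the estimate settles halfway between the two levels. A real listener's responses are probabilistic, which is why a run typically yields fewer reversals than this noiseless simulation.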

The Closed-set Mandarin Speech (CMS) corpus⁵¹,⁵² was used to measure segregation of competing speech. The CMS corpus consists of matrix-styled test materials that can be used to randomly create five-word sentences with the same grammatical structure: name, verb, number, color, and object, each of which contains 10 words. The CMS materials were used to generate target and masker sentences. The target sentences were produced by a male talker; the mean fundamental frequency (F0) across all words was 139 Hz. Masker sentences were produced by 2 males (mean F0s = 143 Hz and 178 Hz) or 2 females (mean F0s = 208 Hz and 248 Hz). A coordinate response matrix (CRM) test paradigm was used, similar to previous studies²⁴,⁴⁷,⁴⁹,⁵². Listeners were asked to identify keywords from the Number and Color categories that were embedded in the randomly generated five-word sentences. Here, SRTs were defined as the TMR that produced 50% correct keyword recognition, consistent with the CRM test paradigm. The first word in the target sentence was always the Name “Xiaowang,” followed by randomly selected words from the remaining four categories. Two masker sentences, produced by the male or female talkers, were randomly generated using words that were not contained in the target sentence and that differed across the masker sentences. An example target sentence could be “Xiaowang sold Three Red strawberries,” while the masker sentences could be the combination “Xiaozhang saw Two Blue kumquats” and “Xiaodeng took Eight Green papayas.” Target and masker sentences were individually processed using the HRTF from Willis et al.⁴⁹ to simulate various spatial locations.
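The sentence-generation constraints described above (fixed target name, five categories of 10 words, masker words that differ from the target and from each other, scoring on the Number and Color keywords) can be sketched as below. The word lists are placeholder tokens rather than the actual CMS vocabulary, and `make_trial` and `score_trial` are hypothetical helpers, not functions of the Angel Sound software.

```python
import random

CATEGORIES = ["name", "verb", "number", "color", "object"]

# Placeholder tokens standing in for the 10 words in each category.
MATRIX = {cat: [f"{cat}{i}" for i in range(10)] for cat in CATEGORIES}
TARGET_NAME = MATRIX["name"][0]  # stand-in for the fixed target name


def make_trial(rng, n_maskers=2):
    """Return (target, maskers); no word is shared between any two sentences."""
    other_names = [w for w in MATRIX["name"] if w != TARGET_NAME]
    names = [TARGET_NAME] + rng.sample(other_names, n_maskers)
    # Sampling without replacement guarantees distinct words per category.
    picks = {cat: rng.sample(MATRIX[cat], n_maskers + 1) for cat in CATEGORIES[1:]}
    sentences = [
        [names[i]] + [picks[cat][i] for cat in CATEGORIES[1:]]
        for i in range(n_maskers + 1)
    ]
    return sentences[0], sentences[1:]


def score_trial(target, response_number, response_color):
    """A trial counts as correct only if both keywords are identified."""
    return response_number == target[2] and response_color == target[3]


rng = random.Random(0)
target, maskers = make_trial(rng)
print(target, maskers)
```

Drawing each category without replacement is one simple way to satisfy both constraints at once; the study's software may generate sentences differently.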
SRTs were measured for 4 cue conditions: (1) No talker sex/no spatial (male target and male maskers, all originating from 0°), (2) Talker sex (male target and female maskers, all originating from 0°), (3) Spatial (male target originating from 0°, male maskers originating from 90° and 270°), and (4) Talker sex + spatial (male target originating from 0°, female maskers originating from 90° and 270°). Stimuli were delivered via Sennheiser HDA300 headphones connected to the headphone output of a clinical audiometer, which was connected to an external audio device (Edirol UA-25EX). The audio device was connected to a Windows 10 computer via USB. The target sentence was always presented at 65 dBA. Participants were seated in a sound-treated audio booth during testing. The presentation level of the masker sentences was adjusted according to the TMR in each trial. For example, for a 10 dB TMR, the masker sentences were presented at 55 dBA. If the listener correctly identified both keywords, the TMR was reduced by 2 dB; if not, the TMR was increased by 2 dB. The SRT was calculated by averaging the last 6 reversals in TMR. Three test runs were completed for each listening condition and the SRT was averaged across runs. The four conditions were randomized within and across participants. Custom software (Angel Sound™; http://angelsound.emilyfufoundation.org) was used to calculate the TMR and SRT during testing.
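The relation between target level, TMR, and masker level in the example above (65 dBA target, 10 dB TMR, 55 dBA masker) can be sketched as follows. The sketch assumes that the TMR is defined on long-term RMS amplitude, as described for the babble test, and that the masker is treated as a single waveform; the text does not state whether the TMR refers to each masker sentence or to their sum. The function names are hypothetical.

```python
import math


def rms(samples):
    """Long-term root mean square amplitude of a waveform."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def masker_level_dba(target_level_dba, tmr_db):
    """Masker presentation level for a fixed target level and a given TMR."""
    return target_level_dba - tmr_db


def scale_masker(target, masker, tmr_db):
    """Scale `masker` so that 20*log10(rms(target) / rms(masker)) equals tmr_db."""
    gain = rms(target) / (rms(masker) * 10 ** (tmr_db / 20))
    return [gain * v for v in masker]


# Toy waveforms, invented for illustration.
target_wave = [1.0, -1.0, 1.0, -1.0]
masker_wave = [0.5, 0.5, -0.5, -0.5]
scaled = scale_masker(target_wave, masker_wave, tmr_db=10)
achieved_tmr = 20 * math.log10(rms(target_wave) / rms(scaled))
print(masker_level_dba(65, 10), round(achieved_tmr, 6))
```

Because the target level is fixed, lowering the TMR by 2 dB after a correct response is equivalent to raising the masker level by 2 dB.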

Data analysis

Speech recognition in multi-talker babble and competing talkers was analyzed in terms of SRT. Masking release (MR) was calculated for the Talker sex, Spatial, and Talker sex + spatial cue conditions, relative to the No talker sex/no spatial condition. SRT data were analyzed using non-parametric tests (Mann–Whitney test to compare listener groups; Kruskal–Wallis one-way ANOVAs on ranked data with post-hoc Tukey pairwise comparisons to compare test conditions). VAS and THI data were reduced to a single “tinnitus severity” factor using dimensionality reduction. Linear regression analyses were performed to determine relationships between tinnitus severity and SRTs. Analyses were performed using Systat software (v. 14) or SPSS (v. 22). For most analyses, the significance level was p = 0.05; for pairwise comparisons, the significance level was adjusted to control for multiple comparisons using Tukey or Bonferroni corrections.
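The masking release calculation can be sketched as a difference from the baseline condition. The sign convention (baseline SRT minus cue-condition SRT, so that a positive value means the cue helped) is an assumption, since the text says only that MR was computed relative to the No talker sex/no spatial condition. The SRT values below are invented for illustration and are not data from the study; `masking_release` is a hypothetical helper.

```python
BASELINE = "No talker sex/no spatial"


def masking_release(srts, baseline=BASELINE):
    """Return MR (dB) for each cue condition relative to the baseline SRT."""
    base = srts[baseline]
    return {cond: base - srt for cond, srt in srts.items() if cond != baseline}


# Invented SRTs in dB TMR for one hypothetical listener.
example_srts = {
    "No talker sex/no spatial": 2.0,
    "Talker sex": -6.0,
    "Spatial": -3.0,
    "Talker sex + spatial": -10.0,
}

mr = masking_release(example_srts)
print(mr)
```

Because lower SRTs indicate better performance, subtracting the cue-condition SRT from the baseline SRT yields a positive MR when the cue improves segregation.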

Data availability

The data used for the current study are provided as supplementary material.

References

Eggermont, J. J. & Roberts, L. E. The neuroscience of tinnitus. Trends Neurosci. 27, 676–682 (2004).
Bartels, H., Staal, M. J. & Albers, F. W. Tinnitus and neural plasticity of the brain. Otol. Neurotol. 28, 178–184 (2007).
Oxenham, A. J. & Bacon, S. P. Cochlear compression: perceptual measures and implications for normal and impaired hearing. Ear Hear. 24, 352–366 (2003).
Schaette, R. & McAlpine, D. Tinnitus with a normal audiogram: physiological evidence for hidden hearing loss and computational model. J. Neurosci. 31, 13452–13457 (2011).
Hennig, T. R., Costa, M. J., Urnau, D., Becker, K. T. & Schuster, L. C. Recognition of speech of normal-hearing individuals with tinnitus and hyperacusis. Int. Arch. Otorhinolaryngol. 15, 21–28 (2011).
Ryu, I. S., Ahn, J. H., Lim, H. W., Joo, K. Y. & Chung, J. W. Evaluation of masking effects on speech perception in patients with unilateral chronic tinnitus using the hearing in noise test. Otol. Neurotol. 33, 1472–1476 (2012).
Jain, C. & Sahoo, J. P. The effect of tinnitus on some psychoacoustical abilities in individuals with normal hearing sensitivity. Int. Tinnitus J. 19, 28–35 (2014).
Ravirose, U., Thanikaiarasu, P. & Prabhu, P. Evaluation of differential sensitivity for frequency, intensity, and duration around the tinnitus frequency in adults with tonal tinnitus. J. Int. Adv. Otol. 15, 253–256 (2019).
Sanches, S. G., Sanchez, T. G. & Carvallo, R. M. Influence of cochlear function on auditory temporal resolution in tinnitus patients. Audiol. Neurootol. 15, 273–281 (2010).
Sanches, S. G., Samelli, A. G., Nishiyama, A. K., Sanchez, T. G. & Carvallo, R. M. GIN Test (Gaps-in-Noise) in normal listeners with and without tinnitus. Pro Fono 22, 257–262 (2010).
Haas, R., Smurzynski, J. & Fagelson, M. The effect of tinnitus on gap detection. Tinnitus Today 37, 10–11 (2012).
Gilani, V. M. et al. Temporal processing evaluation in tinnitus patients: Results on analysis of gap in noise and duration pattern test. Iran. J. Otorhinolaryngol. 25, 221–225 (2013).
Fournier, P. & Hébert, S. Gap detection deficits in humans with tinnitus as assessed with the acoustic startle paradigm: does tinnitus fill in the gap? Hear Res. 295, 16–23 (2013).
Kwon, H. E., Kim, C. W. & Lee, J. H. The effects of tinnitus and hearing loss on results of GIN (gaps-in-noise) and the tinnitus-related subjective handicap. Korean Acad. Audiol. 11, 108–119 (2015).
Paul, B. T., Bruce, I. C. & Roberts, L. E. Evidence that hidden hearing loss underlies amplitude modulation encoding deficits in individuals with and without tinnitus. Hear Res. 344, 170–182 (2017).
Acrani, I. O. & Pereira, L. D. Temporal resolution and selective attention of individuals with tinnitus. Pro Fono 22, 233–238 (2010).
Zeng, F. G., Richardson, M. L. & Turner, K. Tinnitus does not interfere with auditory and speech perception. J. Neurosci. 40, 6007–6017 (2020).
Moon, I. J. et al. Influence of tinnitus on auditory spectral and temporal resolution and speech perception in tinnitus patients. J. Neurosci. 35, 14260–14269 (2015).
Huang, C. Y. et al. Relationships among speech perception, self-rated tinnitus loudness and disability in tinnitus patients with normal pure-tone thresholds of hearing. J. Otorhinolaryngol. Relat. Spec. 69, 25–29 (2007).
Gilles, A. et al. Decreased speech-in-noise understanding in young adults with tinnitus. Front. Neurosci. 10, 1–14 (2016).
Ivansic, D. et al. Impairments of speech comprehension in patients with tinnitus—a review. Front. Aging Neurosci. 9, 224 (2017).
Tai, Y. & Husain, F. T. Right-ear advantage for speech-in-noise recognition in patients with nonlateralized tinnitus and normal hearing sensitivity. J. Assoc. Res. Otolaryngol. 19, 211–221 (2018).
Brungart, D. S. Informational and energetic masking effects in the perception of two simultaneous talkers. J. Acoust. Soc. Am. 109, 1101–1109 (2001).
Brungart, D. S., Simpson, B. D., Ericson, M. A. & Scott, K. R. Informational and energetic masking effects in the perception of multiple simultaneous talkers. J. Acoust. Soc. Am. 110, 2527–2538 (2001).
Durlach, N. I. et al. Note on informational masking. J. Acoust. Soc. Am. 113, 2984–2987 (2003).
Stone, M. A. & Canavan, S. The near non-existence of “pure” energetic masking release for speech: extension to spectro-temporal modulation and glimpsing. J. Acoust. Soc. Am. 140, 832 (2016).
Stone, M. A. & Moore, B. C. On the near non-existence of “pure” energetic masking release for speech. J. Acoust. Soc. Am. 135, 1967–1977 (2014).
Jennings, S. G. & Chen, J. Masking of short tones in noise: evidence for envelope-based, rather than energy-based detection. J. Acoust. Soc. Am. 148, 211–221 (2020).
Petersen, S. E. & Posner, M. I. The attention system of the human brain: 20 years after. Annu. Rev. Neurosci. 35, 73–89 (2012).
Faraji, L., Pourbakht, A. & Haghani, H. The comparison of the comodulation masking release (CMR) in individuals with and without chronic tinnitus. Neurosci. Lett. 704, 195–200 (2019).
Simpson, S. A. & Cooke, M. Consonant identification in N-talker babble is a nonmonotonic function of N. J. Acoust. Soc. Am. 118, 2775–2778 (2005).
Dryden, A., Allen, H. A., Henshaw, H. & Heinrich, A. The association between cognitive performance and speech-in-noise perception for adult listeners: a systematic literature review and meta-analysis. Trends Hear. 21, 2331216517744675 (2017).
Freyman, R. L., Balakrishnan, U. & Helfer, K. S. Effect of number of masking talkers and auditory priming on informational masking in speech recognition. J. Acoust. Soc. Am. 115, 2246 (2004).
Chen, B. et al. Masking effects in the perception of multiple simultaneous talkers in normal-hearing and cochlear implant listeners. Trends Hear. 24, 2331216520916106 (2020).
Yost, W. A. Spatial release from masking based on binaural processing for up to six maskers. J. Acoust. Soc. Am. 141(3), 2093–2106 (2017).
Bureš, Z. et al. Speech comprehension and its relation to other auditory parameters in elderly patients with tinnitus. Front. Aging Neurosci. 11, 219 (2019).
Kidd, G. et al. Determining the energetic and informational components of speech-on-speech masking. J. Acoust. Soc. Am. 140, 132–144 (2016).
Van de Heyning, P. et al. Incapacitating unilateral tinnitus in single-sided deafness treated by cochlear implantation. Ann. ORL 117, 645–652 (2008).
Newman, C. W., Jacobson, G. P. & Spitzer, J. B. Development of the tinnitus handicap inventory. Arch. Otolaryngol. Head Neck Surg. 122, 143–148 (1996).
Freyman, R. L., Helfer, K. S., McCall, D. D. & Clifton, R. K. The role of perceived spatial separation in the unmasking of speech. J. Acoust. Soc. Am. 106, 3578–3588 (1999).
Zobel, B. H., Wagner, A., Sanders, L. D. & Başkent, D. Spatial release from informational masking declines with age: evidence from a detection task in a virtual separation paradigm. J. Acoust. Soc. Am. 146, 548 (2019).
Davies, J. E., Gander, P. E. & Hall, D. A. Does chronic tinnitus alter the emotional response function of the amygdala? A sound-evoked fMRI study. Front. Aging Neurosci. 9, 31 (2017).
Carpenter-Thompson, J. R., Schmidt, S., McAuley, E. & Husain, F. T. Increased frontal response may underlie decreased tinnitus severity. PLoS ONE 10, e0144419. https://doi.org/10.1371/journal.pone.0144419 (2015).
Carpenter-Thompson, J. R., McAuley, E. & Husain, F. T. Physical activity, tinnitus severity, and improved quality of life. Ear Hear. 36, 574–581 (2015).
Luo, X., Fu, Q.-J. & Galvin, J. J. III. Vocal emotion recognition by normal-hearing listeners and cochlear implant users. Trends Amplif. 11(4), 301–315 (2007).
Akeroyd, M. A. Are individual differences in speech reception related to individual differences in cognitive ability? A survey of twenty experimental studies with normal and hearing-impaired adults. Int. J. Audiol. 47(Suppl 2), S53-71 (2008).
Zhang, J. et al. Tonal language speakers are better able to segregate competing speech according to talker sex differences. J. Speech Lang. Hear. Res. https://doi.org/10.1044/2020_JSLHR-19-00421 (2020).
Tegg-Quinn, S., Bennett, R. J., Eikelboom, R. H. & Baguley, D. M. The impact of tinnitus upon cognition in adults: a systematic review. Int. J. Audiol. 55, 533–540 (2016).
Willis, S., Xu, K., Thomas, M., Gopen, Q., Ishiyama, A., Galvin, J. J. III & Fu, Q.-J. Bilateral and bimodal cochlear implant listeners can segregate competing speech using talker sex cues, but not spatial cues. J. Acoust. Soc. Am. Express Lett. in press (2020).
Fu, Q.-J., Zhu, M. & Wang, X. Development and validation of the Mandarin speech perception test. J. Acoust. Soc. Am. 129, EL267–EL273 (2011).
Tao, D. D., Fu, Q.-J., Galvin, J. J. III. & Yu, Y. F. The development and validation of the closed-set Mandarin sentence (CMS) test. Speech Commun. https://doi.org/10.1016/j.specom.2017.06.008 (2017).
Tao, D. D. et al. Effects of age and duration of deafness on Mandarin speech understanding in competing speech by normal-hearing and cochlear implant children. J. Acoust. Soc. Am. 144, EL131–EL137 (2018).

Acknowledgements

We thank all participants for their contribution to this study. This work was partly supported by the National Natural Science Foundation of China (81870726).

Author information

Authors and Affiliations

Department of Otology and Skull Base Surgery, Eye Ear Nose and Throat Hospital, NHC Key Laboratory of Hearing Medicine, Fudan University, 83 Fenyang Road, Shanghai, 200031, China
Yang Wenyi Liu, Bing Wang & Bing Chen
House Ear Institute, 2100 West Third Street, Los Angeles, CA, 90057, USA
John J. Galvin III
Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
Qian-Jie Fu

Contributions

Q.F. and B.C. designed the experiments. Y.L. and B.W. collected and analyzed the data. J.G., Y.L., B.C., and Q.F. analyzed the data and wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Bing Chen or Qian-Jie Fu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, Y.W., Wang, B., Chen, B. et al. Tinnitus impairs segregation of competing speech in normal-hearing listeners. Sci Rep 10, 19851 (2020). https://doi.org/10.1038/s41598-020-76942-1

Received: 12 August 2020
Accepted: 02 November 2020
Published: 16 November 2020
DOI: https://doi.org/10.1038/s41598-020-76942-1
