Impaired encoding of rapid pitch information underlies perception and memory deficits in congenital amusia

Recent theories suggest that the basis of neurodevelopmental auditory disorders such as dyslexia or specific language impairment might be a low-level sensory dysfunction. In the present study we test this hypothesis in congenital amusia, a neurodevelopmental disorder characterized by severe deficits in the processing of pitch-based material. We manipulated the temporal characteristics of auditory stimuli and investigated the influence of the time given to encode pitch information on participants’ performance in discrimination and short-term memory. Our results show that amusics’ performance in such tasks scales with the duration available to encode acoustic information. This suggests that in auditory neurodevelopmental disorders, abnormalities in early steps of auditory processing can underlie the high-level deficits (here musical disabilities). The observation that slowing down the temporal dynamics improves amusics’ pitch abilities suggests that this approach could serve as a potential tool for remediation in developmental auditory disorders.

Auditory perception and memory have been described as relying on several processing steps in which information has to be (a) extracted by the perceptual systems (extraction of auditory attributes) 24 , (b) maintained in echoic memory, where a pitch memory trace of the sound is established 25,26 , and (c) stored in auditory short-term memory for several seconds or minutes 27 . Within this framework, pitch discrimination and short-term memory might share at least the first two mechanisms. Indeed, a simple pitch discrimination paradigm requires the comparison between the memory traces of a previously heard stimulus and a present stimulus 28 . Similarly, in the context of a short-term memory task for tone sequences, each tone of the to-be-remembered sequence has to be kept in memory (memory trace). This allows for the efficient encoding of the entire sequence that can then be actively maintained in short-term memory for several seconds.
Recent neurophysiological studies in congenital amusia showed impaired encoding of tones both at the level of the brainstem and at the level of the auditory cortex. Auditory brainstem responses to complex sounds are sometimes reduced and delayed in amusic individuals as compared to controls 29 (but see 30 ). Some abnormalities have also been reported at the cortical level: when encoding the first melody of a pair in a short-term memory task, the amusic brain elicited reduced and delayed N100m components in bilateral auditory cortices and inferior frontal gyri 22 . This was interpreted as reflecting less efficient encoding of the auditory information that might negatively impact subsequent processing steps within short-term memory, namely retention and retrieval 31 . Based on this observation, it could be hypothesized that impaired encoding of pitch information in amusia leads to the general impairments observed for pitch processing (for both pitch discrimination and memory tasks).
In previous psychoacoustic research with typical listeners, it has been shown that the capacity of discriminating and memorizing short sounds is related to the duration of the to-be-encoded material. For short sounds (< 300 ms), the amount of time required to construct an appropriate memory trace of that sound can exceed the duration of the sound itself 28 . Indeed, for typical individuals, the detection of pitch changes between successive short tones is facilitated by the introduction of a silent gap (inter-tone-interval, ITI) between the tones or by increased tone durations 28 .
In the present study, we aimed to further characterize amusics' pitch deficits by investigating to what extent discrimination and memory impairments are related to the temporal dynamic of the to-be-encoded pitch information. In Task 1, ten amusics and ten matched control participants were required to indicate whether two consecutive tones (presented without any ITI) were the same or different. We manipulated tone duration (100 ms, 350 ms) and task difficulty (pitch interval size between the tones of one or two semitones). In Task 2, the same amusic and control participants were required to compare two tone sequences separated by a 2-s delay. Over four different blocks, we manipulated tone duration (100 ms, 350 ms) and ITI (present or absent), which also resulted in changes in stimulus onset asynchrony (SOA, corresponding to the sum of tone duration and ITI). It has been demonstrated that short-term memory abilities decrease with increasing memory load for auditory and visual modalities 18,27,32 . We thus manipulated the sequence length (three or four tones) to test whether the benefits of an increased time to encode tone sequences can be observed in particular when the task difficulty increased. Note that the pitch changes for Task 2 were always larger than 3 semitones, thus above amusic participants' pitch discrimination thresholds.
If encoding of rapid pitch information is altered in congenital amusia, we predict: a) impaired performance in amusics compared to controls when the time to encode the information is short, across both discrimination and memory tasks; b) better task performance in both amusics and controls with increased time to encode the information (duration and/or ITI, and hence SOA). Furthermore, this would also suggest that by increasing the duration of stimulus parameters (tone duration, ITI, SOA) sufficiently, amusics might be able to perform normally (at the level of controls) on pitch discrimination and memory tasks.

Results
Task 1: Single tone comparison. In Task 1, participants had to determine whether two tones (played with a piano timbre) presented without an ITI were the same or different (Fig. 1A). The task was divided into four blocks between which tone duration (d = 100 or 350 ms) and pitch interval sizes (∆ = one or two semitones) were manipulated.
Percentages of Hits-FAs (Fig. 2) were analyzed with a 2 × 2 × 2 ANOVA with group (amusics, controls) as the between-participants factor and tone duration (100 ms or 350 ms) and pitch interval size (∆ = one semitone or two semitones) as within-participant factors.
The . While amusics' performance was significantly decreased in comparison to controls for the most difficult block (d = 100 ms, ∆ = one semitone, P < 0.0001), this was not the case for the easiest block (d = 350 ms, ∆ = two semitones, P = 0.08, see Fig. 2). Moreover, while amusics showed better performance for the long tone duration in comparison to the short tone duration for both pitch interval sizes (all P-values < 0.0001), controls showed this pattern only for the smaller pitch interval size (∆ = one semitone, p = 0.05) and not for the larger interval size (∆ = two semitones, p = 0.35). This latter effect can be related to a ceiling performance for the larger pitch interval size, for which controls' performance was not significantly different from a perfect score (100%, all P-values > 0.06).
Scientific Reports | 6:18861 | DOI: 10.1038/srep18861
To assess the potential differential benefit of tone duration and pitch interval size in both groups and between groups, we performed the following subtractions on the %Hits-FAs data: ( . While amusics showed a stronger benefit of increased tone duration than did controls (P = 0.005), the benefit of increased pitch interval size did not differ between the two groups (P = 0.58). In addition, the benefit of increasing tone duration was significantly greater than the benefit of increasing pitch interval size in amusics (P < 0.0001), while this was not the case in controls (p = 0.08). Amusics' performance thus benefited more from increasing stimulus duration than from increasing pitch change. This suggests that the temporal dynamic of pitch information has a more important impact than the spectral information on amusics' pitch discrimination abilities.
Figure caption: Examples of same and different trials for the short-term memory task for the sequence length of 3 tones. For "same" trials, S1 was repeated as the second melody of the pair (S2) after a 2000 ms silent delay. For "different" trials, one tone in S2 was altered to change the melodic contour. Tones are represented as waveforms and their fundamental frequencies are illustrated with colored lines.
Task 2: Short-term memory tasks for tone sequences. In Task 2, participants performed a melodic short-term memory task for which they had to compare two tone sequences (S1 and S2; played with a piano timbre) separated by a silent retention period of 2000 ms (Fig. 1B). To manipulate task difficulty, sequences were composed of either three or four tones. For each sequence length, there were four blocks differing in tone duration and/or ITI. The duration of the tones was either 100 ms (b1 and b2) or 350 ms (b3 and b4), and they were presented either without an ITI (b1 and b3) or with an ITI (b2 and b4), resulting in a range of different SOAs (see Fig. 3).
Note that for blocks b2 and b3, the SOA between tones was equal, allowing us to disentangle the contribution of tone duration and SOA in short-term memory performance.
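The block design can be made concrete with a short sketch. The tone durations come from the text; the ITI value is not stated in this section, so the 250 ms used here is an assumption inferred from the statement that b2 and b3 share the same SOA (100 ms + ITI = 350 ms + 0 ms). Applying the same ITI to b4 is a further assumption, and the names `Block` and `ASSUMED_ITI_MS` are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    tone_ms: int  # tone duration given in the text (100 or 350 ms)
    iti_ms: int   # inter-tone interval; 0 when no ITI is present

    @property
    def soa_ms(self) -> int:
        # SOA is defined in the text as tone duration plus ITI
        return self.tone_ms + self.iti_ms


# Assumption: ITI inferred from "b2 and b3 have equal SOA"; not stated explicitly
ASSUMED_ITI_MS = 350 - 100

blocks = [
    Block("b1", 100, 0),
    Block("b2", 100, ASSUMED_ITI_MS),
    Block("b3", 350, 0),
    Block("b4", 350, ASSUMED_ITI_MS),  # assumes b4 uses the same ITI as b2
]

for b in blocks:
    print(f"{b.name}: tone = {b.tone_ms} ms, ITI = {b.iti_ms} ms, SOA = {b.soa_ms} ms")
```

Under these assumptions, b2 and b3 differ only in how the 350 ms SOA is filled (short tone plus silence versus one long tone), which is the contrast the text uses to separate tone duration from SOA.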
%Hits-FAs were computed as described above. Data were analyzed with a 2 × 2 × 2 × 2 ANOVA with group (amusics, controls) as the between-participants factor and sequence length (three-tones, four-tones), tone duration (100 ms, 350 ms), and ITI (present, absent) as within-participant factors. The data are summarized in Fig. 3.
The To analyze this 3-way interaction, post hoc tests were performed and revealed that for the easier task (three-tone sequence length), both groups exhibited similar performance for short tones with an ITI and for long tones with or without an ITI (all P-values = 0.44). However, for the more difficult task (four-tone sequence length), performance for short tones with an ITI was reduced in comparison to performance for the long tone duration with or without an ITI (all P-values < 0.01). Additionally, this analysis revealed that for both sequence lengths (three-tones, four-tones), amusics' and controls' performance improved with the presence of an ITI only for the short tone duration (three-tones, P < 0.0001; four-tones, P < 0.0001) and not for the long tone duration (three-tones, P = 0.44; four-tones, P = 0.46).
Correlations. Two sets of correlations were computed. First, we correlated data from Tasks 1 and 2 to investigate the link between pitch discrimination and pitch memory. Correlations involving data from Tasks 1 and 2 were computed with the average of %Hits-FAs over all conditions (of each task, respectively). We then correlated data of Tasks 1 and 2 with behavioral data from previous testing sessions (Montreal Battery of Evaluation of Amusia 33 and Pitch Change Detection task (PCD, see Table 1) from Hyde and Peretz 12 (see also 1,34-36 )). This analysis aimed to investigate whether participants' scores on diagnostic tests (that require both discrimination (PCD) and memory (MBEA)) can predict their performance on Tasks 1 and 2.
The results from Tasks  Finally, data of Task 2 were positively correlated with data of the PCD over all participants (r(18) = 0.65, P = 0.002) and in amusics for the 1/4 semitone pitch interval size (r(8) = 0.69, P = 0.027) (see Supplementary information Figure 1F).
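As a reading aid for the r(df) notation used above, the following sketch converts a Pearson correlation and its degrees of freedom into a two-sided p-value through the standard identity t = r·sqrt(df / (1 − r²)). It is not the authors' analysis code; the function names are hypothetical, and the Student t tail area is obtained by numerical integration (Simpson's rule) so that only the standard library is needed.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_from_r(r: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value for a Pearson r with df = n - 2 (requires |r| < 1).

    steps must be even because Simpson's rule is used on [0, |t|].
    """
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area between 0 and |t|
    return max(0.0, 1 - 2 * central_half)


# Values reported in the text for Task 2 versus the PCD task
for label, r, df in [("all participants", 0.65, 18), ("amusics", 0.69, 8)]:
    print(f"{label}: r({df}) = {r}, two-sided p ~ {two_sided_p_from_r(r, df):.3f}")
```

Because the reported r values are rounded, the recomputed p-values should be expected to land close to, but not necessarily exactly on, the published ones.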

Discussion
The present study investigated whether amusics' deficits in pitch discrimination and short-term memory are related to impaired encoding of rapid auditory information. For both tasks, amusics exhibited impaired performance compared to controls when the time to encode pitch information was short. Amusics and controls showed pronounced improvements in terms of accuracy with increasing tone duration and/or ITI. These benefits were observed independently of the task difficulty (pitch interval size for Task 1; sequence length for Task 2). Most interestingly, when enough time was given to encode the pitch information (350 ms and more), amusics were able to reach normal performance in both tasks.
Task 1 investigated whether tone duration (equal to SOA in this task, as there was no ITI) could affect amusics' pitch discrimination abilities for two different task difficulties (i.e., different pitch interval sizes). Amusic participants showed decreased performance in comparison to controls, but this impairment was dependent on tone duration. While amusics exhibited decreased performance for the short tone duration (d = 100 ms), their performance did not differ from that of controls for the long tone duration (d = 350 ms). It is relevant to note, however, that this latter effect might be an artifact due to a ceiling effect in controls. Nevertheless, the crucial role of tone duration in amusics' pitch discrimination abilities materialized in the comparison of the benefit of tone duration and of pitch interval size, respectively. We observed that the increase in tone duration had a stronger benefit on amusics' performance than the increase in pitch interval size, whereas for the chosen parameters, the two benefits were not significantly different in controls. The benefit of increasing tone duration on participants' performance is in line with numerous psychoacoustic studies for typical listeners 37-43 showing that auditory discrimination abilities benefit from increasing SOA (or here, tone duration). This effect can be interpreted in terms of reduced 'backward masking effect' 28,38,40,44 when the time to encode the information is long enough. When normal listeners are given the several hundreds of milliseconds needed to construct a proper memory trace of the pitch of a newly heard sound, the representation of this sound is optimal. In contrast, if a second sound is presented too soon after the first one, the perceptual analysis of the first sound would be prematurely stopped ('backward masking effect').
Table header: Characteristics | Amusics (n = 10) | Controls (n = 10) | t-test
In agreement with these principles described for normal listeners, Task 1 showed that amusics' pitch processing benefitted from long tone durations (as recently suggested by 17 ), and by extension, a slower rate of presentation of tones. The findings that amusics exhibit decreased performance compared to controls for the short tone duration, but not for the long tone duration, and that the benefit of increasing tone duration is stronger for amusics than for controls suggest that the time constraints for pitch encoding might differ between the two groups. Amusics may need more time than controls to properly encode the sounds (construct a proper memory trace). The longer tones would allow for such reliable representations of pitch to be formed, and would in turn lead to increased discrimination capacity. The importance of time to encode pitch information in congenital amusia is consistent with previous studies, notably those investigating pitch thresholds. Table 2 lists these prior studies as a function of SOA. When performing the correlation between 19 threshold values available in these studies (we considered only values that were available in the main text or in tables of the articles, see Table 2) and SOA, a clear pattern emerges despite the fact that these studies used diverse materials and different tasks. Amusics' pitch thresholds are negatively correlated with the duration of SOA (r(17) = −0.48, p = 0.03; note that this effect is not significant in controls, r(17) = −0.34, p = 0.15). While amusics exhibit increased (worse) pitch thresholds in comparison to controls for all SOAs (except for unresolved harmonics; see 17 ), their threshold values improve (decrease) as the SOA between the to-be-compared tones increases (pitch thresholds varying from 4.72 semitones with short SOA (150 ms) to 0.28 semitones for long SOA (1200 ms)). This implies a beneficial impact of the temporal presentation rate of tones on amusics' pitch discrimination abilities.
Data of Task 1 and of the studies listed in Table 2 thus suggest that amusics exhibit altered encoding of rapid auditory information that impairs their performance in tasks requiring pitch discrimination (as well as pitch change detection or pitch direction judgments). Moreover, they suggest that increasing the time to encode pitch information facilitates simple pitch processing in congenital amusia, with amusics able to reach performance levels that are comparable to those of controls when more time to encode the sounds is available (see Table 2 and Supplementary information Table A).
In addition to evaluating the impact of stimulus duration on amusics' performance in a single pitch discrimination task, the present study investigated whether this parameter can affect amusics' short-term memory abilities for melodies. Task 2 investigated whether participants' short-term memory performance could vary as a function of tone duration (100 ms, 350 ms), ITI (present or absent), and SOA for two different task difficulties (3-tone and 4-tone sequences).
For both sequence lengths, tone duration and ITI had an impact on participants' performance. Amusic and control groups showed better accuracy for 1) the long tone duration than for the short tone duration, and 2) tone sequences with an ITI (see Fig. 3 and Supplementary information Table B) as compared to sequences without any ITI. Furthermore, our data revealed that tone duration might have a more critical impact than SOA on participants' performance. While accuracy was similar for blocks with the same SOA ([short tones with ITI] and [long tones without ITI]) for the easier condition (three-tone sequences), this was no longer the case for the difficult condition (four-tone sequences). For this latter condition, participants' performance for the block [long tone duration without ITI] was better than for the block [short tone duration with ITI]. This suggests that when task difficulty increases, long tone duration is more useful for properly encoding the auditory information than is the addition of a silent gap after a short tone. More interestingly, this critical role of tone duration interacted with participant groups. While amusics showed strongly impaired performance in comparison to controls for the short tone duration, they performed as well as controls for the long tone duration.
Based on the findings in both tasks and on the positive correlations observed between data from Tasks 1 and 2 (and with data from the pre-tests), we argue that congenital amusics' deficits in both single pitch discrimination and short-term memory tasks are underpinned by impaired encoding of rapid auditory information. These results are remarkably similar to those reported in other developmental disorders, such as dyslexia, specific language impairment (SLI) 45-47 and language learning impairments (LLI) 48 . Indeed, deficits in rapid auditory processing (RAP theory) have been described in these disorders, based on their difficulty in processing brief, rapidly changing acoustic information 45-49 . The earliest indication of this phenomenon was the finding that the performance of language impaired and dyslexic children is inferior to that of control participants in tone-sequence tasks when the SOAs are below 400 ms, but performance was normal at longer SOAs 46,47 . Additionally, research investigating children with specific language impairment 50,51 has demonstrated that the children's ability to process rapidly arriving (within a time window of ~40 ms) auditory information is impaired as compared to that of control children. More recently, longitudinal and cross-sectional studies have shown that basic temporal processing discrimination in infants predicts later language outcomes 52,53 . Based on these findings, it has been proposed that deficits in the ability to perceive rapidly changing acoustic differences are either a cause (see 54 for review) or a consequence 55 of language impairments, affecting language comprehension and reading ability.
Observing a similar pattern of results in congenital amusics is of interest because it suggests that while these deficits are not observable for the same material (music for amusia, speech for specific language impairment and dyslexia), developmental difficulties in each of these disorders are related to altered processing of auditory information that arrives rapidly and sequentially. Further work is thus necessary to understand the potential relationships between amusia, dyslexia and specific language impairment.
Finally, given the similar performance between amusics and controls when the time to encode pitch sequences is long (350 ms and more), it could be argued that slowing down the presentation of the pitch information might improve amusics' musical abilities. However, this hypothesis can be challenged, especially when considering that the long tone duration (350 ms) used in the present study is similar to: 1) the profile of tone durations used in Western tonal music, 280 ± 291 ms 56 and 2) the average duration of tones constituting the non-polyphonic melodies in the MBEA (mean = 379.8 ms ± 0.19 ms), for which amusics exhibit strong deficits 33 .
When considering previous studies investigating short-term memory processing for melodies in congenital amusia (see Table 3), it can be hypothesized that other parameters, such as sequence length and/or the duration of the silent retention delay, might further influence amusics' short-term memory performance. In the present study, the tone sequences were composed of only 3 or 4 tones, thus constituting rather simple material in terms of melodic information and memory load. In previous studies, as listed in Table 3, the sequences were longer (5 tones in Gosselin 16,18,23 ; 6 tones in 22 ; 7 to 21 in the MBEA Peretz 33 ), and therefore more complex. It may be that slowing down the pitch information is sufficient to help the encoding of short sequences, but would not be sufficient for longer sequences. In other words, long tone duration might not fully restore a 'normal pitch memory trace', but could lead to a trace that is nevertheless sufficient for encoding less complex material. Further work is needed to understand the precise role of memory load and length of retention period on amusics' pitch processing abilities for more complex musical materials.
The present study showed that amusics' deficits in pitch discrimination and memory are related to an impaired encoding of rapid auditory information. More research is now necessary to understand the potential relationships between congenital amusia and other disorders that exhibit similar patterns of temporal deficits (e.g., developmental language disorders) and to ascertain whether similar difficulties in processing rapidly presented auditory signals can be generalized to other types of material (such as speech) in congenital amusia.

Materials and Methods
Participants. Ten amusic participants and ten non-musician control participants matched for age, handedness, educational background (years of education), and musical training (years of musical instruction: teaching/ practice) participated in the study. The amusic group (age range: 62 to 72 years) and the control group (age range: 61 to 75 years) were right-handed francophone participants from Montreal and surrounding areas. Participants reported no history of neurological or psychiatric disease, and had audiometric thresholds below 30 dB HL for frequencies below or at 4 kHz. Data from the pre-tests are presented in Table 1 and Supplementary information  Table A. Participants gave their written informed consent and were paid for their participation. The research was carried out in accordance with approved guidelines of the Comité d'éthique de la recherche en arts et en sciences (CERAS) of the Université de Montréal. Ethical approval was obtained from the CERAS committee of the Université de Montréal.

Equipment. The experiments took place in a sound-attenuated booth, and auditory stimuli were presented via SENNHEISER HD 280 pro headphones at 65 dB SPL. Presentation software (Neurobehavioral Systems, Berkeley, CA, USA) was used to control the stimulus presentation and record participants' responses in the form of mouse button presses.
Task 1: Single Tone Comparison. There were 64 trials (32 same, 32 different) in each block. For different trials, the second tone was higher or lower in pitch than the first tone by ∆, with equal probability. The choice of these pitch interval sizes was based on results of our participants on the PCD task (d = 100 ms, ITI = 250 ms, SOA = 350 ms) showing that they reached normal pitch detection performance for pitch interval sizes of one semitone and larger (Table 1). Note that this pattern of results differs from that described in 4 and 1 , where amusics showed decreased performance in comparison to controls for the one-semitone interval. The amusic group participating in the present study thus exhibits better pitch change detection abilities than the group from these previous studies (see Supplementary information Table A).
Tone pairs were created using eight piano tones differing in pitch height (Cubase software, Steinberg), but all belonging to the key of C Major (C3, D3, E3, F3, G3, A3, B3, C4, frequency range from 130.81 Hz to 261.63 Hz). This material was chosen to allow for comparisons between Task 1 and Task 2 (as Task 2 was a tonal melodic task, see below).
Task 2: Short-term memory task with tone sequences. For each sequence length (three-tone sequences, four-tone sequences) 128 different diatonic melodies (presented as the first sequence, S1) were created using the piano tones of Task 1. For each block, there were 32 trials (S1, silence, S2), with 16 same and 16 different trials. Participants were asked to indicate whether S1 and S2 were the same or different. For different trials, one tone in the S2 melody was different from the S1 melody and created a contour-violation in the tone sequence. The change on different trials occurred in position two or three of the sequences, regardless of sequence length, and the change position was equally distributed across trials. For different trials, the pitch interval sizes (see ∆ in Fig. 2B) were larger than 3 semitones (thus, above amusic participants' discrimination thresholds) and controlled in such a way that no significant differences of interval sizes were observed between blocks and sequence lengths (all P-values > 0.10; mean pitch interval size across blocks = 6.7 semitones; SD = 1.1 semitones).
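The frequency range quoted for the eight piano tones (130.81 Hz to 261.63 Hz) matches what twelve-tone equal temperament gives for C3 to C4 with A4 = 440 Hz. The sketch below illustrates that relationship and how an interval in semitones follows from two frequencies. It assumes equal temperament, A4 = 440 Hz, and the convention that C4 is MIDI note 60; none of these is stated in the text, and the helper names are hypothetical.

```python
import math

# Assumed MIDI numbers for the C major tones named in the text (C4 = 60 convention)
NOTE_TO_MIDI = {
    "C3": 48, "D3": 50, "E3": 52, "F3": 53,
    "G3": 55, "A3": 57, "B3": 59, "C4": 60,
}


def midi_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a MIDI note (A4 = MIDI 69)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


def interval_in_semitones(f1_hz: float, f2_hz: float) -> float:
    """Signed interval from f1 to f2 in equal-tempered semitones."""
    return 12 * math.log2(f2_hz / f1_hz)


for name, midi in NOTE_TO_MIDI.items():
    print(f"{name}: {midi_to_hz(midi):.2f} Hz")

# Example: the step from E3 to F3 is one semitone, from C3 to D3 two semitones
print(interval_in_semitones(midi_to_hz(52), midi_to_hz(53)))
print(interval_in_semitones(midi_to_hz(48), midi_to_hz(50)))
```

Within this scale, one-semitone steps exist only between E3 and F3 and between B3 and C4, which constrains which tone pairs can realize each ∆ if both tones are drawn from the same set.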
Procedure. Participants performed Task 1 first, followed by Task 2, on the same day of testing, and the entire session lasted approximately one hour. For each task, the order of presentation of the blocks was counterbalanced across participants (Latin square), and blocks were separated by breaks of 2-3 minutes. Participants were informed of the stimulus characteristics before a given block: tone duration (short, long) and pitch interval size (small, large) for Task 1; sequence length (3-tone, 4-tone), tone duration (short, long), and ITI (present, absent) for Task 2. They were asked to respond by mouse button presses with their right hand after the end of the auditory stimulation. There was no time limit to respond, and participants pressed the middle mouse button to launch the next trial. Before each block, participants performed four practice trials with error feedback, but no feedback was given during the experiment. Within each block, the trials were presented in a pseudo-randomized order: the same trial type (i.e., same or different) could not be repeated more than three times in a row.
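One simple way to obtain a trial order that satisfies the constraint described above (no more than three consecutive trials of the same type) is rejection sampling: shuffle, check the longest run, and reshuffle if the constraint is violated. This is only an illustrative sketch; the text does not say how the orders were actually generated, and the function names are hypothetical.

```python
import random


def max_run_length(seq) -> int:
    """Length of the longest run of identical consecutive items."""
    longest = run = 0
    prev = object()  # sentinel that compares unequal to every trial label
    for item in seq:
        run = run + 1 if item == prev else 1
        prev = item
        longest = max(longest, run)
    return longest


def pseudo_randomize(n_same: int, n_different: int, max_run: int = 3, rng=None):
    """Shuffle trial types until no type repeats more than max_run times in a row."""
    rng = rng or random.Random()
    trials = ["same"] * n_same + ["different"] * n_different
    while True:
        rng.shuffle(trials)
        if max_run_length(trials) <= max_run:
            return list(trials)


# Task 1 block: 32 same and 32 different trials, as described in the text
order = pseudo_randomize(32, 32, rng=random.Random(1))
print(order[:10], "... longest run:", max_run_length(order))
```

Rejection sampling keeps every admissible order equally likely, at the cost of a variable number of reshuffles; for blocks of this size an admissible order is typically found after a modest number of attempts.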
Performance in both tasks was evaluated as percentages of Hits (correct responses for different trials/number of different trials) minus percentages of False Alarms (FAs, incorrect responses for same trials/number of same trials). Percentages of Hits-FAs were analyzed with a repeated-measures ANOVA. To analyze significant interactions, post hoc tests were performed using Fisher's LSD. For all the measures of interest we tested whether the samples (controls and amusics) were drawn from a normally distributed population using Kolmogorov-Smirnov tests. These analyses revealed that all data were normally distributed (P > 0.20). Furthermore, we computed correlations over all participants and for the amusic and control group, separately.
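The %Hits-FAs score defined above can be computed per block as follows. This is a minimal sketch of the stated definition, not the authors' analysis script; the trial representation (pairs of trial type and response) and the function name are assumptions made for illustration.

```python
def hits_minus_fas(trials) -> float:
    """Percentage of Hits minus percentage of False Alarms.

    trials: iterable of (trial_type, response) pairs, each 'same' or 'different'.
    A Hit is a 'different' response on a different trial;
    a False Alarm is a 'different' response on a same trial.
    """
    trials = list(trials)
    n_different = sum(1 for kind, _ in trials if kind == "different")
    n_same = sum(1 for kind, _ in trials if kind == "same")
    if n_different == 0 or n_same == 0:
        raise ValueError("need at least one 'same' and one 'different' trial")
    hits = sum(1 for kind, resp in trials if kind == "different" and resp == "different")
    fas = sum(1 for kind, resp in trials if kind == "same" and resp == "different")
    return 100.0 * hits / n_different - 100.0 * fas / n_same


# Toy example (made-up responses, not study data): 3 of 4 hits, 1 of 4 false alarms
example = [
    ("different", "different"), ("different", "different"),
    ("different", "different"), ("different", "same"),
    ("same", "same"), ("same", "same"),
    ("same", "same"), ("same", "different"),
]
print(hits_minus_fas(example))  # 75% hits - 25% false alarms = 50.0
```

A score of 100 corresponds to perfect performance and a score of 0 to responding that does not distinguish same from different trials, which is why the measure corrects for a bias toward one response.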