Information acquired during waking can be reactivated during sleep, promoting memory stabilization. After people learned to produce two melodies in time with moving visual symbols, we enhanced relative performance by presenting one melody during an afternoon nap. Electrophysiological signs of memory processing during sleep corroborated the notion that appropriate auditory stimulation that does not disrupt sleep can nevertheless bias memory consolidation in relevant brain circuitry.
Spontaneous memory reactivation during sleep may improve many types of memory storage1,2. Sleep may be particularly relevant for skill acquisition. For example, a retention interval with sleep relative to one without sleep benefits rapidly tapping a five-element sequence3,4 and integrating sensory and motor elements5,6.
Sleep also aids sensorimotor integration in songbirds7,8. Learning a new song causes subsequent changes during sleep in the activity of a premotor nucleus that has been implicated in song production, resulting in overnight improvement7. Song playback during sleep elicits similar neural activity, constituting induced neuronal replay in learning-related circuits8.
Reinstating information during sleep also produces memory reactivation in humans. Spatial memories are enhanced when an olfactory context9 or specific sounds10 are present during both learning and sleep. Odors and sounds in these experiments presumably reactivate relevant neuronal representations, akin to the reactivation of rodent spatial representations evidenced by specific hippocampal firing patterns during sleep following spatial learning2.
In human music learning, sensorimotor integration occurs as a musician learns to link specific movements with written notes and auditory outcomes. Both auditory and motor circuits are engaged by passive listening and silent production of previously practiced melodies11,12, suggesting interactions between neural representations for perception and action. The extent to which this sort of neural plasticity can be facilitated during sleep is unknown.
Based on these separate lines of evidence, we hypothesized that the ability to produce a melody could be influenced by auditory cuing during sleep. We compared performance for two melodies practiced for the same amount of time. Right-handed individuals learned to play these melodies by pressing four keys in time with repeating 12-item sequences of moving circles (Fig. 1a, Online Methods and Supplementary Movie 1). During an afternoon nap, when electroencephalographic (EEG) recordings showed indications of slow-wave sleep (SWS), one of the melodies was covertly presented 20 times over a 4-min interval.
Practice enhanced general performance skills for the task and participants also gained sequence-specific knowledge evident when performance errors were scored (Fig. 1b). Accuracy was superior for the average of the two learned sequences compared with a baseline sequence (before sleep, P = 0.03; after sleep, P = 0.005; two-tailed paired t tests, n = 16).
As predicted, accuracy for the two melodies was systematically influenced in accordance with which melody had been presented during sleep; performance was more accurate for the cued than for the uncued sequence (P = 0.04). This result can be attributed to melody presentation during sleep, as procedures were otherwise the same for the two melodies and accuracy did not differ before the nap (P = 0.72).
Sleep cuing effects were also evident in improvement scores computed by subtracting each individual's mean score across all pre-nap learning periods from their post-nap score. Improvement scores were significantly greater for the cued melody than for the uncued melody (cued, 7.9 ± 2.4%; uncued, 2.6 ± 1.5%; P = 0.02).
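The improvement score described above can be written out as a short sketch. The function name and the accuracy values are hypothetical and serve only to illustrate the arithmetic; they are not data from the study.

```python
from statistics import mean


def improvement_score(pre_nap_scores, post_nap_score):
    """Post-nap accuracy minus the mean accuracy across all pre-nap learning periods."""
    return post_nap_score - mean(pre_nap_scores)


# Hypothetical accuracy values (percent correct) for one participant.
pre_nap_cued = [52.0, 58.0, 61.0, 65.0]
pre_nap_uncued = [50.0, 57.0, 63.0, 66.0]

cued_improvement = improvement_score(pre_nap_cued, 68.0)
uncued_improvement = improvement_score(pre_nap_uncued, 61.0)
```

With these illustrative values, both melodies have the same pre-nap mean, so the difference between the two improvement scores comes entirely from post-nap performance.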
Neurophysiological results reinforce the idea that sleep cues influenced memory storage. We computed a reactivation advantage score for each subject, subtracting the uncued melody improvement from the cued melody improvement. This measure of cuing efficacy was positively correlated with the percentage of time in SWS across subjects (Fig. 2a). In a separate group of subjects who completed the protocol without sounds during sleep, the average improvement likewise correlated with the percentage of time in SWS (Supplementary Fig. 1). The overall performance improvement in this group (4.4 ± 1.8%) was intermediate to that in the cued and uncued conditions in the main experiment, but across-group differences were not significant (cued, P = 0.22; uncued, P = 0.31). A period of sleep may provide a finite capacity for memory processing such that cued reactivation tends to produce a bias rather than a pure gain, but this speculation requires further study. The reactivation advantage was also correlated with spindle activity during SWS (Fig. 2b). However, we found no evidence that cue presentation prompted an increase in spindles, as SWS spindle density was similar with versus without sleep cues (with cues, 0.95 ± 0.67 spindles per min; without cues, 1.22 ± 0.93 spindles per min; P = 0.3). Although neural sources of this spindle activity are difficult to know with certainty, scalp topographic patterns (Fig. 2c) suggest that the spindle correlation may reflect memory processing in or near premotor cortex contralateral to the performance hand, consonant with links between this region and perception-action integration in songbirds7,8, monkeys13 and humans11.
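As a minimal sketch of this analysis, the reactivation advantage can be taken as the cued improvement minus the uncued improvement, so that positive values indicate a benefit for the cued melody, and then related to the percentage of time in SWS with a Pearson correlation. The function names and all numbers below are hypothetical; the type of correlation coefficient is an assumption, as it is not specified in the text.

```python
from math import sqrt


def reactivation_advantage(cued_improvement, uncued_improvement):
    """Positive values mean the cued melody improved more than the uncued one."""
    return cued_improvement - uncued_improvement


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-subject values (percentage points), not data from the study.
cued = [10, 6, 8, 12]
uncued = [4, 5, 3, 2]
sws_percent = [20, 10, 18, 30]

advantages = [reactivation_advantage(c, u) for c, u in zip(cued, uncued)]
r_sws = pearson_r(advantages, sws_percent)
```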
Learning-related cues presented during SWS in prior studies strengthened spatial associative memory9,10. Our findings extend this phenomenon by showing that auditory cues can selectively change the ability to perform a sensorimotor skill, a distinct type of memory. Tones were an intrinsic aspect of the learning accomplished (not background context cues9). Unlike previously studied procedural tasks, correct responses required precise synchronization with dynamic visual information. Similar to most skills learned outside of the laboratory, this skill was acquired intentionally and may reflect a combination of procedural and explicit memory. After learning, subjects registered above-chance explicit memory for the melodies, although these measures failed to show evidence of differential memory based on cuing (Supplementary Fig. 2).
We did not address the effectiveness of cuing during other sleep stages, but SWS has been recognized as being critical for systems memory consolidation1. Procedural memories also undergo consolidation during sleep14, and slow-wave activity may be beneficial for procedural memories that require sensorimotor integration5,15. Accordingly, given that a sound sequence was associated with an action pattern during the course of learning in our procedure, we propose that subsequent sound presentation during SWS may have altered systems-level neural plasticity in circuits involved in sensorimotor integration.
The potential for various applications of these methods brings up many questions. Can suitable sleep cues improve learning for musical, athletic, linguistic or other skills in which substantial expertise is gradually developed? Can useful changes accrue over longer periods with extended stimulation protocols? Do sleep cues ever have detrimental effects on sleep quality? Beyond the possible practical utility of methods for memory reactivation, our primary hope is that such methods can be applied to elucidate the endogenous brain mechanisms operative every night for reinforcing daytime learning.
We asked 16 subjects (six female, mean age = 21 years) to abstain from caffeine the morning of the experiment, which began in the afternoon. Data from 15 other subjects were excluded because 12 did not reach SWS, two woke during sound presentation and one produced near-perfect performance before the nap. Subjects gave informed written consent and were selected without regard to their history of musical training, which ranged from no formal training to many years of training on a specific musical instrument (mean musical experience = 4.9 years).
Procedure and design
Following preparation for electrophysiological recordings, each subject heard one low- and one high-pitched melody selected from among two high- and two low-pitched melodies. All sequences were 12 items long, three for each button, with a second-order conditional relationship for button transitions (that is, each two-button combination occurred exactly once and predicted the next button). Each melody had a different melodic contour, rhythm and pitch set (the first four notes of different major western musical scales; Fig. 1a). The subject listened passively to each sequence, which was repeated five times.
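The second-order conditional constraint can be checked mechanically. The sketch below assumes that buttons are coded 0 to 3 and that, because sequences repeat, the transition from the last item back to the first also counts; both are assumptions for illustration, and the example sequence is hypothetical rather than one of the melodies used.

```python
from collections import Counter


def is_second_order_conditional(sequence, n_buttons=4):
    """True if every ordered pair of distinct buttons occurs exactly once.

    The sequence is treated as repeating, so the last item is followed by the first.
    """
    length = n_buttons * (n_buttons - 1)  # 12 items for four buttons
    if len(sequence) != length:
        return False
    counts = Counter(sequence)
    if set(counts) != set(range(n_buttons)):
        return False
    if any(c != n_buttons - 1 for c in counts.values()):  # three per button
        return False
    transitions = [
        (sequence[i], sequence[(i + 1) % length]) for i in range(length)
    ]
    if any(a == b for a, b in transitions):  # no immediate repeats
        return False
    return len(set(transitions)) == length


# Hypothetical example sequence satisfying the constraint.
example_sequence = [0, 1, 0, 2, 0, 3, 1, 2, 1, 3, 2, 3]
example_is_valid = is_second_order_conditional(example_sequence)
```

A simple repeating pattern such as 0, 1, 2, 3, 0, 1, 2, 3, ... fails the check because the same transitions recur.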
Musical performance training began next, as a screen 168 cm away from where the subject was seated showed circles that ascended in four columns toward four stationary circle outlines (targets). Subjects were informed that performance accuracy for the two sequences would be assessed repeatedly. Subjects attempted to press a keyboard button corresponding to each target during circle-target overlap, as in previously studied variations of this task16,17. Button presses evoked a musical tone only if a response occurred during the interval from 200 ms before to 150 ms after complete circle-target overlap.
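The response window can be expressed as a simple timing check. This is a sketch under stated assumptions: times are in milliseconds relative to a common clock, the window boundaries are treated as inclusive, and the function and constant names are hypothetical.

```python
EARLY_LIMIT_MS = 200  # a press may precede complete overlap by up to 200 ms
LATE_LIMIT_MS = 150   # a press may follow complete overlap by up to 150 ms


def evokes_tone(press_time_ms, overlap_time_ms):
    """True if the press falls inside the response window around circle-target overlap."""
    offset = press_time_ms - overlap_time_ms
    return -EARLY_LIMIT_MS <= offset <= LATE_LIMIT_MS


# Total window length, which corresponds to the 350-ms response interval.
WINDOW_MS = EARLY_LIMIT_MS + LATE_LIMIT_MS
```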
In each of four training blocks, a 1.4-s pause followed every five repetitions. Blocks included five repetitions of a sequence followed by another five repetitions of a sequence (random order was either ABAB, ABBA, BABA or BAAB). After block 2, less advance warning was given before circle-target overlap. The amount of advance warning (that is, size of mask; Fig. 1a) was individualized such that better performers received less advance warning than poorer performers. The time from appearance of a circle until when it reached the target location was initially 2,056 ms and was either 370, 498 or 625 ms after block 2 (for each participant, the advance warning was the same for all melodies; reactivation advantage did not differ as a function of amount of advance warning, P = 0.1).
In block 4, a novel sequence was included so that learning specific to the trained melodies could be distinguished from baseline improvement resulting from general learning to track circles and press buttons accordingly. A random order was selected for the three sequences (one novel sequence and two learned sequences) and each sequence was presented five times in succession. Then, each sequence was again presented five times in succession (in a different random order with the constraint that the novel sequence was always last, minimizing any recency advantage for one of the trained sequences in explicit memory tests). Block 4 constituted the pre-nap test. The post-nap test was administered in the same way, except that a different novel sequence was used. Every block concluded with explicit recall testing (see below).
The sleep period began after block 4. The chair was converted to a bed supplied with pillow and blanket. No reference was made to sound presentation other than unobtrusive white noise during the nap. Our design focused on SWS, as in prior studies9,10,18, although some studies found that cues during REM sleep improved memory with respect to a complex logic task19, a Morse code learning task20 or fear conditioning in rats21. Approximately 9.3 ± 1.9 min after indications of SWS were observed (Supplementary Fig. 3), the experimenter initiated one of the learned melodies (selected in advance according to a randomized schedule counterbalancing pitch). Sleep staging was verified formally offline (Supplementary Table 1). Sound intensity was similar to that of the background noise (approximately 35 dB sound pressure level [A]), and when each tone was played, white-noise intensity was lowered slightly. The sleep period was terminated after 90 min, but if the subject was in SWS, awakening was delayed until this stage ended. Post-nap testing began 10 min later.
Because sleep-cuing improved explicit memory in other tasks9,10,18, we quantified explicit knowledge of each sequence using free recall and recognition measures. Performance recall was tested after each block and after napping. Recall was also tested on paper after pre- and post-nap tests. Recognition was tested at the end of the experiment. There were no time constraints.
In the performance recall test, subjects attempted to key in each 12-item sequence while their responses evoked the usual tones. In the paper recall test, subjects attempted to indicate the 12 notes by marking one circle in each row of a 4 × 12 grid. Recall scores were calculated as the number of matches with the correct sequence (ignoring rhythm). Chance was estimated, for each subject's answer, as the average number of matches with all 256 possible sequences fulfilling a second-order conditional.
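One way to arrive at 256 candidate sequences is to count repeating 12-item sequences in which every ordered pair of distinct buttons occurs exactly once, listing each cycle only once (here rotated to begin with buttons 0 and 1). The sketch below enumerates them under that reading and averages position-by-position matches. How the candidate sequences were aligned with a subject's answer is not stated, so the alignment, the button coding and the function names are assumptions.

```python
def enumerate_soc_cycles(n_buttons=4):
    """List each second-order conditional cycle once, starting with buttons 0, 1."""
    length = n_buttons * (n_buttons - 1)
    results = []

    def extend(seq, used):
        if len(seq) == length:
            # The wrap-around transition must also be new and not a repeat.
            closing = (seq[-1], seq[0])
            if closing[0] != closing[1] and closing not in used:
                results.append(tuple(seq))
            return
        for nxt in range(n_buttons):
            step = (seq[-1], nxt)
            if nxt != seq[-1] and step not in used:
                used.add(step)
                seq.append(nxt)
                extend(seq, used)
                seq.pop()
                used.remove(step)

    extend([0, 1], {(0, 1)})
    return results


def count_matches(response, target):
    """Number of positions at which the response matches the target sequence."""
    return sum(r == t for r, t in zip(response, target))


def chance_matches(response, candidates):
    """Average number of matches between one response and all candidate sequences."""
    return sum(count_matches(response, c) for c in candidates) / len(candidates)


soc_cycles = enumerate_soc_cycles()
n_cycles = len(soc_cycles)
```

Under this reading the enumeration yields 256 sequences, which agrees with the number given above.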
In the recognition test, the learned sequence and four foil sequences were presented in a random order (Supplementary Fig. 4). After hearing each one, subjects rated it on a five-point scale of similarity to the learned sequence. Subjects were asked to use all five ratings. Recognition scores were determined using a mathematical algorithm of similarity distance (similarity response − correct level of similarity) that emphasized ability to recognize the original sequence. The mean score for all possible response combinations was used to estimate chance performance.
Tin electrodes in an elastic cap were placed at 21 standard scalp locations, left and right mastoids, adjacent to the eyes, and on the chin. Electrophysiological data were sampled at 1,000 Hz with a band-pass of 0.1–100 Hz, down-sampled to 250 Hz, and re-referenced to average mastoids. EEG spectral power was calculated for each sleep stage (and before, during and after sleep sounds) in five bands: slow (0.5–1 Hz), delta (1–4 Hz), theta (4–8 Hz), alpha (8–12 Hz) and sigma (12–15 Hz).
Spindles, which are bursts of EEG oscillations at 12–15 Hz lasting 0.5–2.0 s, are thought to underlie consolidation3,22 and can occur locally during non-REM sleep3,23. Spindles over contralateral premotor cortex can reflect visuomotor skill learning22. Spindles were quantified using a MATLAB/EEGLAB algorithm24,25. After filtering the rectified EEG signal at 12–15 Hz, the algorithm identifies amplitude fluctuations exceeding lower and upper thresholds of two and eight times the average amplitude, respectively. A spindle is counted if the amplitude exceeds the upper threshold at some point during a period when amplitude remains above the lower threshold for at least 0.5 s. A topographic map was computed by interpolation using the correlation coefficient between the reactivation advantage and the total number of SWS spindles at each electrode.
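The counting rule can be sketched as follows. This is not the MATLAB/EEGLAB algorithm itself: it assumes the 12–15 Hz amplitude envelope has already been computed for one electrode, that the average amplitude is taken over the supplied envelope, and that threshold comparisons are strict. Function and parameter names are hypothetical, and the toy data use a 10-Hz sampling rate for brevity rather than the 250 Hz of the recordings.

```python
def count_spindles(envelope, sample_rate_hz, lower_factor=2.0,
                   upper_factor=8.0, min_duration_s=0.5):
    """Count runs above the lower threshold that last long enough and exceed the upper one."""
    if not envelope:
        return 0
    mean_amplitude = sum(envelope) / len(envelope)
    lower = lower_factor * mean_amplitude
    upper = upper_factor * mean_amplitude
    min_samples = int(round(min_duration_s * sample_rate_hz))

    count = 0
    run_length = 0
    run_peak = 0.0
    # A trailing 0.0 closes any run that extends to the end of the recording.
    for amplitude in list(envelope) + [0.0]:
        if amplitude > lower:
            run_length += 1
            run_peak = max(run_peak, amplitude)
        else:
            if run_length >= min_samples and run_peak > upper:
                count += 1
            run_length = 0
            run_peak = 0.0
    return count


# Toy envelopes (arbitrary units) sampled at a hypothetical 10 Hz.
baseline = [1.0] * 100
long_burst = baseline + [5.0, 20.0, 20.0, 5.0, 5.0, 5.0] + baseline  # 0.6 s, high peak
short_burst = baseline + [5.0, 20.0, 20.0] + baseline                # only 0.3 s
weak_burst = baseline + [5.0] * 6 + baseline                         # never exceeds upper

n_long = count_spindles(long_burst, 10)
n_short = count_spindles(short_burst, 10)
n_weak = count_spindles(weak_burst, 10)
```

Only the first toy envelope satisfies both the duration and the peak criterion.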
Accuracy results were computed as the percentage of circle presentations for each melody when the correct key (and no other key) was pressed in the 350-ms response interval. Unless otherwise stated, statistical analyses for behavioral and EEG data relied on two-tailed paired t tests (alpha set to 0.05).
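The accuracy measure and the paired comparison can be sketched as follows. The trial representation (a target key plus the keys pressed within the response interval) and all values are hypothetical. Only the t statistic and degrees of freedom are computed here, because the Python standard library does not provide the t distribution needed for a P value.

```python
from math import sqrt
from statistics import mean, stdev


def percent_correct(trials):
    """Percentage of circle presentations with the correct key and no other key pressed.

    Each trial is (target_key, keys_pressed_in_window).
    """
    hits = sum(1 for target, pressed in trials if set(pressed) == {target})
    return 100.0 * hits / len(trials)


def paired_t(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for two matched score lists."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


# Hypothetical trials: correct, extra key, no response, correct.
example_trials = [(0, [0]), (1, [1, 2]), (2, []), (3, [3])]
example_accuracy = percent_correct(example_trials)

# Hypothetical matched scores for four subjects.
t_value, degrees_of_freedom = paired_t([3, 5, 7, 9], [1, 2, 3, 4])
```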
A separate group of 16 subjects (ten females, mean age = 21.3 years, mean musical experience = 6.0 years) followed the same protocol, but stayed awake during the 90-min period after learning (Supplementary Fig. 5 and Supplementary Results). During part of this time they performed a demanding working-memory task. A different random digit was displayed every 900 ms, continuously flickering at 16.6 Hz (30 ms on, 30 ms off). Subjects attempted to press one button if the current digit and the prior digit were both even or both odd and another button if one was even and one odd. Following a 1-min practice run, there were three 4.5-min runs. Subjects were asked to maintain their focus on the task and were given feedback on accuracy at the end of each run to reinforce this focus. While subjects completed this task, white noise was covertly played with embedded musical cues beginning 15 s after the start of the second run (20 repetitions of one melody). To control for timing, the onset time for the second run for each subject was yoked to the approximate time after training for a subject from the sleep group.
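The response rule in this working-memory task can be stated compactly. The sketch below is illustrative only: the button labels, function names and digit stream are hypothetical.

```python
def expected_button(prior_digit, current_digit):
    """'same' if both digits are even or both are odd, otherwise 'different'."""
    same_parity = (prior_digit % 2) == (current_digit % 2)
    return "same" if same_parity else "different"


def run_accuracy(digits, responses):
    """Percent correct for one run; responses[i] answers digits[i + 1]."""
    expected = [expected_button(p, c) for p, c in zip(digits, digits[1:])]
    correct = sum(e == r for e, r in zip(expected, responses))
    return 100.0 * correct / len(expected)


# Hypothetical digit stream and responses.
example_run = run_accuracy([2, 4, 7, 9], ["same", "different", "different"])
```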
Sleep no-sounds experiment
A third group of 16 subjects (nine females, mean age = 19.8 years, mean musical experience = 5.7 years) followed the same protocol as the sleep sounds group, but did not receive cuing during a 90-min sleep period. Data from five other subjects were excluded because three did not reach SWS and two produced poor performance at least four s.d. below the mean before the nap (both below 25% correct). Sleep staging, spectral analyses and spindle analyses were performed in the same way as in the sleep sounds group. The time when sleep cues would have been presented was estimated by taking a time interval after the onset of SWS for each subject yoked to a corresponding time interval in the sleep sounds group.
We thank B. Mander, J. Saletin, S. Greer and D. Oudiette for technical help. This material is based on work supported by National Science Foundation grant BCS1025697, National Institute on Aging grant T32-AG020418 and National Institute of Neurological Disorders and Stroke grant T32-NS047987.
Supplementary Movie 1: A real-time movie of task performance.