Skill learning represents one of several classes of procedural memory, and is defined as experience-dependent improvement in performance on perceptual, perceptuomotor or motor tasks1. Such learning is one example of memory consolidation2. Once a behavioral training session ends, consolidation of learning continues for some time, and manipulations such as drug treatments can either enhance or reverse this consolidation if administered shortly after training3.

Sleep deprivation hours after training can interfere with consolidation, which suggests involvement of sleep in consolidation4, with rapid-eye-movement (REM) sleep5 and deeper, slow-wave sleep (SWS)6 subserving distinct functions7. In one such procedure8, improvement on a visual discrimination task was only observed after several hours9; overnight improvement was blocked by REM deprivation, although REM sleep was concluded to be a permissive rather than obligatory condition for consolidation of this learning10. We subsequently showed that improvement on this task only occurs when subjects are tested following a night of sleep, and that this overnight improvement is proportional to the amount of SWS in the first quarter of the night and of REM sleep in the last quarter, with these two sleep parameters explaining 80% of the intersubject variance in improvement11. These results suggest that it is the occurrence of sleep, rather than the simple passage of time, that leads to consolidation and improvement on this task. However, these studies were correlative in nature, and did not demonstrate a clear causal link between sleep and improvement. We now report that performance following a single training session improves beyond the first 24 hours, and improves more after a second night of sleep. We show that this improvement is absolutely dependent on the first night of sleep, and that subsequent sleep cannot replace the first night requirement. These findings add to related data from the last three decades, which suggest that sleep after training can be important in consolidation, integration and maintenance of memories5,6,7,12.

Subjects (n = 133) were 18 to 25 years old and gave informed consent before participating in the study, which was approved by the Human Studies Committee of the Massachusetts Mental Health Center. Each subject was trained in a single session lasting 60–90 minutes, and was subsequently tested in a second, identical session, 3 hours to 7 days after training. One group (n = 11) was deprived of sleep from the time of training until 9:00 p.m. the following evening, and was then allowed unrestricted sleep on the second and third nights, before retesting after 72 hours (Fig. 1). Subjects were monitored during the period of sleep deprivation using the Nightcap ambulatory vigilance monitor, a wallet-sized recording device that accurately identifies periods of wake and sleep13.

Figure 1: Reported time in bed for control and sleep-deprived subjects.

Night 0, night before training; night 1, night after training, during which experimental group was sleep-deprived; nights 2 and 3, two subsequent nights before retesting. Dashed line, mean sleep for both groups on night 0. Error bars, s.e.m. Comparison of nights 0, 2 and 3 showed no significant main effect of night (ANOVA, F2 = 2.54, p > 0.05), but a highly significant night × group effect (F2 = 11.87, p < 0.0001). Post hoc tests showed significant sleep rebound for deprived subjects on night 2 (t19 = 3.41, p < 0.005) and then less sleep than controls on night 3 (t19 = 2.12, p < 0.05). Controls showed no significant differences across all four nights (ANOVA, F3 = 1.29, p > 0.2).

In the visual discrimination task8,11, a target screen was displayed for 17 ms, followed by a blank screen for a variable interstimulus interval (ISI), and then a mask, also displayed for 17 ms. The target screen consisted of three diagonal bars in one quadrant of the screen, in either a vertical or horizontal array, displayed against a background of horizontal bars, with the letter 'T' or 'L' displayed at the fixation point (see web supplement). After presentation of the mask, subjects were asked to determine whether the fixation letter had been a 'T' or an 'L' and whether the array of diagonal bars had been vertically or horizontally oriented. Subjects were tested over a range of interstimulus intervals, and the minimum ISI required to reach a threshold accuracy of 80% on the horizontal–vertical discrimination task was determined. Improvement for a subject was defined as the decrease in threshold ISI at retest compared to training. Training and test sessions each contained 1250 trials in 25 blocks. To avoid floor effects, subjects with initial thresholds below 30 ms were excluded from analysis. Subjects with excessively high initial thresholds (100–400 ms) were also excluded. Both were pre hoc exclusion criteria.
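The threshold and improvement measures defined above can be illustrated with a minimal Python sketch. The text does not state how the 80% threshold was read off the accuracy-versus-ISI data, so linear interpolation between tested ISIs is an assumption made here, and the function names and example values are hypothetical, not data from the study.

```python
from typing import Optional, Sequence


def threshold_isi(isis_ms: Sequence[float],
                  accuracy: Sequence[float],
                  criterion: float = 0.80) -> Optional[float]:
    """Estimate the smallest ISI (ms) at which accuracy reaches `criterion`.

    ASSUMPTION: linear interpolation between adjacent tested ISIs; the
    original fitting procedure is not described in the text.
    Returns None if the criterion is never reached.
    """
    pairs = sorted(zip(isis_ms, accuracy))  # order by ISI, shortest first
    if not pairs:
        return None
    if pairs[0][1] >= criterion:
        return pairs[0][0]  # criterion already met at the shortest ISI
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if y0 < criterion <= y1:
            # Interpolate within the interval that brackets the criterion.
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


def improvement_ms(training_threshold: float, retest_threshold: float) -> float:
    """Improvement = decrease in threshold ISI from training to retest."""
    return training_threshold - retest_threshold


# Hypothetical accuracies for one subject (fraction correct per ISI).
example_isis = [20.0, 40.0, 60.0, 80.0]
example_accuracy = [0.5, 0.7, 0.9, 1.0]
example_threshold = threshold_isi(example_isis, example_accuracy)
print(round(example_threshold, 1))  # 50.0
```

With this definition, a positive improvement means the subject needed a shorter ISI at retest than at training.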

As previously reported11, no significant improvement was seen when testing occurred on the same day as training (two-tailed t-test, mean improvement, −0.5 ms, t31 = −0.49, p > 0.5), whereas testing on the day following training produced highly significant improvement (mean improvement, 12.6 ms; t47 = 6.6; p < 0.0001; Fig. 2).
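As a rough illustration of the kind of test reported above, the following sketch computes a one-sample t statistic for a group's mean improvement against zero. Treating the reported tests as one-sample tests on per-subject improvement scores is an assumption inferred from the degrees of freedom (n − 1); the function name and the example values are hypothetical and are not data from the study.

```python
import math
import statistics
from typing import Sequence, Tuple


def one_sample_t(improvements_ms: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for H0: mean improvement = 0.

    t is the sample mean divided by its standard error (s.e.m.).
    """
    n = len(improvements_ms)
    if n < 2:
        raise ValueError("need at least two subjects")
    mean = statistics.fmean(improvements_ms)
    sem = statistics.stdev(improvements_ms) / math.sqrt(n)
    if sem == 0:
        raise ValueError("zero variance: t statistic is undefined")
    return mean / sem, n - 1


# Hypothetical improvement scores (ms) for five subjects.
t_value, df = one_sample_t([10.0, 14.0, 12.0, 16.0, 8.0])
print(f"t{df} = {t_value:.2f}")  # t4 = 8.49
```

The p value would then be read from the t distribution with the returned degrees of freedom.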

Figure 2: Time course of improvement over seven days.

Each subject was trained and then tested once during the subsequent week. White bar at day 3, subjects who were tested after three days, but were prevented from sleeping from the time of training until 9 p.m. the following night. Error bars, s.e.m. Day 1 data were from three protocols, two of which have been previously reported11. Improvement was similar in all three protocols (ANOVA, F2 = 0.135, p > 0.8; range, 11.6–14.0 ms).

When testing occurred after longer intervals, even more improvement was observed (Fig. 2, black bars). Subjects tested 2–7 days after training showed significantly more improvement than those tested after only 1 day (18.9 ms versus 12.6 ms; unpaired t-test, t88 = 2.26, p < 0.05). For subjects tested on day 4, the improvement was 69% greater than for subjects tested on day 1; the decrease in improvement from day 4 to day 7 was not significant (quadratic regression, F2 = 2.8, p < 0.10). No significant difference in the initial thresholds measured at the time of training was found among the groups (ANOVA, F5 = 0.96, p > 0.4).

Following one night of sleep deprivation and two full nights of recovery sleep (Fig. 2; day 3, white bar), subjects did not show any significant improvement (mean, 3.9 ms; t10 = 1.06, p > 0.30), whereas control subjects (Fig. 2; day 3, black bar) showed significant improvement (mean, 19.8 ms; t9 = 4.26, p < 0.005). Control subjects showed significantly more improvement than sleep-deprived subjects (one-tailed t-test, t19 = 2.71, p < 0.01). A single night of sleep deprivation permanently aborted the normal consolidation of the learning process. Again, no significant difference in the initial thresholds measured at the time of training was found between the control and sleep-deprived groups (two-tailed t-test, t19 = 1.12, p > 0.2). The failure of the sleep-deprived subjects to show improvement was not the result of residual sleepiness, because Stanford Sleepiness Scale scores14 showed no differences between control and sleep-deprived subjects (mean values of 2.46 and 2.50, respectively; full scale range, 1–7; unpaired t-test, t19 = 0.09, p > 0.9). Nor was there a difference within the sleep-deprived group between the time of training and the time of retesting (mean values of 2.36 and 2.46, respectively; paired t-test, t10 = 0.23, p > 0.8).

Because the sleep-deprived subjects slept normally on nights two and three, one might expect to see at least the incremental improvement during these nights that occurs in control subjects. Indeed, the sleep-deprived subjects showed a non-significant improvement averaging 3.9 ms, but this was only half the increase seen across these nights in controls (7.2 ms). Nevertheless, there was no significant difference between sleep-deprived subjects' minimal improvement over the two nights of recovery sleep and that of control subjects for the same period (t10 = 0.90, p > 0.3). Thus, it remains uncertain whether subjects deprived of one night of sleep can still show some minimal improvement with subsequent recovery sleep.
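The arithmetic behind this comparison can be made explicit with a short sketch using only the group means reported in the text. Because each subject was tested once, the control increment across nights two and three is a difference between group means (day 3 versus day 1), not a within-subject change; the variable names are illustrative.

```python
# Group mean improvements reported in the text (ms).
control_day1 = 12.6    # controls tested the day after training
control_day3 = 19.8    # controls tested after three days
deprived_day3 = 3.9    # sleep-deprived subjects tested after three days

# Increment attributed to nights two and three in controls.
control_gain_nights_2_3 = round(control_day3 - control_day1, 1)

# Sleep-deprived improvement as a fraction of that increment.
ratio = deprived_day3 / control_gain_nights_2_3

print(control_gain_nights_2_3)  # 7.2
print(f"{ratio:.2f}")           # 0.54
```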

The results presented here raise the question of when 'learning' actually occurs. Whereas learning clearly begins with the active participation of the subject at the time of training, the consolidation and integration associated with this skill learning must involve multiple steps that continue over at least 48 hours. These begin with attentive awareness during task training and continue with unconscious processing during sleep. Although performance normally continues to improve over the second 24 hours, this second night of sleep cannot replace the lost first night of sleep in producing the large improvement that occurs in controls.

Along with our previous findings11 of a linear relationship between improved performance and the amounts of subsequent early-night SWS and late-night REM sleep, these results suggest that overnight improvement on the visual discrimination task requires at least three temporally distinct steps, the first occurring during initial training and the other two occurring during subsequent early-night SWS and late-night REM sleep, respectively. This separation of three sequential steps is of particular value to the study of memory consolidation, for it permits the temporal dissection of the stages of memory consolidation and integration required to transform an initial memory trace into a form capable of supporting improved performance. Such temporal dissection can then aid in the localization and characterization of these intermediate steps. The nature of these changes remains to be determined, along with how the unique physiology and chemistry of these wake–sleep states facilitate those processes that consolidate and integrate learned skills.

Note: supplementary information is available on the Nature Neuroscience web site.