Abstract
Statistical learning is a robust mechanism of the brain that enables the extraction of environmental patterns, which is crucial in perceptual and cognitive domains. However, the dynamical change of processes underlying long-term statistical memory formation has not been tested in an appropriately controlled design. Here we show that a memory trace acquired by statistical learning is resistant to interference as well as to forgetting after one year. Participants performed a statistical learning task and were retested one year later without further practice. The acquired statistical knowledge was resistant to interference, since after one year, participants showed similar memory performance on the previously practiced statistical structure after being tested with a new statistical structure. These results could be key to understanding the stability of long-term statistical knowledge.
Introduction
Statistical learning is a fundamental mechanism of the brain which extracts and represents regularities of our environment. It is crucial in perception^{1,2,3,4,5}, associative learning^{6}, predictive processing^{3, 6, 7}, and acquisition of perceptual, motor, cognitive, and social skills; thus, statistical learning underlies many day-to-day activities during the entire lifespan^{8,9,10,11}. Moreover, statistical learning could be considered as the basis of language acquisition^{11, 12}. Despite the extensive research in this field, the strong implicit assumption that statistical learning leads to persistent memory has not yet been empirically tested in a carefully controlled experimental design, and the dynamics of the mechanisms underlying consolidation have remained unclear. Here we show direct evidence for the one-year retention and resistance to interference of a memory trace that was acquired by statistical learning in humans.
An important challenge of neuroscience is to unravel how plasticity leads to memory formation, what the temporal dynamics of memory formation are, and how long-term memory traces are retained. Learning-related plastic changes in the brain take place not only during sessions of practice, in the so-called “online” periods, but also between sessions of practice, during the so-called “offline” periods^{13}. Offline processing of learnt information is referred to as consolidation, which pertains to the stabilization of memory traces after their initial acquisition^{14,15,16,17,18,19,20}.
Although some previous studies investigated the long-term retention of different perceptual-motor skills in humans using various tasks^{21,22,23,24}, only the study of Romano et al.^{25} examined the long-term stability of statistical learning, but even these findings were limited in validity. Although the retention of statistical memory was found after one year in a small sample of perceptual-motor skill experts and older non-experts, the authors did not investigate whether statistical memory was resistant to new, interfering information. In fact, the general effect of interference on statistical learning has not been investigated within an extended time period. Consequently, only the retention of memory traces could be measured rather than their consolidation, which could not unravel the core processes underlying long-term memory formation. Therefore, the aim of the current study is twofold. First, we explore the nature of the dynamic processes that underpin the long-term consolidation of statistical regularities by introducing interfering sequences in the course of learning. Second, we provide a valid replication of the study conducted by Romano et al.^{25} on a larger, homogeneous sample and show clear empirical evidence that statistical learning leads to persistent and immutable memory traces that are resistant to forgetting over a longer stretch of time. This combined approach enables us to examine the adaptive nature of learning processes and the robustness of representations related to statistical regularities, since the change in performance measured on the previously practiced and the new, interfering sequence could be quantified.
In the current study, healthy young adults performed a statistical learning task and they were retested one year later without further practice between the two tests. Statistical learning was induced by a perceptualmotor fourchoice reaction time task that, unknown to the participant, included a temporal/serial regularity between nonadjacent trials. To test the susceptibility of the acquired statistical knowledge to interference, during the testing phases, before and after the oneyear delay, we changed the underlying statistical structure of the task for short periods. Thus, we administered the task with both the previously practiced statistical regularity and a new regularity that partially overlapped with the former one. This design enabled us to test not only the retention of the acquired statistical knowledge but more importantly, the resistance to interference before and after the oneyear delay period.
Materials and Methods
Participants
Forty-six healthy young adults participated in the three-session-long study and we collected data from all of them at each session. However, in the main text, retention of the acquired statistical knowledge (i.e., statistical memory) over the one-year period was only assessed for those participants who exhibited significant statistical memory before the one-year delay (see also ref. 26). By restricting the sample to these participants, we exclude the possibility of learning the statistical regularities only after the one-year delay. Twenty-nine of the 46 participants met this criterion; therefore, in the main text, one-year retention was tested in the final sample of 29 adults (mean age = 19.93 years, SD = 1.98 years; mean years of education = 13.36, SD = 1.72 years; 28 females). The criterion for showing significant statistical memory is specified in the Statistical Analysis section. In order to consider a possible sample selection bias that could have influenced our findings, we also present the results of the full sample (46 participants) in the Analysis of the full sample section of the Supplementary Material.
All participants in the final sample as well as in the full sample had normal or correctedtonormal vision and none of them reported a history of any neurological and/or psychiatric condition. All participants provided written informed consent before enrollment and received course credits for taking part in the experiment. The study was approved by the United Ethical Review Committee for Research in Psychology (EPKEB) in Hungary (Approval number: 30/2012) and by the research ethics committee of Eötvös Loránd University, Budapest, Hungary. The study was conducted in accordance with the Declaration of Helsinki.
Task
The Alternating Serial Reaction Time (ASRT) task was used to induce statistical learning^{18}. In this task, a stimulus (a dog’s head) appeared in one of four horizontally arranged empty circles on the screen^{27}. Participants were instructed to press a corresponding key (Z, C, B, or M on a QWERTY keyboard) as quickly and accurately as possible when the stimulus occurred. Unbeknownst to the participants, the presentation of stimuli followed an eightelement sequence, within which predetermined (P) and random (r) elements alternated with each other (e.g., 2 − r − 1 − r − 3 − r − 4 − r; where numbers denote the four locations on the screen from left to right, and r’s denote randomly chosen locations out of the four possible ones; see Fig. 1A). There were 24 permutations of the four possible spatial positions. However, because of the continuous presentation of the stimuli, for each participant, one of the six unique permutations of the four possible ASRT sequence variations was selected in a pseudorandom manner^{28, 29}. For details, please see the Description of sequences section of the Supplementary Material.
The alternating sequence in the ASRT task makes some runs of three consecutive elements (henceforth referred to as triplets) more frequent than others. In the example above, 2X1, 1X3, 3X4, and 4X2 (X indicates the middle element of the triplet) occurred often since the third elements could have either been a predefined or a random element (see Fig. 1B). At the same time, 1X2 and 4X3 occurred less frequently since the third element could have only been random. The former triplet types were labeled as “highprobability” triplets while the latter types were labeled as “lowprobability” triplets^{30}. The third element of a highprobability triplet was more predictable from the first element of the triplet than in the case of lowprobability triplets. For instance, in the example shown on Fig. 1B, Position 3 as the first element of a triplet is more likely (62.5%) to be followed by Position 4 as the third element, than either Position 1, 2, or 3 (12.5%, each). In accordance with this principle, each item was categorized as either the third element of a high or a lowprobability triplet, and the accuracy and reaction time (RT) of the response to this item were compared between the two categories.
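The conditional probabilities quoted above can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the original analysis; the function name `third_given_first` is hypothetical. It assumes that half of all triplets begin on a predetermined trial and half on a random trial, and that random elements are drawn uniformly from the four locations.

```python
from fractions import Fraction
from itertools import product


def third_given_first(pattern, n_locations=4):
    """Exact P(third element | first element) for an alternating
    pattern/random stream such as 2-r-1-r-3-r-4-r (hypothetical helper)."""
    locations = range(1, n_locations + 1)
    joint = {pair: Fraction(0) for pair in product(locations, repeat=2)}

    # Half of all triplets start on a predetermined trial (P-r-P):
    # the third element is then the next predetermined element.
    for i, first in enumerate(pattern):
        third = pattern[(i + 1) % len(pattern)]
        joint[(first, third)] += Fraction(1, 2 * len(pattern))

    # The other half start on a random trial (r-P-r): first and third
    # elements are independent and uniform over the locations.
    for pair in joint:
        joint[pair] += Fraction(1, 2 * n_locations ** 2)

    # Condition on the first element of the triplet.
    conditional = {}
    for first in locations:
        total = sum(joint[(first, t)] for t in locations)
        for third in locations:
            conditional[(first, third)] = joint[(first, third)] / total
    return conditional


probs = third_given_first([2, 1, 3, 4])
print(probs[(3, 4)], probs[(3, 1)])  # 5/8 1/8
```

Under these assumptions, Position 3 is followed two trials later by Position 4 with probability 5/8 (62.5%) and by each other position with probability 1/8 (12.5%), in agreement with the values given in the text.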
This task allows us to separate pure statistical learning from general skill improvements. Statistical learning is defined as faster and more accurate responses to high conditional probability events compared to that to low conditional probability ones (Fig. 1B)^{18}. In contrast, general skill improvements refer to the speedup and changes in accuracy, which are independent of the conditional probabilities of the events. These improvements reflect more efficient visuomotor and motormotor coordination due to practice^{10}.
In our study, participants were unaware of the underlying conditional probability structure of the stimulus sequence, and they did not even know that they were in a learning situation. Thus, an implicit, non-conscious form of learning was tested^{31, 32}. This has also been confirmed using a short questionnaire and the Inclusion-Exclusion Task (see the Testing the implicitness of the acquired statistical knowledge section).
Procedure
One block of the ASRT task contained 85 trials (stimuli). In each block, the eightelement sequence repeated 10 times after five warmup trials consisting only of random stimuli. The ASRT task was administered in three sessions. During the Learning Phase, the task included nine epochs, each containing five blocks (45 blocks in total). Both the Testing and the Retesting Phase included only three epochs (i.e., a total of 15 blocks of stimuli in each session), and these sessions had identical structure (Fig. 1C).
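As an illustration of the block structure just described, the sketch below builds one block of trials. It is a minimal example and not the presentation software used in the study; the function `make_block` and its parameters are hypothetical, and the random elements are assumed to be drawn uniformly and independently.

```python
import random


def make_block(pattern, n_warmup=5, n_repeats=10, n_locations=4, rng=None):
    """Return one block of stimulus locations: random warm-up trials
    followed by repetitions of the alternating eight-element sequence."""
    rng = rng or random.Random()

    def random_location():
        return rng.randint(1, n_locations)

    trials = [random_location() for _ in range(n_warmup)]
    for _ in range(n_repeats):
        for element in pattern:
            trials.append(element)            # predetermined element
            trials.append(random_location())  # random element
    return trials


block = make_block([2, 1, 3, 4], rng=random.Random(0))
print(len(block))  # 85
```

With five warm-up trials and ten repetitions of the eight-element sequence, each block contains 5 + 10 × 8 = 85 trials.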
The middle epoch (5 blocks) of both of these sessions (Epoch 11 and 14) served as interference. An interference sequence was defined as a previously unpracticed repeating sequence that was different from the one appearing in all other epochs. For instance, if the original sequence was 2 − r − 1 − r − 3 − r − 4, the interference sequence could be 2 − r − 1 − r − 4 − r − 3. Thus, there was partial overlap between the original sequence and the interference sequence. Twenty-five percent of the originally high-probability triplets remained high-probability in the interference sequence, while the remaining 75% became low-probability. This means that of the 16 originally high-probability triplets, four remained unchanged (“high-high” triplets: HH) and 12 high-probability triplets became low-probability ones in the interference sequence (“high-low” triplets: HL). Of the 48 low-probability triplets, 12 became high-probability ones in the interference sequence (“low-high” triplets: LH; replacing the 12 originally high-probability, HL, triplets) and the remaining 36 were low-probability ones in both sequences (“low-low” triplets: LL). Examples and frequency statistics for each triplet type are provided in the Description of sequences section of the Supplementary Material.
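The triplet counts above follow directly from the two example sequences. The sketch below enumerates all 64 possible triplets and labels each one by its status in the original and in the interference sequence. The helper names `high_pairs` and `classify_triplets` are hypothetical and serve only to illustrate the classification.

```python
from collections import Counter
from itertools import product


def high_pairs(pattern):
    """(first, third) element pairs that define high-probability triplets."""
    n = len(pattern)
    return {(pattern[i], pattern[(i + 1) % n]) for i in range(n)}


def classify_triplets(original, interference, n_locations=4):
    """Label every triplet HH, HL, LH, or LL: first letter is its status
    in the original sequence, second letter in the interference sequence."""
    hi_orig, hi_int = high_pairs(original), high_pairs(interference)
    labels = {}
    for first, middle, third in product(range(1, n_locations + 1), repeat=3):
        before = "H" if (first, third) in hi_orig else "L"
        after = "H" if (first, third) in hi_int else "L"
        labels[(first, middle, third)] = before + after
    return labels


labels = classify_triplets([2, 1, 3, 4], [2, 1, 4, 3])
counts = Counter(labels.values())
print(counts["HH"], counts["HL"], counts["LH"], counts["LL"])  # 4 12 12 36
```

For the example pair of sequences, only the triplets of the form 2X1 are high-probability in both, which yields the 4 HH, 12 HL, 12 LH, and 36 LL triplets reported above.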
Participants were not told about the change in the underlying sequence during interference blocks. In addition, they were unaware of the fact that they were going to practice the same task with the same interfering sequence one year later.
Statistical Analyses
We followed the data analysis protocol established in previous studies using the ASRT task^{18, 27, 33} and collapsed the blocks of the task into epochs of five blocks. Therefore, the Learning Phase consisted of nine epochs, while the Testing and Retesting Phases consisted of three epochs each. Epochs are labeled consecutively (1, 2, …, 15) in the remainder of the paper (see Fig. 1C). Mean accuracy (ratio of correct responses) and median RT for correct responses only were determined for each participant and epoch, separately for high- and low-probability triplets. Learning scores in the Learning Phase and memory scores in the Testing and Retesting Phases were then calculated as the difference between triplet types in RT (RT for low-probability triplets minus RT for high-probability triplets) and accuracy (accuracy for high-probability triplets minus accuracy for low-probability triplets). A greater score in both measures indicates larger statistical learning/memory. To evaluate statistical learning and retention of the acquired statistical knowledge, we conducted repeated measures analyses of variance (ANOVAs) and paired-samples t-tests. Greenhouse-Geisser epsilon (ε) correction was used when necessary. Original df values and corrected p values (if applicable) are reported together with partial eta-squared (η_{p}^{2}) as the measure of effect size. Analyses and results concerning accuracy are only reported in the Analysis of accuracy data section of the Supplementary Material; here we focus on RT measures.
Only those participants were included in the final sample who showed significant statistical memory in the Testing Phase. This was evaluated block-wise in the following way. (1) We considered only those 10 blocks of the Testing Phase in which we presented the same repeating sequence to participants as in the Learning Phase. These blocks are henceforth referred to as non-interference blocks or epochs (the cluster of five blocks; Fig. 1C). (2) In the Testing Phase, median RT for correct responses was calculated for each participant, block, and triplet type. (3) Then we calculated the statistical memory score (difference in RTs for low- vs. high-probability triplets) for each participant and each block. This yielded 10 memory scores per participant. (4) A one-sample t-test against zero was run on these scores for each participant separately to confirm whether a participant showed any significant statistical memory. (5) If the mean of the 10 blocks deviated significantly from zero (in the positive direction), the given participant was included in the final sample. A deviation from zero was regarded as significant if the p-value was less than 0.050. Twenty-nine participants met this criterion (mean score = 18.68 ms, SD = 7.96 ms).
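The RT-based learning or memory score defined above can be sketched as follows. This is an illustrative example rather than the analysis script used in the study; the function name `rt_score` and the tuple layout of the trial records are assumptions made for the example.

```python
from statistics import median


def rt_score(trials):
    """RT learning/memory score for one participant and one epoch.

    trials: iterable of (triplet_type, correct, rt_ms) records, where
    triplet_type is "high" or "low" (hypothetical record layout).
    Returns median RT (low) minus median RT (high), correct trials only.
    """
    rts = {"high": [], "low": []}
    for triplet_type, correct, rt_ms in trials:
        if correct:  # error trials are excluded from the RT analysis
            rts[triplet_type].append(rt_ms)
    return median(rts["low"]) - median(rts["high"])


toy_epoch = [
    ("high", True, 300), ("high", True, 320), ("high", False, 900),
    ("low", True, 340), ("low", True, 350), ("low", True, 360),
]
print(rt_score(toy_epoch))  # 40.0
```

A positive score means faster responses to high-probability than to low-probability triplets. The accuracy score would be computed analogously, with the direction of subtraction reversed.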
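Steps (4) and (5) of the inclusion procedure can be sketched as below. The text does not state whether the p-value was one- or two-sided, so the sketch assumes a two-sided test at α = 0.05 combined with a positive mean; for 10 block scores (df = 9) this corresponds to a critical t of approximately 2.262. The function name `shows_statistical_memory` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Critical t for a two-sided test at alpha = .05 with df = 9.
# Assumption: valid only when exactly 10 block-wise scores are supplied.
T_CRIT_DF9 = 2.262


def shows_statistical_memory(block_scores, t_crit=T_CRIT_DF9):
    """One-sample t-test of block-wise memory scores against zero.

    Returns (included, t), where included is True only if the mean is
    significantly above zero. Assumes the scores are not all identical.
    """
    n = len(block_scores)
    t = mean(block_scores) / (stdev(block_scores) / sqrt(n))
    return t > t_crit, t


included, t_value = shows_statistical_memory(
    [20, 15, 25, 18, 22, 10, 30, 17, 19, 24]
)
print(included)  # True
```

In this sketch a participant whose block-wise scores fluctuate around zero, or whose mean score is negative, would not enter the final sample.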
As the focus of the current study is on the retention of statistical memory, we performed Bayesian paired-samples t-tests and calculated the Bayes Factor (BF) for the relevant comparisons (see the Results section below). The BF is a statistical technique that helps conclude whether the collected data favor the null hypothesis (H_{0}) or the alternative hypothesis (H_{1}); thus, the BF could be considered as a weight of evidence provided by the data^{34}. It is an effective mathematical approach in consolidation studies where it is expected that the acquired evidence supports H_{0} rather than H_{1}^{35,36,37}. In this case, H_{0} is the lack of difference between the two means, and H_{1} states that the two means of memory scores differ. BFs were calculated using JASP (version 0.8, see refs 38 and 39). Here we report BF_{01} values. According to Wagenmakers et al.^{34}, BF_{01} values between 1 and 3 indicate anecdotal evidence for H_{0}, while values between 3 and 10 indicate substantial evidence for H_{0}. Conversely, values between 1/3 and 1 indicate anecdotal evidence for H_{1}, and values between 1/10 and 1/3 indicate substantial evidence for H_{1}. If the BF is below 1/10, 1/30, or 1/100, it indicates strong, very strong, or extreme evidence for H_{1}, respectively. Values around one do not support either H_{0} or H_{1}.
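The verbal categories above can be expressed as a simple lookup, which may help when reading the BF_{01} values in the Results. The sketch below is illustrative: the function name `interpret_bf01` is hypothetical, the handling of values that fall exactly on a boundary is an assumption, and the labels for BF_{01} above 10 are obtained by mirroring the H_{1} categories, which the text does not state explicitly.

```python
def interpret_bf01(bf01):
    """Map a BF01 value to (favoured hypothesis, verbal label)."""
    if bf01 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf01 == 1:
        return None, "no evidence"

    # Express the evidence as a ratio >= 1 in favour of one hypothesis.
    favoured = "H0" if bf01 > 1 else "H1"
    strength = bf01 if bf01 > 1 else 1 / bf01

    if strength < 3:
        label = "anecdotal"
    elif strength < 10:
        label = "substantial"
    elif strength < 30:      # for H0, labels from here on are assumed
        label = "strong"     # by symmetry with the H1 categories
    elif strength < 100:
        label = "very strong"
    else:
        label = "extreme"
    return favoured, label


print(interpret_bf01(4.873))  # ('H0', 'substantial')
print(interpret_bf01(0.001))  # ('H1', 'extreme')
```

For example, a BF_{01} of 4.873 falls in the range of substantial evidence for H_{0}, whereas a BF_{01} of 0.001 indicates extreme evidence for H_{1}.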
Results
The prerequisite of memory consolidation
Before memory consolidation can be assessed, significant statistical learning needs to occur preceding the long delay period, i.e., during the Learning and Testing Phases. As we report data of those participants who showed significant statistical memory in the Testing Phase, here in this analysis we only confirm that these participants – as a group – indeed exhibit significant statistical learning before the one-year delay period. Statistical learning during the Learning Phase was tested with a two-way repeated measures ANOVA for RT with TRIPLET (high- vs. low-probability) and EPOCH (1–9) as within-subjects factors. The ANOVA revealed significant statistical learning and general skill improvements (significant main effects of TRIPLET, F(1, 28) = 96.71, p < 0.001, η_{p}^{2} = 0.774, and EPOCH, F(8, 224) = 72.08, ε = 0.303, p < 0.001, η_{p}^{2} = 0.720). Participants were increasingly faster on high- than on low-probability triplets as the task progressed (TRIPLET*EPOCH interaction, F(8, 224) = 7.51, ε = 0.617, p < 0.001, η_{p}^{2} = 0.212; see Fig. 2).
Thus, there was evidence for both statistical learning and general skill improvements during the Learning Phase. Significant statistical learning and general skill improvements before the oneyear delay also took place during the Testing Phase, as this was the criterion for participants to be included in the final sample.
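For readers who wish to relate the reported effect sizes to the F statistics, partial eta-squared can be recovered from an F value and its uncorrected degrees of freedom using the standard identity η_{p}^{2} = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative; the function name is hypothetical, and small deviations from the reported values can arise because the F values are rounded to two decimals.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Main effect of EPOCH in the Learning Phase: F(8, 224) = 72.08
print(round(partial_eta_squared(72.08, 8, 224), 3))  # 0.72
```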
Resistance to forgetting
To test the one-year retention of the learned statistical contingencies, we first checked whether there is any change in statistical memory performance between the non-interference epochs of the Testing and Retesting Phases (i.e., resistance to forgetting; see Fig. 1C). An ANOVA was conducted for RT with SESSION (Testing vs. Retesting), TRIPLET (high- vs. low-probability), and EPOCH (10, 12 vs. 13, 15) as within-subjects factors. We found evidence for retained statistical memory after the one-year delay (non-significant SESSION*TRIPLET interaction, F(1, 28) = 0.08, p = 0.774, η_{p}^{2} = 0.003, BF_{01} = 4.873) with similar memory scores during the Testing and Retesting Phases (see Fig. 3A and Table S1).
Irrespective of the retention of statistical memory, the same ANOVA revealed partially decreased general skills over the one-year delay. Participants were significantly slower in the Retesting Phase compared to the Testing Phase (significant main effect of SESSION: F(1, 28) = 24.32, p < 0.001, η_{p}^{2} = 0.465, BF_{01} = 0.001; cf. Fig. 2 and Table S1). These results suggest that while statistical learning leads to persistent memory representations over the one-year delay, some aspects of general skills undergo forgetting over this period.
The effect of the interference sequence
Testing Phase
The effect of interference sequence on statistical memory was evaluated in three steps. First, we tested whether significant learning of the changed statistical regularities occurred in the interference epoch. As the interference sequence was partly overlapping with the practiced sequence, some triplets in the interference epoch changed (from low to highprobability or vice versa) and other triplets remained the same as in the noninterference epochs. To determine the extent of learning these changed probabilities, we calculated the difference in RTs and accuracy (i.e., memory scores) between those triplets that remained lowprobability throughout the whole task (LL) and those ones that became highprobability in the interference epoch (LH). If participants have successfully adapted to the changed probabilities, we should find faster RTs/higher accuracy for LH triplets compared to the LL ones. Note, that the only difference between these triplets is that LH triplets became highprobability in the interference epochs, while the LL triplets did not, but otherwise both of them were lowprobability in the original, practiced sequence (occurred with a similar frequency in the noninterference epochs). Again, analyses and results concerning accuracy are only reported in the Supplementary Material.
Second, if statistical memory acquired during previous practice was robust against interference, performance on the noninterference epoch (Epoch 12) following the interference epoch (Epoch 11) should be comparable to that on the noninterference epoch (Epoch 10) preceding the interference epoch. To determine whether the acquired statistical knowledge was robust against interference, we calculated the difference in RTs and accuracy (i.e., memory scores) between those triplets that remained lowprobability throughout the whole task (LL) and those ones that were highprobability in the noninterference epochs but became lowprobability in the interference epochs (HL). This way, we excluded those triplets from this comparison that were highprobability in the interference epoch (HH and LH). Importantly, significant statistical learning on the interference sequence was not a prerequisite of testing the resistance of the original statistical knowledge against interference.
First, we found statistical learning on the interference epoch as the statistical memory score for RT (LL minus LH) significantly differed from zero (M = 5.40 ms, t(28) = 2.58, p = 0.015, BF_{01} = 0.316; see Fig. 3B). Second, we compared statistical memory performance in the two noninterference epochs of the Testing Phase, and found resistance to interference since there was no significant difference between Epoch 12 and Epoch 10 (M _{Epoch12} = 17.86 ms vs. M _{Epoch10} = 18.72 ms; t(28) = 0.33, p = 0.742, BF_{01} = 4.814; see Fig. 3B). In the case of the noninterference epochs, HL and LL triplets corresponded to high and lowprobability triplets, respectively.
In summary, the significant statistical learning during interference is evidence that participants acquired new statistical information in the Testing Phase, as they became faster on those high-probability triplets that had originally been low-probability ones (LH). In addition, we found efficient resistance to interference: before and after the interference, statistical memory performance remained similar on those triplets that participants rarely practiced during interference (HL).
Retesting Phase
To examine the effect of interference after one year has elapsed, we followed those two analysis steps described above in relation to the Testing Phase. First, we did not find statistical learning on the interference epoch as the statistical memory score for RT (LL minus LH) did not differ significantly from zero (M = −1.26 ms, t(28) = −0.54, p = 0.593, BF_{01} = 4.429; see Fig. 3B). Second, we compared statistical memory performance in the two noninterference epochs of the Retesting Phase, and found resistance to interference since there was no significant difference between Epoch 15 and Epoch 13 (M _{Epoch15} = 18.47 ms vs. M _{Epoch13} = 15.72 ms; t(28) = −0.76, p = 0.457, BF_{01} = 3.901; see Fig. 3B). Again, in the case of the noninterference epochs, HL and LL triplets corresponded to high and lowprobability triplets, respectively.
In summary, results suggest that participants did not acquire new statistical information on the interference sequence in the Retesting Phase as they responded with similar RTs to LH and LL triplets. At the same time, participants showed efficient resistance to interference as statistical memory performance remained similar on those triplets that participants rarely practiced during interference (HL).
Comparing the effect of interference sequence across Testing and Retesting Phases
First, in order to directly test whether the degree of learning new statistical information differed between the Testing and Retesting Phases, we compared the statistical memory scores between the interference epochs of the two sessions. The statistical memory score was significantly higher after 24 hours than after one year (5.40 ms vs. −1.26 ms, respectively, t(28) = 2.39, p = 0.024, BF_{01} = 0.452; see Fig. 3B).
Second, we examined whether the acquired statistical knowledge was resistant to interference at a similar level across the Testing and Retesting Phases. In particular, we tested whether the difference in statistical memory scores between Epoch 12 and 10 and the difference in statistical memory scores between Epoch 15 and 13 were similar. Again, we compared performance on HL and LL triplets. We found evidence for the same level of resistance to interference in the Testing and Retesting Phases (M _{Epoch12−10} = −0.86 ms vs. M _{Epoch15−13} = 2.74 ms; t(28) = −0.72, p = 0.476, BF_{01} = 3.986; see Fig. 3B).
In sum, results suggest that statistical learning was weaker on the interference epoch in the Retesting Phase than in the Testing Phase. However, statistical knowledge was resistant to interference to a similar degree in the Testing and Retesting Phases.
Testing relearning of the statistical regularities after the one-year delay
To rule out the possibility that the one-year retention of statistical memory is due to relearning in the Retesting Phase, additional ANOVAs were run with SESSION (Learning vs. Retesting Phase), TRIPLET (high- vs. low-probability), and EPOCH (1 and 2 vs. 13 and 15) as within-subjects factors (see Fig. 1C and Table S1). The significant SESSION*TRIPLET interaction (F(1, 28) = 25.34, p < 0.001, η_{p}^{2} = 0.475) showed larger statistical memory after the one-year delay than at the beginning of the Learning Phase (M _{Epochs13,15} = 18.10 ms vs. M _{Epochs1,2} = 7.26 ms, BF_{01} = 0.001). In sum, the learning measure confirms that participants did not relearn the task after the one-year delay, which provides further evidence for the one-year retention of statistical memory.
Testing the implicitness of the acquired statistical knowledge
To test whether participants gained explicit/conscious knowledge about the statistical regularities underlying the ASRT task, which could have influenced both learning and consolidation processes, first we administered a short questionnaire^{18} at the end of the Retesting Phase. This questionnaire included increasingly specific questions such as “Have you noticed anything special regarding the task? Have you noticed some regularity in the sequence of stimuli”? The experimenter rated subjects’ answers on a 5item scale, where 1 was “Nothing noticed” and 5 was “Total awareness”. None of the participants reported noticing any regularities in the task.
In addition, we administered the wellestablished “Process Dissociation Procedure”, PDP^{40}, using an InclusionExclusion task^{41,42,43,44,45}. First, we asked the participants to generate a sequence of key presses that follows the regularity of the ASRT task, using the same response buttons as the ones they used in the ASRT task. There were four runs of the task with this inclusion instruction, and each run was finished after 24 key presses were made. Second, participants were asked to generate new sequences of key presses by excluding all information they (might have) consciously gained about the regularity of the ASRT task and trying to press the response buttons according to a new regularity they had never practiced before. There were four runs of the task with this exclusion instruction, and each run was finished after 24 key presses were made.
According to the PDP, successful performance in the inclusion condition can be achieved by using solely one’s implicit knowledge, since participants are asked to do the exact same thing as they did during performing the main task that contained the statistical regularities. Explicit knowledge in this case can further improve performance but it is not necessary for being able to perform the task successfully. Thus, in the inclusion condition, both implicit/nonconscious and explicit/conscious knowledge can lead to successful performance.
In contrast, in the exclusion condition, implicit and explicit knowledge work in opposition since only the conscious knowledge of statistical regularities can make it possible for participants to generate a different sequence of key presses and hence perform the exclusion task successfully. Generation of (thus, failure to exclude) the learned statistical regularities in the exclusion task indicates reliance on one’s implicit knowledge, which cannot be controlled consciously.
To test whether participants gained conscious knowledge of the statistical regularities, we calculated the percentage of producing highprobability triplets in the inclusion and exclusion conditions separately. Then we compared these percentages to chance level, which is 25% in this case, because out of the 64 possible triplets that participants can generate, only 16 triplets are more predictable in any of the ASRT sequences. If participants generated more highprobability triplets in the inclusion task than it would have been expected by chance, it can indicate either implicit or explicit knowledge about the statistical regularities. In contrast, if participants generated more highprobability triplets in the exclusion task than it would have been expected by chance, it can only indicate implicit knowledge and lack of conscious control over their knowledge because they failed to exclude this knowledge.
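The scoring of a generated run can be sketched as follows. This example is illustrative and is not the scoring script used in the study: the function name `high_triplet_proportion` is hypothetical, and it assumes that every overlapping run of three consecutive key presses is counted as a triplet, a detail the text does not specify.

```python
def high_triplet_proportion(key_presses, pattern):
    """Proportion of generated triplets that are high-probability
    triplets of the given ASRT pattern."""
    n = len(pattern)
    high = {(pattern[i], pattern[(i + 1) % n]) for i in range(n)}
    # A triplet is identified by its first and third elements.
    pairs = [
        (key_presses[i], key_presses[i + 2])
        for i in range(len(key_presses) - 2)
    ]
    return sum(pair in high for pair in pairs) / len(pairs)


CHANCE_LEVEL = 16 / 64  # 16 high-probability triplets out of 64 possible

run = [2, 3, 1, 4, 3, 2, 4, 1]  # toy run that follows the 2-r-1-r-3-r-4-r structure
print(high_triplet_proportion(run, [2, 1, 3, 4]))  # 1.0
print(CHANCE_LEVEL)  # 0.25
```

A participant's percentage in each condition would then be compared with the 25% chance level.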
Participants generated 7.76% more highprobability triplets in the inclusion task than it would have been expected by chance (t(28) = 5.37, p < 0.001, BF_{01} = 4.789 × 10^{−4}). More interestingly, they also generated more highprobability triplets in the exclusion task than it would have been expected by chance (6.19%; t(28) = 4.28, p < 0.001, BF_{01} = 0.007), showing that they lacked the conscious control and were unable to exclude the acquired statistical knowledge to successfully perform this part of the task. The above chance percentage of highprobability triplets did not differ between the inclusion and exclusion conditions (t(28) = 0.78, p = 0.442, BF_{01} = 3.834), suggesting that participants were relying on their implicit knowledge about the statistical regularities both in the inclusion and exclusion conditions.
In summary, these results confirm that participants did not gain explicit knowledge about the statistical regularities but used their implicit knowledge to perform the InclusionExclusion Task.
Discussion
In this study, we have shown clear evidence for the longterm consolidation of statistical memory in a carefully controlled experimental design, which involved interference manipulation. Moreover, we have highlighted how learning processes underlying statistical memory changed over a longer stretch of time. Statistical memory scores were similar after 24 hours and one year, irrespective of the type of learning measure (i.e., accuracy or RT, see Supplementary Material). Participants successfully acquired and stabilized the previously learned material, and after 24 hours, they learnt new, interfering statistical information. Statistical memory performance on the primarily practiced sequence was resistant to the interfering effect of adding a new sequence after 24 hours and one year in a comparable degree, which indicates that the acquired statistical knowledge remained persistent over time^{19}.
Previous studies have shown that some aspects of skill acquisition are based on probabilistic perception and probabilistic learning^{1, 11, 46, 47}. However, it has not been proven that statistical learning alone can lead to longlasting representations, because in other studies and observations, several confounding factors were present: For instance, practice after the initial acquisition of statistical regularities together with the intervention of higherorder cognitive processes (as a result of the person intending to learn the given skill) could lead to reactivation, reconsolidation, and substantial alteration of the original memory traces. Therefore, our study took five possible confounds of consolidation into account. First, we controlled for shortterm (i.e., 24hour) consolidation of the acquired knowledge of conditional probabilities by inserting a Testing Phase. Second, we used identical design in the Testing and the Retesting Phase by inserting an interference epoch in both sessions in order to test both resistance to forgetting and resistance to interference after one year. In addition, learning on the interference sequence and resistance to interference were directly tested by quantifying how successfully participants have adapted to the changed probabilities in the interference and noninterference epochs. Third, we ruled out the possibility of relearning by showing better performance after the oneyear period than at the very beginning of the Learning Phase. Fourth, there was no intervening practice during the oneyear period, minimizing the possibility of reactivation of the acquired statistical memory during this time window. Fifth, learning was implicit because participants were unaware of the learning situation, the statistical structure of the stimulus stream, as well as of the fact that they will be tested one year later, controlling for any confounding effects of explicit strategy use during memory encoding and consolidation. 
Moreover, our results are supported by Bayesian statistics in addition to general linear models (cf. Materials and methods). Although our conclusions are based on a restricted sample of participants with robust statistical memory before the one-year delay, results of the full sample also indicate that the acquired statistical memory trace was resistant to interference as well as to forgetting after one year (see the Analysis of the full sample section of the Supplementary Material). With the latter analyses, we could ensure that the results are not influenced by a possible sample selection bias. Therefore, owing to the rigorous methodology applied, results of the present study further extend the findings of Romano et al.^{25} about the long-term retention of statistical regularities.
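To illustrate how a Bayes factor can quantify support for "no change" between two sessions, the following minimal sketch uses the BIC approximation described by Wagenmakers^{35}. It is not presented as the computation behind the reported results: the function name, the sample size, and the t values are hypothetical, and the approximation is swapped in here only because it fits in a few lines.

```python
import math

def bf01_from_t(t, n):
    """BIC-approximated Bayes factor in favor of the null hypothesis
    for a one-sample or paired t test with n observations.

    Hypothetical helper for illustration: values above 1 favor the null
    (e.g., no difference between sessions), values below 1 favor a difference.
    """
    # The alternative model has one extra parameter, so
    # BIC1 - BIC0 = n * ln(SSE1 / SSE0) + ln(n), with SSE1 / SSE0 = 1 / (1 + t^2 / (n - 1)).
    # BF01 = exp((BIC1 - BIC0) / 2) then simplifies to the expression below.
    return math.sqrt(n) * (1.0 + t * t / (n - 1)) ** (-n / 2.0)

# Hypothetical example: 25 participants, paired comparison of two sessions.
print(bf01_from_t(0.0, 25))  # a t of zero gives the maximum support for the null
print(bf01_from_t(3.0, 25))  # a large t shifts the evidence toward a difference
```

With a t value near zero the factor exceeds 1, which is the pattern one would expect when memory scores are similar across the two delays.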
The retention of statistical knowledge after the long delay extends the findings of Nemeth and Janacsek^{14} and Meier and Cock^{16}, who found comparable retention of sequential memory across 12-hour, 24-hour, and one-week delay intervals. It is conceivable that the processes related specifically to the retention of statistical knowledge cease to change as early as 12 hours after learning (see also refs 29 and 48), which is also in line with our finding that the acquired statistical knowledge was equally robust to interference after 24 hours and after one year. Our results are also comparable to the findings of Arciuli and Simpson^{49}, who showed stable, consistent, and sleep-independent statistical learning over the medium term (30 min and 1, 2, 4, and 24 hours) in a different statistical learning task (embedded triplet paradigm).
In our design, general skill improvements refer to general speed-up, independent of the statistical structure of the task, reflecting more general learning mechanisms. Previous studies^{14, 16} found improved general skills both after 24 hours and one week compared to the end of the training session, but the degree of improvement did not differ between the two delay intervals. Moreover, retained general skills were also found after one year^{25}. In the present study, general skills were retained over the one-year period when measured by accuracy (see Resistance to forgetting section of the Supplementary Material) but declined when measured by RT (i.e., overall RTs were slower). It is possible that the lack of practice on the ASRT task affected only the speed and not the precision of visuomotor coordination, which resulted in slower RTs after the one-year delay. This finding suggests that some aspects of general skills undergo forgetting over one year if there is no intervening practice. However, overall accuracy and RT were decreased after one year as compared to the beginning of the first session (see the main effects of SESSION in Table S1), suggesting that the general skill was retained at least to some degree (cf. ref. 25). The latter evidence also corroborates our previous statement about the implausibility of relearning the ASRT task after the offline period. Nevertheless, future studies need to disentangle how these aspects of general skills consolidate over a longer stretch of time (cf. ref. 21). In the neuroscience of skill learning, a long-standing issue is that general skill learning mechanisms are heavily intertwined with statistical or sequence-specific learning, which makes it difficult to draw conclusions about statistical learning itself. Following the protocol of previous studies^{50, 51}, here we separated pure statistical learning from these mechanisms and directly investigated the one-year retention of pure statistical learning.
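The distinction between general skill and statistical learning can be sketched as follows. The scoring rule is an assumption made for illustration, not a definition taken from this section: general skill is taken as the overall mean RT of an epoch, and the statistical learning score as the mean RT difference between less predictable and more predictable trials. The function name and all RT values are hypothetical.

```python
from statistics import mean

def asrt_scores(trials):
    """Return (general_skill, statistical_learning) for one epoch.

    trials: list of (rt_ms, is_high_probability) tuples (hypothetical format).
    general_skill is the overall mean RT, independent of trial type.
    statistical_learning is mean RT on low-probability trials minus
    mean RT on high-probability trials (assumed scoring rule).
    """
    high = [rt for rt, is_high in trials if is_high]
    low = [rt for rt, is_high in trials if not is_high]
    general_skill = mean(rt for rt, _ in trials)
    statistical_learning = mean(low) - mean(high)
    return general_skill, statistical_learning

# Hypothetical epochs: responses speed up overall (general skill), and the
# gap between trial types widens (statistical learning).
epoch_start = [(420, True), (430, True), (450, False), (460, False)]
epoch_later = [(380, True), (390, True), (430, False), (440, False)]

print(asrt_scores(epoch_start))
print(asrt_scores(epoch_later))
```

Because the learning score is a within-epoch difference, a uniform slowing of all responses after a long delay would change the general skill measure while leaving the learning score untouched.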
Our results, which suggest that the representation of the original statistical structure is immune to interference and that learning a similar but new statistical structure is more demanding, extend the study of Gebhart et al.^{52}. In an auditory statistical learning task where two different statistical structures (artificial languages) determined the presented stimuli, Gebhart et al.^{52} showed that participants could learn only the first structure of speech streams if no explicit information was given about the change in structure during the task or if the second structure was not presented for a longer duration. Accordingly, it is conceivable that more blocks of the interference sequence in our design could have increased the chance to relearn the interference sequence after one year had elapsed, and the primacy effect (see also e.g., refs 53,54,55,56) of the first statistical pattern could have been disrupted. We should note that the difference in learning scores in the interference epochs between 24 hours and one year was relatively small in the restricted sample, which might have changed if more opportunity to practice had been provided. In the full sample, the acquisition of new statistical information on the interference sequence after 24 hours was limited, and no difference was found in the learning scores of the interference epochs between 24 hours and one year. Therefore, we could not show robust evidence for a change in learning a new statistical structure over a long delay. Importantly, as the Gebhart et al.^{52} study showed, learning a new structure did not attenuate performance on the original structure, which was also the case in our study after the 24-hour delay (i.e., resistance to interference).
Nevertheless, it remains unclear whether far more practice on the interference sequence could cause performance deterioration on the non-interference sequence, or whether the different representations of the two statistical structures could be maintained at the same time so that individuals could flexibly switch between them.
An advantage of having a more stable or elaborated primary structure is that the underlying cognitive/perceptual mechanisms remain sensitive to this structure later, even if the organism has to learn other statistical regularities in the meantime^{52}. Meanwhile, it has been shown that stable initial memory representations prevented learning transfer between different memory tasks^{57}. The long-term impact of the primarily acquired statistical structure and its predictive power have also been demonstrated in the perception of the auditory environment^{58, 59} and in processing native and non-native phonetic features of word stress^{60}, as indicated by event-related brain potentials (e.g., the mismatch negativity). In line with these previous studies, our results suggest a more influential role of the primarily learnt statistical structure.
Results of the present study are also compatible with the strong impression coming from daily experience that skills, such as speaking a language or playing tennis, once acquired, are persistent throughout life. Moreover, by giving insight into the dynamic change of the underlying learning processes, we could provide an experimentally well-controlled design and a possible explanatory framework for other studies investigating the long-term retention of statistical structures embedded in other perceptual/cognitive domains under more natural circumstances. For instance, in a small sample of participants, Frank et al.^{12} found retention of large-scale artificial languages even after three years, although participants were only exposed to these languages for 10 days without directly paying attention to the presented chunks of languages. The authors claimed this was evidence that statistical learning skills related to speech segmentation could be applied to the lexicons of natural languages. A simple paradigm such as the ASRT task might be used over an even longer time period to test the upper bound of the retention of statistical knowledge, and to obtain a clearer insight into the characteristics of processes determining consolidation on a scale as large as language acquisition (see also ref. 61).
Taken together, the present study shows that probabilistic mechanisms are not only present in perception and learning but also that their results remain stable over longer periods of time. Specifically, we demonstrated that statistical knowledge was resistant to interference and also to forgetting after one year. Our experimental design also enabled us to test how the neurocognitive processes underlying statistical learning changed over this time period. In the long run, these results can help build a better computational framework^{46} of systems-level brain mechanisms that underlie learning and memory.
References
 1.
Orbán, G., Fiser, J., Aslin, R. N. & Lengyel, M. Bayesian learning of visual chunks by human observers. Proc Natl Acad Sci 105, 2745–2750, doi:10.1073/pnas.0708424105 (2008).
 2.
Fiser, J. & Aslin, R. N. Statistical learning of new visual feature combinations by infants. Proc Natl Acad Sci 99, 15822–15826, doi:10.1073/pnas.232472899 (2002).
 3.
Winkler, I., Denham, S. L. & Nelken, I. Modeling the auditory scene: predictive regularity representations and perceptual objects. Trends Cogn Sci 13, 532–540, doi:10.1016/j.tics.2009.09.003 (2009).
 4.
Yang, Z. & Purves, D. A statistical explanation of visual space. Nat Neurosci 6, 632–640, doi:10.1038/nn1059 (2003).
 5.
Teinonen, T., Fellman, V., Näätänen, R., Alku, P. & Huotilainen, M. Statistical language learning in neonates revealed by eventrelated brain potentials. BMC Neurosci 10, 1–8, doi:10.1186/147122021021 (2009).
 6.
TurkBrowne, N. B., Scholl, B. J., Johnson, M. K. & Chun, M. M. Implicit perceptual anticipation triggered by statistical learning. J Neurosci 30, 11177–11187, doi:10.1523/jneurosci.085810.2010 (2010).
 7.
Bar, M. The proactive brain: using analogies and associations to generate predictions. Trends Cogn Sci 11, 280–289, doi:10.1016/j.tics.2007.05.005 (2007).
 8.
Kaufman, S. B. et al. Implicit learning as an ability. Cognition 116, 321–340, doi:10.1016/j.cognition.2010.05.011 (2010).
 9.
Ullman, M. T. Contributions of memory circuits to language: the declarative/procedural model. Cognition 92, 231–270, doi:10.1016/j.cognition.2003.10.008 (2004).
 10.
Hallgato, E., GyoriDani, D., Pekar, J., Janacsek, K. & Nemeth, D. The differential consolidation of perceptual and motor learning in skill acquisition. Cortex 49, 1073–1081, doi:10.1016/j.cortex.2012.01.002 (2013).
 11.
Saffran, J. R., Aslin, R. N. & Newport, E. L. Statistical Learning by 8MonthOld Infants. Science 274, 1926–1928, doi:10.1126/science.274.5294.1926 (1996).
 12.
Frank, M. C., Tenenbaum, J. B. & Gibson, E. Learning and LongTerm Retention of LargeScale Artificial Languages. PLoS One 8, e52500, doi:10.1371/journal.pone.0052500 (2013).
 13.
Genzel, L. & Robertson, E. M. To Replay, Perchance to Consolidate. PLoS Biol 13, e1002285, doi:10.1371/journal.pbio.1002285 (2015).
 14.
Nemeth, D. & Janacsek, K. The dynamics of implicit skill consolidation in young and elderly adults. J Gerontol B Psychol Sci Soc Sci 66, 15–22, doi:10.1093/geronb/gbq063 (2011).
 15.
Robertson, E. M. From Creation to Consolidation: A Novel Framework for Memory Processing. PLoS Biol 7, e1000019, doi:10.1371/journal.pbio.1000019 (2009).
 16.
Meier, B. & Cock, J. Offline consolidation in implicit sequence learning. Cortex 57, 156–166, doi:10.1016/j.cortex.2014.03.009 (2014).
 17.
Krakauer, J. W. & Shadmehr, R. Consolidation of motor memory. Trends Neurosci 29, 58–64, doi:10.1016/j.tins.2005.10.003 (2006).
 18.
Song, S., Howard, J. H. Jr. & Howard, D. V. Sleep does not benefit probabilistic motor sequence learning. J Neurosci 27, 12475–12483, doi:10.1523/jneurosci.206207.2007 (2007).
 19.
Robertson, E. M., PascualLeone, A. & Miall, R. C. Current concepts in procedural consolidation. Nat Rev Neurosci 5, 576–582, doi:10.1038/nrn1426 (2004).
 20.
Robertson, E. M. New Insights in Human Memory Interference and Consolidation. Curr Biol 22, R66–R71, doi:10.1016/j.cub.2011.11.051 (2012).
 21.
Hikosaka, O. et al. Longterm retention of motor skill in macaque monkeys and humans. Exp Brain Res 147, 494–504, doi:10.1007/s0022100212587 (2002).
 22.
Willingham, D. B. & Dumas, J. A. Longterm retention of a motor skill: Implicit sequence knowledge is not retained after a oneyear delay. Psychol Res 60, 113–119, doi:10.1007/BF00419684 (1997).
 23.
Ammons, R. B. et al. Longterm retention of perceptualmotor skills. J Exp Psychol 55, 318–328, doi:10.1037/h0041893 (1958).
 24.
Fleishman, E. A. & Parker, J. F. Jr. Factors in the retention and relearning of perceptualmotor skill. J Exp Psychol 64, 215–226, doi:10.1037/h0041220 (1962).
 25.
Romano, J. C., Howard, J. H. Jr. & Howard, D. V. Oneyear retention of general and sequencespecific skills in a probabilistic, serial reaction time task. Memory 18, 427–441, doi:10.1080/09658211003742680 (2010).
 26.
Albouy, G. et al. Both the hippocampus and striatum are involved in consolidation of motor sequence memory. Neuron 58, 261–272, doi:10.1016/j.neuron.2008.02.008 (2008).
 27.
Nemeth, D., Janacsek, K., Polner, B. & Kovacs, Z. A. Boosting human learning by hypnosis. Cereb Cortex 23, 801–805, doi:10.1093/cercor/bhs068 (2013).
 28.
Howard, J. H. Jr. & Howard, D. V. Age differences in implicit learning of higher order dependencies in serial patterns. Psychol Aging 12, 634–656, doi:10.1037/08827974.12.4.634 (1997).
 29.
Nemeth, D. et al. Sleep has no critical role in implicit motor sequence learning in young and old adults. Exp Brain Res 201, 351–358, doi:10.1007/s002210092024x (2010).
 30.
Nemeth, D., Janacsek, K. & Fiser, J. Agedependent and coordinated shift in performance between implicit and explicit skill learning. Front Comput Neurosci 7, 147, doi:10.3389/fncom.2013.00147 (2013).
 31.
Reber, A. S. Implicit learning and tacit knowledge. J Exp Psychol Gen 118, 219–235, doi:10.1037/00963445.118.3.219 (1989).
 32.
Cleeremans, A. & Dienes, Z. In The Cambridge Handbook of Computational Modeling (ed. R. Sun) 396–421 (Cambridge University Press, 2008).
 33.
Virag, M. et al. Competition between frontal lobe functions and implicit sequence learning: evidence from the longterm effects of alcohol. Exp Brain Res 233, 2081–2089, doi:10.1007/s0022101542798 (2015).
 34.
Wagenmakers, E. J., Wetzels, R., Borsboom, D. & van der Maas, H. L. Why psychologists must change the way they analyze their data: the case of psi: comment on Bem (2011). J Pers Soc Psychol 100, 426–432, doi:10.1037/a0022790 (2011).
 35.
Wagenmakers, E. J. A practical solution to the pervasive problems of p values. Psychon Bull Rev 14, 779–804, doi:10.3758/BF03194105 (2007).
 36.
Dienes, Z. Using Bayes to get the most out of nonsignificant results. Front Psychol 5, 781, doi:10.3389/fpsyg.2014.00781 (2014).
 37.
Dienes, Z. Bayesian Versus Orthodox Statistics: Which Side Are You On? Perspect Psychol Sci 6, 274–290, doi:10.1177/1745691611406920 (2011).
 38.
JASP Team. JASP (Version 0.8.0.0) [Computer software]. URL https://jasp-stats.org/ (2016).
 39.
Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D. & Iverson, G. Bayesian t tests for accepting and rejecting the null hypothesis. Psychon Bull Rev 16, 225–237, doi:10.3758/pbr.16.2.225 (2009).
 40.
Jacoby, L. L. A process dissociation framework: Separating automatic from intentional uses of memory. J Mem Lang 30, 513–541, doi:10.1016/0749596X(91)90025F (1991).
 41.
Destrebecqz, A. & Cleeremans, A. Can sequence learning be implicit? New evidence with the process dissociation procedure. Psychon Bull Rev 8, 343–350, doi:10.3758/BF03196171 (2001).
 42.
Destrebecqz, A. et al. The neural correlates of implicit and explicit sequence learning: Interacting networks revealed by the process dissociation procedure. Learn Memory 12, 480–490, doi:10.1101/lm.95605 (2005).
 43.
Jimenez, L., Vaquero, J. M. & Lupianez, J. Qualitative differences between implicit and explicit sequence learning. J Exp Psychol Learn Mem Cogn 32, 475–490, doi:10.1037/02787393.32.3.475 (2006).
 44.
Fu, Q., Dienes, Z. & Fu, X. Can unconscious knowledge allow control in sequence learning? Conscious Cogn 19, 462–474, doi:10.1016/j.concog.2009.10.001 (2010).
 45.
Fu, Q., Dienes, Z. & Fu, X. The distinction between intuition and guessing in the SRT task generation: a reply to Norman and Price. Conscious Cogn 19, 478–480, doi:10.1016/j.concog.2009.12.006 (2010).
 46.
Fiser, J., Berkes, P., Orban, G. & Lengyel, M. Statistically optimal perception and learning: from behavior to neural representations. Trends Cogn Sci 14, 119–130, doi:10.1016/j.tics.2010.01.003 (2010).
 47.
Fiser, J. & Aslin, R. N. Unsupervised statistical learning of higherorder spatial structures from visual scenes. Psychol Sci 12, 499–504, doi:10.1111/14679280.00392 (2001).
 48.
Press, D. Z., Casement, M. D., PascualLeone, A. & Robertson, E. M. The time course of offline motor sequence learning. Cognitive Brain Res 25, 375–378, doi:10.1016/j.cogbrainres.2005.05.010 (2005).
 49.
Arciuli, J. & Simpson, I. C. Statistical learning is lasting and consistent over time. Neurosci Lett 517, 133–135, doi:10.1016/j.neulet.2012.04.045 (2012).
 50.
Hunt, R. H. & Aslin, R. N. Statistical learning in a serial reaction time task: access to separable statistical cues by individual learners. J Exp Psychol Gen 130, 658–680, doi:10.1037/00963445.130.4.658 (2001).
 51.
Howard, D. V. et al. Implicit sequence learning: effects of level of structure, adult age, and extended practice. Psychol Aging 19, 79–92, doi:10.1037/08827974.19.1.79 (2004).
 52.
Gebhart, A. L., Aslin, R. N. & Newport, E. L. Changing Structures in Midstream: Learning Along the Statistical Garden Path. Cogn Sci 33, 1087–1116, doi:10.1111/j.15516709.2009.01041.x (2009).
 53.
Junge, J. A., Scholl, B. J. & Chun, M. M. How is spatial context learning integrated over signal versus noise? A primacy effect in contextual cueing. Vis Cogn 15, 1–11, doi:10.1080/13506280600859706 (2007).
 54.
da Estrela, C. & ByersHeinlein, K. VoisTu Le Kem? Do You See the Bos? Foreign Word Learning at 14 Months. Infancy 21, 505–521, doi:10.1111/infa.12126 (2015).
 55.
Yu, R. Q. & Zhao, J. The persistence of the attentional bias to regularities in a changing environment. Atten Percept Psychophys 77, 2217–2228, doi:10.3758/s1341401509305 (2015).
 56.
Billig, A. J. & Carlyon, R. P. Automaticity and primacy of auditory streaming: Concurrent subjective and objective measures. J Exp Psychol Human 42, 339–353, doi:10.1037/xhp0000146 (2016).
 57.
Mosha, N. & Robertson, E. M. Unstable Memories Create a HighLevel Representation that Enables Learning Transfer. Curr Biol 26, 100–105, doi:10.1016/j.cub.2015.11.035 (2016).
 58.
Mullens, D. et al. Altering the primacy bias—How does a prior task affect mismatch negativity? Psychophysiology 51, 437–445, doi:10.1111/psyp.12190 (2014).
 59.
Todd, J., Provost, A. & Cooper, G. Lasting first impressions: A conservative bias in automatic filters of the acoustic environment. Neuropsychologia 49, 3399–3405, doi:10.1016/j.neuropsychologia.2011.08.016 (2011).
 60.
Honbolygó, F. & Csépe, V. Saliency or template? ERP evidence for longterm representation of word stress. Int J Psychophysiol 87, 165–172, doi:10.1016/j.ijpsycho.2012.12.005 (2013).
 61.
MorganShort, K., Finger, I., Grey, S. & Ullman, M. T. Second Language Processing Shows Increased NativeLike Neural Responses after Months of No Exposure. PLoS One 7, e32974, doi:10.1371/journal.pone.0032974 (2012).
Acknowledgements
This research was supported by the Research and Technology Innovation Fund, Hungarian Brain Research Program (KTIA NAP 13220150002); Hungarian Scientific Research Fund (OTKA NF 105878); Postdoctoral Fellowship of the Hungarian Academy of Sciences (to A.K.); and Janos Bolyai Research Fellowship of the Hungarian Academy of Sciences (to K.J.). We would like to thank István Winkler for his comments on the previous version of the manuscript.
Author information
Contributions
A.K. analyzed data and wrote the manuscript; K.J. designed the study, supervised data acquisition, analyzed data, and wrote the manuscript; Á.T. analyzed data and wrote the manuscript; D.N. designed the study, supervised data acquisition, wrote the manuscript, and provided intellectual and financial support. All authors contributed equally to the work and jointly supervised the work.
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kóbor, A., Janacsek, K., Takács, Á. et al. Statistical learning leads to persistent memory: Evidence for one-year consolidation. Sci Rep 7, 760 (2017). https://doi.org/10.1038/s41598-017-00807-3