Inhibitory control hinders habit change

Our habits constantly influence the environment, often in negative ways that amplify global environmental and health risks. Hence, change is urgent. To facilitate habit change, inhibiting unwanted behaviors appears to be a natural human reaction. Here, we use a novel experimental design to test how inhibitory control affects two key components of changing (rewiring) habit-like behaviors in healthy humans: the acquisition of new habit-like behavior and the simultaneous unlearning of an old one. We found that, while the new behavior was acquired, the old behavior persisted and coexisted with the new. Critically, inhibition hindered both overcoming the old behavior and establishing the new one. Our findings highlight that suppressing unwanted behaviors is not only ineffective but may even further strengthen them. Meanwhile, actively engaging in a preferred behavior appears indispensable for its successful acquisition. Our design could be used to uncover how new approaches affect the cognitive basis of changing habit-like behaviors.

Initial acquisition and subsequent unlearning of associations that were no longer relevant due to the structural change in the task. Learning successfully occurred in the Learning phase (Fig. 3a, circled area): participants showed increasingly higher learning scores ('LL minus HL', underlined letters indicating the triplet probabilities of the current comparison), reflecting faster responses to trials that were high-probability in Sequence A compared to low-probability ones (for raw RTs see Fig. S1a). This old knowledge was then partially unlearned during the Rewiring phase (Fig. 3a, non-circled area), in which originally high-probability trials became less probable ('LL minus HL'; thus, both trial types compared were low-probability in Sequence B). The different time course of (un)learning across the two phases is indicated by the significant Phase × Period interaction (F(2, 60) = 5.70, p = 0.005, ηp² = 0.160). Specifically, participants gradually acquired the associations of Sequence A (Period 1 vs. Period 3: p = 0.002, Cohen's d = 0.60, BF01 = 0.059), with learning scores differing signifi[...]
During the Learning phase, participants extensively practiced a four-choice visuomotor reaction time task over 3600 trials, divided into three periods. In this task, a stimulus appeared in one of four horizontally arranged circles on the screen, and participants were asked to respond as quickly and accurately as they could using a response box. The associations of Sequence A (referred to as old knowledge) were acquired in this phase. Then during the Rewiring phase, a structural change was introduced to the task with Sequence B to prompt the rewiring of the old knowledge by acquiring the associations of this new sequence (referred to as new knowledge).
Additionally, to engage participants' inhibitory control processes in this phase, they were asked to suppress their responses on some trials (stimuli underlined with a red line during the task, No-go trials), but could respond on other (Go) trials. This phase also consisted of 3600 trials, divided into three periods. In the Testing phase, using a shorter version of the task, knowledge of both sequences was probed in a counterbalanced order (ABAB or BABA on the figure, where A and B refer to the sequence used in the Learning and Rewiring phases, respectively). Here, responses were allowed on all trials, including previously suppressed No-go trials, to assess the effect of inhibitory control on rewiring. The stimulus was taken from the public domain (retrieved on 26/09/2017 from: www.pixabay.com).
Acquisition of new knowledge after structural change in the task. Learning of the new sequence occurred in the Rewiring phase (Fig. 3b, circled area): participants showed increasingly higher learning scores ('LL minus LH'), indicating faster responses to trials that were high-probability in Sequence B compared to low-probability ones (for raw RTs see Fig. S1a). Note that these associations were all low-probability in Sequence A; therefore, no learning was expected for them in the Learning phase ('LL minus LH'). The different time course of learning across the two phases was revealed by the significant Phase × Period interaction (F(2, 60) = 3.89, p = 0.026, ηp² = 0.115). Specifically, performance did not change significantly during the Learning phase (pairwise comparisons of periods: all ps ≥ 0.282, Cohen's ds ≤ 0.20, BF01s ≥ 3.019) and did not differ significantly from zero (all ps > 0.339, Cohen's ds ≤ 0.17, BF01s ≥ 3.390). In the Rewiring phase, learning scores increased from Period 1 to Period 3 (p = 0.019, Cohen's d = 0.44, BF01 = 0.390) and became greater than zero by the end of the task (Period 3: p = 0.026, Cohen's d = 0.42, BF01 = 0.503).
The main effects were not significant (Phase: F(1, 30) = 1.60, p = 0.216, ηp² = 0.051; Period: F(2, 60) = 0.20, p = 0.820, ηp² = 0.007). In summary, these results confirm that participants acquired the associations of the new sequence after the structural change in the task.
(a) Overall, pairs of six unique sequences were used in a counterbalanced order. Due to the alternating sequence structure, some runs of three consecutive trials were more probable than others (referred to as high- vs. low-probability triplets, respectively) 29 . An example of the difference between Sequence A and Sequence B used in the Learning and Rewiring phases, respectively, is shown by the underlined numbers. Due to this structural change in the task, the probability of some triplets changed from the Learning phase to the Rewiring phase: 75% of the initially high-probability triplets became low-probability (HL trials; thus, the first letter refers to the triplet probability in Sequence A, while the second letter refers to the probability of the same triplet in Sequence B) and were replaced by new high-probability triplets that were initially low-probability (LH trials). Additionally, the occurrence probability of some triplets remained constant: either being low-probability (LL trials) or high-probability (HH trials) in both phases (for further details see "Methods" section). (b) Learning scores were calculated as differences in response times to trials with changed (LH or HL) versus unchanged occurrence probabilities (LL or HH). This enabled us to assess how participants initially acquired the associations of Sequence A, and then updated their knowledge when practicing Sequence B. For example, we expected similarly slow responses to LH and LL trials in the Learning phase (as both were low-probability) but then faster responses to LH than LL in the Rewiring phase, indicating the acquisition of the more probable associations of Sequence B in this phase.
Please note that all HH trials were Go during the Rewiring phase (for further details, see the "Supplementary methods" section in Supplementary Information). Consequently, learning scores involving LL trials were the primary measures of interest as these could be used to assess the effect of inhibitory control on rewiring (by contrasting learning scores calculated on those trials that were Go vs. No-go in the Testing phase).
How did the inhibition of responses during rewiring affect the old knowledge? In the Testing phase, we probed whether the old knowledge (using the 'LL minus HL' learning score) was expressed both in the old testing context (when the order of stimulus presentation followed Sequence A) and the new one (when stimulus presentation followed Sequence B; see also Fig. 1 for design). Knowledge on the previously Go and No-go trials was contrasted in both testing contexts. As expected, learning scores were significantly higher when tested on Sequence A than on Sequence B (main effect of Sequence: F(1, 30) = 10.11, p = 0.003, ηp² = 0.252), regardless of the Go/No-go manipulation. At the same time, they were significantly above zero in both contexts, indicating that the old knowledge was expressed not only in its original context (Sequence A; 'LL minus HL', p < 0.001, Cohen's d = 1.22, BF01 = 1.845e−5) but also in the new one (Sequence B; 'LL minus HL', p < 0.001, Cohen's d = 0.71, BF01 = 0.147), where it was no longer relevant.
Crucially, the magnitude of learning scores depended both on the testing context (Sequence A vs. B) and whether responses were inhibited during rewiring (Go vs. No-go trials), as indicated by the significant Sequence × Inhibition interaction (F(1, 30) = 11.81, p = 0.002, ηp² = 0.282). When tested on Sequence A (Fig. 4a, circled area), learning scores were significantly above zero on Go and No-go trials (p = 0.001, Cohen's d = 0.63, BF01 = 0.042; p < 0.001, Cohen's d = 1.25, BF01 = 6.508e−6, respectively) and somewhat greater for the latter (p = 0.018, Cohen's d = 0.45, BF01 = 0.360). This suggests that, instead of facilitating the unlearning process, inhibition potentially strengthened the expression of old knowledge in the old context. When tested on Sequence B (Fig. 4a, non-circled area), learning scores did not differ significantly on Go and No-go trials (p = 0.500, Cohen's d = 0.12, BF01 = 4.210). Importantly, participants performed significantly above zero on both (Go trials: p = 0.004, Cohen's d = 0.55, BF01 = 0.112; No-go trials: p = 0.025, Cohen's d = 0.42, BF01 = 0.486), again indicating that old knowledge was expressed even when it was not relevant, irrespective of whether responses were inhibited during rewiring. From another perspective, learning scores on Go trials did not differ significantly across testing contexts (p = 0.735, Cohen's d = 0.06, BF01 = 4.945). In contrast, learning scores on No-go trials were significantly higher when tested on Sequence A than Sequence B (p < 0.001, Cohen's d = 0.83, BF01 = 0.003), suggesting that the detrimental effect of inhibition (boosting, instead of decreasing old knowledge) was greater in the old context than the new one. The main effect of Inhibition was not significant (F(1, 30) = 0.83, p = 0.37, ηp² = 0.027).
Overall, these results highlight the persistence of old knowledge across testing contexts and suggest a detrimental effect of the inhibition of responses during rewiring.
(a) Old knowledge. When tested on Sequence A (the original, old context), participants showed significantly above-zero performance on Go and No-go trials, with significantly higher learning scores for the latter. This suggests that the old knowledge was present, and that inhibiting responses during rewiring strengthened it instead of facilitating its unlearning. When tested on Sequence B (the new context), participants exhibited similar, significantly above-zero learning scores on Go and No-go trials, suggesting that old knowledge was expressed even when it was not relevant, irrespective of whether responses were inhibited during rewiring. (b) New knowledge. When tested on Sequence B (the relevant, new context), participants showed significantly above-zero learning scores only on Go trials and these learning scores differed significantly from those on No-go trials, indicating that new knowledge could be expressed only if responses were allowed to the relevant stimuli during rewiring. Thus, actively engaging in the new behavior-to-be-learned seemed essential for acquiring (and subsequently accessing) the new knowledge. When tested on Sequence A, participants' learning scores did not differ significantly from zero either on Go or No-go trials. This was expected since contrasted trials were all low-probability in Sequence A ('LL minus LH'). Error bars represent SEM.
Overall, these findings indicate that participants successfully acquired the new knowledge on Go trials (for which responses were allowed during rewiring) and could express it in the appropriate context (i.e., when tested on Sequence B). At the same time, poorer performance on No-go trials suggests that actively engaging in the new behavior-to-be-learned may be essential for acquiring new associations and, consequently, for habit change.

Discussion
Changing habits is challenging 3 , but as threats of environmental and health disasters rapidly increase across the world 1,2 , it is more important than ever to find effective ways to succeed. To do so, it is vital that we gain a thorough understanding of how habits form and change. Previous research has extensively focused on nonhuman animals, reward-related behaviors, and clinical populations in humans, and characterized how simple stimulus-response(-reward) associations contribute to habit formation and change 16,17 . However, it is poorly understood how habit change occurs in healthy humans when more complex associations (i.e., when not only the current stimulus influences the response but a sequence of preceding stimuli) are learned and modified without explicit rewards. These features more closely resemble how habits form and change in daily life. Therefore, by probing how healthy human adults can form and rewire complex associations without explicit rewards, the present study can significantly contribute to our understanding of the key cognitive processes involved in habit change.
Using these features, we created a novel experimental design to test a widely held belief that inhibitory control could promote habit change 19,20 . In this design, we could test the acquisition of new habit-like behaviors and the simultaneous unlearning of old ones, and how inhibitory control affected both. Crucially, following the rewiring process, we probed both the old and new knowledge across original (old) and new testing contexts, and on those trials in which responses were or were not allowed previously, to reveal how inhibitory control affected the entire process of rewiring. We found that inhibiting responses had a detrimental effect on overcoming the old knowledge and establishing the new: old knowledge was retained and expressed not only in its original context but also in the new one; moreover, components of knowledge that were previously inhibited appeared to be even strengthened in the old context (Fig. 4a). New knowledge was expressed only in the new context and for those components to which responses were allowed (Fig. 4b), suggesting that actively engaging in the behavior-to-be-learned may be indispensable for successfully changing habit-like behaviors.
Our findings revealed the persistence of old knowledge in both the old and new contexts, irrespective of whether components were inhibited during rewiring. Recently, a new line of research on the competition between habitual and goal-directed responses following changes in stimulus-outcome 6 or stimulus-response 7 associations has revealed a similar persistence effect. Specifically, following extended training and under time pressure (both shown to favor the expression of habit-like behaviors), reaction times increased for the goal-directed (desired) responses and participants committed a large proportion of habitual (undesired) errors. These findings highlight that habitual ("old") and goal-directed ("new") associations are in conflict during response selection, and, together with the present study, suggest that undesirable habit-like behaviors may exert their influence even if the desired behavior is ultimately executed (see previously not inhibited components of new knowledge exhibited successfully in their corresponding [new] context).
Inhibiting responses during rewiring shows some similarities with extinction learning, whereby the well-established, habit-like behavior (response) fades over time as the previously conditioned stimulus is repeatedly presented without any reinforcer 14,27,30,31 . Following extinction, relapse (reoccurrence of the extinct behavior/response) is often observed 17,32 . Our findings in the Testing phase show that relapse can occur not only when human participants encounter the original context e.g. 33,34 (akin to extinction learning studies) but also in the new context. This suggests that inhibiting unwanted behavior in everyday situations is ineffective in changing habits e.g. 35 . Importantly, as opposed to the typical settings in extinction studies, our results were observed without any explicit rewards being involved in either learning or rewiring, and alternative associations could be learned to replace the old ones (instead of just unlearning them). The persistence of old knowledge despite these characteristics suggests that extinction studies may underestimate the effect of suppressing old behaviors in habit change.
Our findings also suggest that inhibiting responses may even further strengthen cognitive representations underlying the original behavior we want to replace, resulting in a rebound effect. This is based on participants exhibiting higher learning scores on the previously inhibited components of old knowledge ('LL minus HL', No-go) compared to those that were not inhibited ('LL minus HL', Go), when tested in the old context (Sequence A). Note, however, that the effect size for this finding was slightly smaller (Cohen's d = 0.45) than the one used in the a priori calculations (a Cohen's d of 0.50; see the "Estimation of required sample size" section in the Supplementary Information) and, consequently, the post-hoc power appeared somewhat lower than expected (power = 0.68 for two-tailed comparisons, instead of the expected 0.80). Therefore, future studies are needed to replicate this rebound effect 6,7 . Beyond the persistence of old knowledge, our design could also reveal that old and new knowledge coexisted in the new context (at least for those trials in which responses were allowed during rewiring). We observed this effect both in reaction time (RT) (reported in the main text) and response accuracy measures (see Supplementary Information). This finding could explain the competition that could occur between old and new behaviors during habit change, and thus serve as the cognitive basis for such competition 27 . To translate these findings to a real-life example, let us suppose that Mary has just moved to Country B. Here, recycling is much more prevalent than in her previous residence in Country A, and she has therefore had to start dividing household waste into different bins depending on its material. In this case, the old behavior (throwing all household waste into the same bin) is expected to be gradually unlearned and replaced by the new behavior (dividing waste into separate bins).
Despite the decision to change her behavior, it is possible that (i) when Mary re-visits Country A (old context) she reverts to not recycling (relapse of the old behavior), and (ii) even in Country B (new context), she might divide waste on some occasions but not on others (coexistence of old and new behaviors). Furthermore, Mary may, consciously or unconsciously, suppress some aspects of her habitual behavior of not dividing waste, which could exacerbate the above-described behavioral pattern. Since old and new behaviors coexist, and a continuous inhibition of the old behavior may be unsustainable over longer periods, our findings highlight that interventions using other approaches for habit change must be tested (for further discussion see 18,36,37 ). One might argue that our results are driven by an incomplete acquisition of the new knowledge as suggested by data from the Rewiring phase (see also the "How does acquisition of new knowledge compare with the initial learning process?" section in the Supplementary Information). However, some aspects of performance in the Testing phase suggest otherwise. Specifically, direct comparisons of old and new knowledge indicate that, of those trials on which responses were allowed during rewiring, participants could express old and new knowledge at a similar level, both when compared in their respective contexts (i.e., in Sequence A vs. Sequence B, respectively), as well as in the new (Sequence B) context (see "Is the level of the new knowledge comparable to that of the old knowledge in the Testing phase?" section in the Supplementary information). Since a 24-h delay period was included between the Rewiring and the Testing phases in our design, it is likely that consolidation (i.e., stabilization) of memory traces occurred in this period 23 , facilitating the expression of newly acquired knowledge in the Testing phase. 
Future research should test how rewiring schedules with different durations of practice and different lengths of consolidation periods in-between 38,39 affect old and new knowledge across testing contexts.
In our experimental design, the duration of training for rewiring and the acquisition of old knowledge was the same. Recent studies showed that while we can acquire associative knowledge relatively quickly, updating it requires more extended practice 40,41 . Likewise, non-human animal studies of behavior change usually apply a non-fixed time window of training, lasting until the animal no longer exhibits signs of the original behavior 42,43 . Note, however, that this would be unfeasible in daily life as we may want to change behaviors that were developed and practiced over years or even decades. Consequently, in real-life examples of habit change, holding all other factors constant, we may expect an even weaker acquisition of new behavior and a stronger persistence of old behavior compared to what we observed in the current study. As the same amount of practice for new, preferred behaviors is unfeasible, new approaches need to be found and tested. Importantly, any such approach will need to track both the unlearning of old behavior and the acquisition of new behavior, as well as subsequently probe their coexistence, akin to the design of the current study.
What other factors should future research of habit change consider? While here, both the old and new knowledge were acquired incidentally (see also results of the free generation and triplet sorting tasks in the Supplementary Information), encouraging intentional processes during rewiring (e.g., providing explicit instructions on what aspects of behavior to change) may be beneficial, albeit potentially temporary 23 . This is consistent with the observation that aspects of learning may be initially accessible to consciousness; however, after extended practice, at least some components of the automatic, habitual behaviors are no longer consciously accessible 8,44 .
The age when habits are acquired and then changed should also be considered. Although how people of different ages perform in habit change is poorly understood, research has shown that children, especially under the age of 12, are better at acquiring new complex associations underlying automatic behaviors, while older adults show significant difficulties in doing so, compared to young adults 45,46 . Our current study focused on young adults; investigating the same aspects of habit change in other age groups would be particularly important given the aging population across the world 47 . Since habit change involves not only unlearning old, unwanted behavior but also acquiring new, preferred behavior, we expect poorer performance and even stronger persistence of old behavior in older adults. Meanwhile, the childhood advantage in acquiring automatic behavior could be extensively utilized: ensuring that sustainable habits are learned in childhood could be key to succeeding in the global race for sustainability. Besides age, other characteristics of the sample should also be considered in the future: notably, the present study investigated educated young adults from the western world (often referred to as WEIRD people 48 ), potentially limiting the generalizability of the present findings to a subgroup of the global population.
The present study applied an experimental design that was novel in several respects. First, we could track two key components of changing habit-like behaviors, that is, the acquisition of new knowledge and the simultaneous unlearning of old knowledge within the same task. Second, we investigated complex associations that could be acquired by responding to probability-based relationships between events of a stimulus stream, as opposed to more commonly used simple(r) stimulus-response associations in lab-based tasks. Third, we tested rewiring and the role of inhibitory control without explicit rewards or reinforcers, contrary to most human and non-human lab-based studies 27,43 . We considered this important as using rewards could evoke processes that are specifically related to the reward itself and would change the motivational/emotional aspects of habit change, possibly confounding the measurement of reward-independent learning processes underlying habit formation and change. These characteristics allowed us to more closely model how humans naturally develop habit-like behaviors 44 [...] numerous experimental tasks to test habit learning and change, all grasping (at least somewhat) different aspects of these processes (for more details see the "Behavioral and neural characteristics of habit learning across human and animal studies" section in the Supplementary Information), further studies are needed to adapt our design to and test the role of inhibitory control in habit change with other tasks as well.
In conclusion, using a novel experimental design, we found that even though it is possible to acquire new habit-like behaviors, a parallel inhibition of the unwanted behavior may be maladaptive and may even strengthen the behavior we want to overcome. Thus, although inhibiting unwanted automatic behavior might be a natural reaction when attempting to replace unwanted, unsustainable habits with preferred, sustainable ones 19 , employing inhibitory control during habit change seems to have no beneficial effect on this process. The design developed here could be used to test new approaches to habit change, thereby uncovering how they affect the cognitive basis of old and new habit-like behaviors, independent of reward effects, in healthy adults and other populations. This can help us develop new intervention techniques for habit change and thereby create more adequate policies, improving our odds of replacing unwanted automatic behaviors with preferred ones.

Methods
Participants. Thirty-three healthy undergraduate students participated in the experiment. They were attendees of a non-compulsory university course where course credits could be obtained by participating in scientific experiments and were randomly assigned to the present study. The sample size was determined based on previous studies using similar experimental tasks in within-subject designs 23,24 (for details, see the "Estimation of required sample size" section in Supplementary Information). Participants had normal or corrected-to-normal vision. None of them reported a history of any psychiatric or neurological condition, or substance use. One participant dropped out of the experiment due to technical errors during data collection. Another participant was excluded due to consistent outlier performance (± 2 SDs) on RT measures throughout the experiment. Therefore, 31 [...]
Design. The experiment consisted of three phases, each separated by a 24-h (± 1 h) offline delay (Fig. 1).
During the Learning phase (Day 1), participants performed a widely used and reliable 55 four-choice visuomotor reaction time task called Alternating Serial Reaction Time (ASRT) task 29,56 , in which they acquired the associations of Sequence A. This is referred to as old knowledge throughout the paper. During the Rewiring phase (Day 2), a structural change was implemented in the task by introducing Sequence B. This change prompted the rewiring of old knowledge by acquiring associations of the new sequence. This is referred to as new knowledge. In this phase, participants were asked to suppress their responses on some trials (stimuli underlined with a red line during the task; No-go trials), while they were allowed to respond on other trials (Go trials). During the Testing phase (Day 3), participants completed a shorter version of the task, and performance was tested on both Sequence A and Sequence B in a counterbalanced order. In this phase, participants responded on all trials, including the ones that were No-go trials during the Rewiring phase. This enabled us to test how inhibitory control during rewiring affected the unlearning of old associations and the simultaneous acquisition of new associations. Throughout the experiment, participants were informed that they would participate in an experiment assessing reaction times and response accuracy changes over extended practice; thus, both learning and rewiring occurred incidentally 57 . This was chosen because in everyday situations many habits are developed incidentally 18,44 ; note the current study aimed to test the role of inhibitory control on (un)learning processes and not the effect of incidental vs. intentional processes on rewiring, for that see 23 . For the detailed description of the ASRT task and the structural changes introduced in the Rewiring phase, see the "Supplementary methods" section in the Supplementary Information.
At the end of the Testing phase, a free generation task and a triplet sorting task were administered to probe whether participants acquired consciously accessible knowledge about the sequence and/or the probability structure of the task using recall- and recognition-based approaches, respectively. Since these tasks were not designed to contrast knowledge gained/rewired on Go vs. No-go trials, they served the sole purpose of testing whether any knowledge throughout the task became consciously accessible; the results are reported in the Supplementary Information for comparability across studies and to support future meta-analytic efforts.
Statistical analysis. Learning phase and Rewiring phase. To track the trajectory of the acquisition and unlearning of old knowledge and the simultaneous acquisition of new knowledge, we analyzed the Go trials of these two phases. First, trials were categorized based on whether they were high- or low-probability in the Learning phase (according to Sequence A) and whether they were high- or low-probability subsequently in the Rewiring phase (according to Sequence B). This resulted in four trial types: HL, LH, HH and LL, in which the first letter denotes the probability in the Learning phase and the second letter denotes the probability of the same trial in the Rewiring phase (Fig. 2a; H: high-probability, L: low-probability). Second, data were grouped into three periods, each containing 15 ASRT blocks, in both phases. Third, for each participant, period, and trial type, median RTs for correctly responded trials were computed.
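The trial categorization in the first step can be illustrated with a minimal Python sketch. It assumes the standard alternating structure of the ASRT task, in which pattern elements alternate with random ones, so that a triplet is high-probability when its third element is the pattern successor of its first element. The two four-element sequences below (SEQ_A, SEQ_B) and all function names are hypothetical and chosen only for illustration; they are not the sequences used in the study.

```python
from itertools import product

def high_probability_triplets(pattern):
    """Triplets (first, middle, last) favored by the alternating structure:
    `last` is the pattern element that follows `first`; `middle` can be any
    of the four positions (assumption: standard ASRT structure)."""
    n = len(pattern)
    successor = {pattern[i]: pattern[(i + 1) % n] for i in range(n)}
    return {(first, middle, successor[first])
            for first in pattern for middle in pattern}

def trial_type(triplet, high_a, high_b):
    """Two-letter label: probability in Sequence A, then in Sequence B."""
    first = "H" if triplet in high_a else "L"
    second = "H" if triplet in high_b else "L"
    return first + second

# Hypothetical example sequences that share exactly one pattern transition (1 -> 2).
SEQ_A = (1, 2, 3, 4)
SEQ_B = (1, 2, 4, 3)

high_a = high_probability_triplets(SEQ_A)
high_b = high_probability_triplets(SEQ_B)

# Count how many of the 64 possible triplets fall into each trial type.
counts = {}
for triplet in product((1, 2, 3, 4), repeat=3):
    label = trial_type(triplet, high_a, high_b)
    counts[label] = counts.get(label, 0) + 1

# Share of initially high-probability triplets that became low-probability.
changed_share = counts["HL"] / len(high_a)
```

With these example sequences, 12 of the 16 initially high-probability triplets become low-probability, which is consistent with the 75% reported for the structural change.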
Fourth, learning scores were computed as differences in response times on trials with changed (LH or HL) versus unchanged occurrence probabilities (LL or HH). Specifically, we expected that participants would become increasingly faster on HL trials during the Learning phase, as compared to the LL trials (for raw RT performance see Fig. S1 in the Supplementary Information), resulting in increasingly higher learning scores ('LL minus HL', Fig. 3a) in this phase. This would indicate the acquisition of old knowledge 24,29 . Then, in the Rewiring phase, unlearning of this knowledge would be reflected in smaller/decreasing learning scores as in this case the initially high-probability trials became low-probability. Furthermore, we expected similarly slow responses to LH and LL trials in the Learning phase (reflected in near-zero learning scores) as here both were low-probability, and then faster responses to LH than LL in the Rewiring phase (reflected in increasingly higher/positive learning scores, 'LL minus LH', Fig. 3b), indicating the acquisition of new knowledge in this phase. The LL trials served as a baseline for these learning scores as they helped control for general practice effects, while no speed-up was expected on them due to probability-based learning as they were low-probability in both phases. Finally, repeated-measures analyses of variance (ANOVAs) with Phase (Learning vs. Rewiring) and Period (Period 1, 2, 3) as within-subject factors were performed separately for the two learning scores (testing old and new knowledge).
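The third and fourth steps (median RTs of correct trials, then difference scores against the LL baseline) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the trial record layout (keys participant, period, trial_type, rt, correct) and the toy RT values are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median

def median_rts(trials):
    """Median RT of correct trials per (participant, period, trial_type)."""
    cells = defaultdict(list)
    for t in trials:
        if t["correct"]:  # incorrect responses are excluded
            cells[(t["participant"], t["period"], t["trial_type"])].append(t["rt"])
    return {key: median(rts) for key, rts in cells.items()}

def learning_scores(medians, participant, period):
    """Difference scores relative to the LL baseline (positive = faster than LL)."""
    ll = medians[(participant, period, "LL")]
    return {
        "old": ll - medians[(participant, period, "HL")],  # 'LL minus HL'
        "new": ll - medians[(participant, period, "LH")],  # 'LL minus LH'
    }

# Toy data for one participant and one period (values are made up).
trials = [
    {"participant": 1, "period": 1, "trial_type": "LL", "rt": 400, "correct": True},
    {"participant": 1, "period": 1, "trial_type": "LL", "rt": 420, "correct": True},
    {"participant": 1, "period": 1, "trial_type": "LL", "rt": 900, "correct": False},
    {"participant": 1, "period": 1, "trial_type": "HL", "rt": 380, "correct": True},
    {"participant": 1, "period": 1, "trial_type": "HL", "rt": 390, "correct": True},
    {"participant": 1, "period": 1, "trial_type": "LH", "rt": 405, "correct": True},
    {"participant": 1, "period": 1, "trial_type": "LH", "rt": 415, "correct": True},
]

scores = learning_scores(median_rts(trials), participant=1, period=1)
```

In this toy Learning-phase example, the 'LL minus HL' score is positive (old knowledge) while the 'LL minus LH' score is zero, which mirrors the pattern expected when both LH and LL trials are low-probability.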
Testing phase. In this phase, participants responded on all trials, including the ones that were No-go trials in the Rewiring phase. Therefore, both previously Go and No-go trials were analyzed here to test how inhibitory control during rewiring affected the old and new knowledge.
First, all trials were categorized as described above, resulting in four trial types (HL, LH, LL or HH). Second, data were grouped according to the tested sequence (Sequence A vs. Sequence B), each containing ten ASRT blocks. Third, for each participant, each sequence, each trial type, and each response type (Go or No-go in the Rewiring phase), median RTs for correct trials were computed (for raw RTs see Fig. S1b in the Supplementary Information). Fourth, learning scores ('LL minus HL' and 'LL minus LH' for old and new knowledge, respectively) were computed as described above, separately for Sequence A and Sequence B, and separately for the previously Go vs. No-go trials. Finally, repeated-measures ANOVAs with the tested Sequence (Sequence A vs. Sequence B) and Inhibition (Go vs. No-go) as within-subject factors were performed separately for the two learning scores (testing old and new knowledge). This design enabled us to test (i) whether the old and new knowledge coexisted and was present even when it was irrelevant in a given context (e.g., positive learning score for the old knowledge when tested on Sequence B), and (ii) how inhibitory control during rewiring affected the old and new knowledge in these contexts (by contrasting performance on the previously Go vs. No-go trials, see Fig. 4).
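For a 2 × 2 fully within-subject design such as Sequence × Inhibition, the interaction test is equivalent to a one-sample t-test on each participant's difference of differences, with F(1, n − 1) = t². The sketch below illustrates this equivalence using only the Python standard library; it is not the software used in the study, and the four participants' learning scores are invented for the example.

```python
from math import sqrt
from statistics import mean, stdev

def interaction_contrast(scores):
    """Per-participant difference of differences:
    (No-go minus Go on Sequence A) minus (No-go minus Go on Sequence B)."""
    return [
        (s[("A", "nogo")] - s[("A", "go")]) - (s[("B", "nogo")] - s[("B", "go")])
        for s in scores
    ]

def one_sample_t(values):
    """t statistic for testing whether the mean of `values` differs from zero."""
    return mean(values) / (stdev(values) / sqrt(len(values)))

# Made-up learning scores (ms) for four hypothetical participants.
scores = [
    {("A", "go"): 10, ("A", "nogo"): 25, ("B", "go"): 12, ("B", "nogo"): 17},
    {("A", "go"): 5,  ("A", "nogo"): 30, ("B", "go"): 8,  ("B", "nogo"): 13},
    {("A", "go"): 0,  ("A", "nogo"): 35, ("B", "go"): 10, ("B", "nogo"): 15},
    {("A", "go"): 12, ("A", "nogo"): 32, ("B", "go"): 9,  ("B", "nogo"): 9},
]

contrast = interaction_contrast(scores)
t_value = one_sample_t(contrast)
f_value = t_value ** 2  # interaction F with (1, n - 1) degrees of freedom
```

Framing the interaction as a single contrast also makes the follow-up pairwise comparisons easier to read: each is a paired difference between two of the four cells.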
In all analyses, Greenhouse-Geisser epsilon (ε) correction was used when necessary. Original df values and corrected p values (if applicable) are reported together with partial eta-squared (ηp²) as the measure of effect size. For the significant interactions of the ANOVAs, pair-wise comparisons were performed using LSD post-hoc tests. We report Cohen's d as a measure of effect size for pair-wise comparisons. Additionally, inverse Bayes factors were computed using default JASP priors (JASP v.0.14.1.0 58 ) to see if data provided evidence for the results obtained in the frequentist t-tests (anecdotal evidence for the null-hypothesis: 1 < BF01 < 3, at least substantial evidence for the null-hypothesis: BF01 > 3; anecdotal evidence for the alternative hypothesis: 1 > BF01 > 1/3, at least substantial evidence for the alternative hypothesis: BF01 < 1/3) 59 . To provide further contrasts across the learning scores of the old vs. new knowledge, additional analyses were performed where relevant (see Supplementary Information). All statistical tests were two-tailed. Figures were created using the ggplot2 package 60 .
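As a reading aid for the reported statistics, the following Python sketch shows how partial eta-squared can be recovered from an F value and its degrees of freedom, and how the BF01 thresholds listed above map onto verbal labels. The function names are hypothetical, and the handling of values falling exactly on a threshold (1/3, 1 or 3) is an assumption, since the stated ranges use strict inequalities.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared from an F statistic: F*df1 / (F*df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def bf01_label(bf01):
    """Verbal label for an inverse Bayes factor, following the thresholds
    given in the text. Boundary values (exactly 3 or 1/3) are assigned to
    the 'anecdotal' categories here; this is an assumption."""
    if bf01 > 3:
        return "at least substantial evidence for the null hypothesis"
    if bf01 > 1:
        return "anecdotal evidence for the null hypothesis"
    if bf01 == 1:
        return "no evidence either way"
    if bf01 >= 1 / 3:
        return "anecdotal evidence for the alternative hypothesis"
    return "at least substantial evidence for the alternative hypothesis"

# Values taken from the reported results.
eta_old = round(partial_eta_squared(5.70, 2, 60), 3)           # Phase x Period, old knowledge
eta_interaction = round(partial_eta_squared(11.81, 1, 30), 3)  # Sequence x Inhibition
label_go_vs_nogo_a = bf01_label(0.360)  # Go vs. No-go on Sequence A
label_go_vs_nogo_b = bf01_label(4.210)  # Go vs. No-go on Sequence B
```

Applied to the reported values, the two ANOVA effects reproduce the published ηp² of 0.160 and 0.282, and the Go vs. No-go difference on Sequence A (BF01 = 0.360) falls in the anecdotal range for the alternative hypothesis, in line with the call for replication in the Discussion.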
Although RTs were the primary measures of interest in the current study, we performed similar analyses on the accuracy measures as well. These results are reported in the Supplementary Information, along with the results of the two additional tasks (free generation and triplet sorting tasks), which tested whether participants gained consciously accessible knowledge about the sequence and/or probability structure of the learning task.