Statistical learning of distractor locations is dependent on task context

Through statistical learning, humans can learn to suppress visual areas that often contain distractors. Recent findings suggest that this form of learned suppression is insensitive to context, putting into question its real-life relevance. The current study presents a different picture: we show context-dependent learning of distractor-based regularities. Unlike previous studies which typically used background cues to differentiate contexts, the current study manipulated task context. Specifically, the task alternated from block to block between a compound search and a detection task. In both tasks, participants searched for a unique shape, while ignoring a uniquely colored distractor item. Crucially, a different high-probability distractor location was assigned to each task context in the training blocks, and all distractor locations were made equiprobable in the testing blocks. In a control experiment, participants only performed a compound search task such that the contexts were made indistinguishable, but the high-probability locations changed in exactly the same way as in the main experiment. We analyzed response times for different distractor locations and show that participants can learn to suppress a location in a context-dependent way, but suppression from previous task contexts lingers unless a new high-probability location is introduced.

www.nature.com/scientificreports/ is indeed independent of experimental context (as previous studies suggest), the testing blocks should show lingering suppression that is most pronounced for the last-learned location. That is, in the first testing block one should observe lingering suppression at the location that had a high distractor probability in training context B, and in the second testing block at the high probability distractor location associated with context A. By contrast, if learning is context-dependent, there should be little to no lingering suppression, but instead a revived suppression from the associated context (Fig. 1B). To further characterize the context effect, we also ran a control experiment in which the spatial distractor balance shifted across blocks in the same way as the main experiment, but critically there were no longer different task contexts that could be linked to this distractor manipulation (i.e., the task stayed the same throughout the entire experiment).

Main experiment
Statistically induced distractor suppression has been shown to occur within comparable time spans and experimental setups in compound search tasks, where participants respond to the line orientation inside the uniquely shaped target 14 , as well as visual detection tasks, where participants respond to the presence or absence of a uniquely shaped target 47 . In Experiment 1 we combined these variants of the additional singleton paradigm into a single experiment. We reasoned that they were different enough to induce a different task context, yet similar enough to be analyzed in a comparable way. 44 , an effect with d = 0.6 would require a sample size of 31 to get β = 0.90 when α = 0.05. However, since they did not find a significant result, we attempted to detect a smaller effect size (d = 0.45), which required a sample size of 54 to get β = 0.90 when α = 0.05. The number of non-discarded participants equaled or exceeded this minimal sample size in both experiments. Fifty-four adults (26 male, 27 female, 1 non-binary, mean age = 31, age range: 22 to 50) participated in Experiment 1 through Prolific 49 . They all reported having normal or corrected-to-normal (color) vision, and at minimum an undergraduate degree. Participation took approximately 35 min and participants earned £4, 70. The experiment was approved by the Ethical Committee of the faculty of Behavioral and Movement Sciences of the Vrije Universiteit Amsterdam. Before the experiment, all participants gave informed consent and all the methods were performed in accordance with the Declaration of Helsinki.

Methods. Participants. Following Britton and Anderson
Apparatus and stimuli. Because the experiment took place online, some factors (e.g. lighting and seating conditions) could not be controlled. For replication purposes, item sizes and colors are reported in pixels and RGB values (red/green/blue). The experiment was created in OpenSesame 50 using OSweb 1.4.11, and run using JATOS 51 .   Fig. 2. Six shapes (one circle and five diamonds, or vice versa) were presented on an imaginary circle with a radius of 180 px. The circles and diamonds were 126 and 157 px high, respectively, in red (255/0/0) or green (0/200/0). Depending on the task, the center of the display contained a blue (37/146/242, radius: 14 px) or yellow (253/203/41) circle with a white P (for presence) or T (for tilt, Verdana Bold, height: 19 px) inside it, respectively. In the T-task (compound search), each shape contained a grey (128/128/128) 45° or − 45° line (57 × 8 px). The background was dark grey (94/94/94). Figure 2 gives a schematic overview of a trial. The duration of the fixation period was randomly selected between 1000 and 1250 ms. The search display was visible until response or until a 3000 ms limit was exceeded. Participants searched for a unique shape (i.e., a circle among diamonds or vice versa). In the detection task (cued by a blue P) they reported whether it was present (left arrow key) or absent (right arrow key). In the compound search task (cued by a yellow T) they indicated the tilt direction of the line segment embedded within that shape (left/right arrow key). The P or T was already visible during the fixation period in order to cue the task. If the response was incorrect or too slow, a brief error message ('Oops!') was shown for 500 ms, followed by a written instruction of the task until a keypress.

Procedure and design.
The target was always the uniquely shaped item, while the distractor was the uniquely colored item. A uniquely colored distractor was present on 50% of the trials. Whereas the target location was selected at random across all blocks, in the training blocks one distractor location occurred more often (72%) than the other locations (5.6% per location). The two high-probability locations were determined randomly for each participant, with the high-probability location of one task always opposite to that of the other task. The high-probability location for each task remained constant throughout the experiment. Critically, in testing blocks this spatial imbalance was removed and the distractor appeared with equal probability across all locations. In the compound search task, each shape in the display contained a line that was left or right-tilted at random. Participants were instructed to report the orientation of the line inside the target (i.e., the unique shape). In the detection task, the target was absent in half of the distractor-present trials and in half of the distractor-absent trials and participants were instructed to indicate whether or not the display contained a unique shape. Figure 1A provides an overview of the block order. The experiment was separated into training blocks (72 trials) and testing blocks (144 trials). Participants performed a training block of task A, followed by a training block of task B and a testing block of task A. Next, they performed a training block of task B, followed by a training block of task A and testing block of task B. Before every block, participants were told if it was going to be a P-block (detection task) or a T-block (compound search task), so that a context shift could be anticipated prior to starting the first trial of a new block. Every training block started with a practice block of 20 trials, that was repeated until accuracy reached at least 70% (on average, participants performed each practice phase 1.06 times). Practice trials were included also in the second half of the experiment so that the total number of trials in which the regularity could be learned (this includes practice trials) in the first and second half of the experiment Figure 2. Schematic overview of a trial, with the top presenting the detection task and the bottom the compound search task. The search display was visible until a keyboard response was provided or the 3000 ms limit was exceeded. In the detection task participants reported whether a unique shape was present (left arrow key) or absent (right arrow key). In the compound search task participants indicated the tilt direction (left/right) of the line segment inside the unique shape. Incorrect or too slow responses were followed by the word 'Oops!' and a written instruction of the task. Figure created  www.nature.com/scientificreports/ was approximately equal. A break was included after every block, and halfway through every testing block. The specific order of tasks was counterbalanced across participants so that half started with the compound search task (yellow T) and half started with the detection task (blue P). Awareness of the spatial regularities was assessed after all trials were completed. First, participants were asked whether the distractor appeared more frequently in one location than in other locations. Next, they were asked to indicate which location they thought was the high-probability location by typing in a location-based number (0-5), separately for each specific task (so twice in total, with a different central cue to indicate the task context).
Analyses. For response time (RT) analyses, we removed incorrect trials (5.2%) and performed a cleaning procedure (1.7%): RTs that deviated more than 3 SD from the mean (computed separately for each participant and each task) were also removed. Overall, this data cleaning procedure resulted in a loss of 6.9% of the data. The data for one participant was discarded because the overall accuracy was below 75%. As there was no evidence in support of a speed-accuracy trade-off (neither here, nor in the control experiment), we only report RT results in the main text. Accuracy results are reported in the supplementary analyses. For simplicity, we only report distractor-based analyses in the main text, but target-based analyses are reported in the supplementary analyses.
Since half of the trials in the main experiment are not suitable for a target-based analysis 47 , we only performed those analyses on the control experiment. All t-tests are planned comparisons, unless stated otherwise 52 . If an analysis averaged across two tasks, an average was first calculated separately for each task, and then combined into a single average per participant. If no average could be computed for one of the two tasks, the data of that participant (one participant in Blocks 1 and 4) was omitted from the analysis. ANOVAs, t-tests, and Bayesian equivalents were performed using Jamovi 53 . For Bayesian analyses we used the default Cauchy distribution (scale = 0.707) as the prior. We report BF 10 (expressing the strength of evidence in favor of the alternative hypothesis), meaning that BF 10 < 1 is in favor of the null-hypothesis (with the strength of evidence increasing as the BF approaches zero). The verbal labels used to describe the strength of the evidence (in either direction) are based on an established classification 54,55 . A Greenhouse-Geisser correction was applied to the ANOVA results when the sphericity assumption was violated.

Results.
The main analyses are based on data that is collapsed across the two tasks to increase statistical power and focus on context-dependent location learning rather than the specifics of each task. Separate analyses for each task in the testing blocks can be found in the supplementary analyses (Fig. S2).
Training blocks. Figure 3A and B show response times from the training blocks. The practice trials are also included in this analysis, because lingering suppression effects are expected to be strongest immediately following a switch (but including the practice trials does not change the pattern of results). The training blocks are split into Blocks 1 and 4, the first training blocks of each experiment half, and Blocks 2 and 5, the last training blocks before the testing blocks. The match condition refers to the high-probability location that is associated with the current task context, the mismatch condition refers to the high-probability location this is associated with the other task context, and the low-probability condition refers to the four remaining locations. It should be noted that the mismatch condition is based on only 4 trials (excluding practice trials) per participant, and therefore does not provide an accurate RT estimate. In Blocks 1 and 4 ( Fig. 3A), RTs differed significantly between distractor conditions (absent, match, lowprobability, mismatch), F(1.85, 96.4) = 32.3, p < 0.001, η 2 p = 0.38. Participants learned to suppress the current high-probability location, as responses were faster at the match location than the low-probability locations, t(52) = 3.55, p < 0.001, BF 10 = 32.6, d = 0.49. The mismatch location was numerically the least suppressed location, but it did not differ significantly from the low-probability locations, t(52) = 0.36, p = 0.719, BF 10 = 0.16, d = 0.49 (where the BF provides moderate evidence for the absence of a difference). It did differ from the match location, t(52) = 2.5, p = 0.016, BF 10 = 2.49, d = 0.34 (although the BF is inconclusive).

Revival of suppression.
A crucial finding in the response time analyses is that the suppression for the firstlearned location (i.e., location A in the first half of the experiment, and location B in the second half) disappears when the task context changes and a second high-probability location is introduced (in the second and fifth blocks), but reappears when the first-learned context is encountered again in the testing block. To focus on this revival of suppression, Fig. 4A shows the difference scores between the first-learned location and the lowprobability locations across blocks. We argue that this revival of suppression is driven by the retrieval of the first-   www.nature.com/scientificreports/ learned location from memory, based on its associated context (see also Fig. 1B). This also means that learning took place in a context-dependent way.
Awareness of the regularities. One-third of the participants (33%, here called the 'aware' group) indicated that the distractor occurred more often at some locations than others. Participants were also asked to indicate the high-probability location for each of the two tasks. By taking the distance (ranging from 0 to 3) between the location indicated by the participant and the actual high-probability location for each of the two tasks, and averaging across the two distances, an awareness score was computed for every participant. An awareness score below chance-level could indicate some level of awareness of the high-probability locations. We first computed a 'context-blind' awareness score, only looking at whether participants could indicate the two high-probability locations, independent of context. In this computation, if a participant selected location A for context B and location B for context A, we regarded that as a perfect score (0). This context-blind awareness score (mean = 0.94, SD = 0.52) was not below chance-level (0.87), t(53) = 0.82, p = 0.818, BF 10 = 0.08, d = 0.13 (the BF in fact provides strong evidence for chance-level performance). Furthermore, the score also did not differ significantly from chance for the 'aware' group, t(17) = 1.07, p = 0.851, BF 10  We also computed a 'context-dependent' awareness score, looking at whether participants could indicate the correct high-probability location for each of the two contexts separately. In this computation, if a participant selected location A for context B and location B for context A, we regarded that as the worst possible score (3), because each selected location was 3 positions away from the context-dependent location. This context-dependent awareness score across participants (mean = 1.6, SD = 0.77) was not below chance-level (1.5), t(53) = 0.98, p = 0.833, BF 10 = 0.08, d = 0.08 (the BF in fact provides strong evidence for chance-level performance). Furthermore, the awareness score also did not differ significantly from chance for the 'aware' group, t (17)  Discussion. The results of the main experiment were not in line with our predictions for context-dependent learning, nor our predictions for context-independent learning. On the one hand, there appears to be some context-dependent learning, as the high-probability location that was learned in Blocks 1 and 4 was no longer suppressed in Blocks 2 and 5, while suppression for this location (match) resurfaced in the testing blocks. Since all distractor locations were equiprobable in the testing blocks, this resurfaced suppression appears to be the result of context-dependent learning: upon recognizing a previously learned task context, the associated highprobability location was suppressed (see also Fig. 4A). However, we also observed lingering suppression for the mismatch location in the testing blocks, and this was not in line with our prediction of context-dependent learning.

Control experiment
To be able to better understand the revival of suppression in the testing blocks and the role of context therein, we ran a control experiment that followed the same experimental design, except that we removed all context switches. Participants performed the same task (the compound search task) throughout the experiment. Crucially, the high-probability distractor location changed between blocks in exactly the same way as in the main experiment. That is, the order of high-probability locations across blocks was: A, B, none (testing block), B, A, none (testing block). To facilitate comparison between the main experiment and the control experiment, we labelled the conditions the same way as in the main experiment, as if the high-probability locations were matching or mismatching the context (see Fig. 1A). We reasoned that if we again observed revival in this control experiment, we could rule out context-dependent learning. If we found a different pattern of results, that difference could be attributed to the switches in context and would indicate context-dependent learning in the main experiment. In the main experiment, a separate analysis of the compound search task yielded significant suppression for the match location (Fig. S2A), such that the difference between the main experiment and the control experiment cannot be ascribed to the characteristics of the detection task.
Methods. Fifty-five adults (30 male, 23 female, 2 non-binary, mean age = 32, age range: 21 to 50) participated. The procedure was identical to that of the main experiment, with the exception that the task remained the same throughout the experiment (i.e., a compound search task). As in the main experiment, practice blocks were repeated until accuracy reached at least 70% (on average, participants performed each practice phase 1.08 times).
Analyses. The analyses were identical to the main experiment with the exception that we did not separate the analyses per context. While there were no actual context changes in the control experiment, we use the same condition labels as in the main experiment. Thus, 'match' refers to the location that would have been matching the context if there was one. Furthermore, the control experiment also allowed for analyzing response times in relation to the target location, because the compound search task lends itself to that analysis. Those results are reported in the supplementary analyses. Removal of incorrect trials (4.4%) and RT filtering (1.4%) resulted in an overall loss of 5.8% of the data.  Figure 3D and E show response times from the training blocks. The results from the training blocks for the control experiment follow the pattern of results of the main experiment. In Blocks 1 and 4, RTs differed significantly between distractor conditions, F(1.75, 94.55) = 39.9, p < 0.001, η 2 p = 0.43. Participants learned to suppress the current high-probability location, as responses were faster at the match location than the low-probability locations, t(54) = 5.61, p < 0.001, BF 10 = 21,873, d = 0.76. The mismatch location was numerically the least suppressed location, but it did not differ significantly from the low-probability locations, t(54) = 0.22, p = 0.826, BF 10 = 0.15, d = 0.03 (where the BF provides moderate evidence for the absence of a difference). It did differ from the match location, t(54) = 3.05, p = 0.004, BF 10  Testing blocks. Figure 3F shows response times from the testing blocks. RTs differed significantly between dis- Awareness of the regularities. As in the main experiment, approximately one-third of the participants (36%, here called the 'aware' group) indicated that the distractor occurred more often at some locations than others. Since context in the control experiment was not differentiated, we only computed a 'context-blind' awareness score. In contrast to the main experiment, this context-blind awareness score (mean = 0.74, SD = 0.5) was below chance-level (0.87), t(54) = 1.99, p = 0.026, BF 10 = 1.77, d = 0.27 (though the BF was inconclusive). However, the score was not significantly below chance for the 'aware' group, t (19)  Discussion. The response time results of the control experiment were in line with the main experiment, except for one crucial difference: there was no revival of suppression at the matching location in the testing block (see also Fig. 4B). We interpret this finding as follows. Given the alternating task context across blocks (see Fig. 1A), the match location in the testing blocks was learned first (in the first and fourth blocks), while the mismatch location in the testing blocks was learned second (in the second and fifth blocks), right before the testing blocks. The suppression for the match location was overridden by the new high-probability location in the second and fifth blocks, and the suppression of this location continued in the testing blocks. Crucially, the suppression at the match location was not revived in the testing blocks, because there was no context to actually enforce such a revival. Note that when the main experiment and the control experiment are compared on the basis of the compound search task alone, the same interpretation holds, because separate analysis of the compound search task yielded significant suppression at the context-matching location (Fig. S2A), similar to the collapsed analysis (Fig. 3C).
In contrast to the main experiment, the control experiment provided some evidence of awareness of the high-probability distractor locations. It should be noted however that the awareness score is based on only two responses, and therefore may not provide a reliable estimate. Previous studies [57][58][59] have argued that common post-experiment awareness questionnaires are insufficient to conclude unawareness, but this argument also works in the other direction. The measure is simply so unreliable that it is questionable whether it is meaningful at all. Perhaps unsurprisingly then, the Bayesian analyses were inconclusive, and the awareness score was no longer significant when we only included participants that actually indicated an imbalance in the distractor distribution (the 'aware' group) when asked about this. We conclude that the difference between the main experiment and the control experiment is unlikely to be due to a difference in awareness of the distractor regularities.

General discussion
Statistical regularities regarding the location of distractors in visual displays have been shown to facilitate search. This is because humans can learn to suppress locations that are more likely to contain a distractor 11,13,14,47 . Somewhat surprisingly, a series of recent findings suggested that this form of learned suppression is insensitive to context 44,45 . The current study however presents a different picture: we show that participants can actually learn to suppress a location in a context-dependent way, at least so long as each context is presented for an extended period of time (i.e., blocks of trials), and the contexts are dissimilar enough. We were able to uncover these findings by using a blocked design with separate training and testing blocks (Fig. 1A), in contrast to previous studies which used a mixed design 44,45 . Furthermore, as a means of differentiating contexts more strongly than previous studies, the task alternated from block to block between a compound search task and a detection task. Crucially, a different high-probability distractor location was assigned to each context in the training blocks, and all distractor locations were made equiprobable in the testing blocks. In the control experiment, the contexts were made indistinguishable for participants, but the high-probability locations changed in exactly the same way as in www.nature.com/scientificreports/ the main experiment. Upon encountering a familiar task context in a testing block, participants showed a revival of suppression for the location that was associated with that context in an earlier training block. Crucially, this revival of suppression for a familiar context was absent in the control experiment, simply because there were no contexts to associate the distractor locations with. Our findings are the first evidence for context-dependent spatial suppression resulting from statistical learning, providing a novel and important insight into the mechanisms behind the implicit learning of distractor locations. The finding that distractor location probabilities can be learned in a context-dependent way is crucial for the relevance of history-based attentional effects. Without some form of contextual binding, learned suppression effects would require a constant re-learning of the statistical regularities of contexts that have already been encountered. On a more theoretical level, context-dependency can be taken as evidence for an intelligent learning mechanism that maintains multiple regularities over time, as opposed to a simple habit-like mechanism where only the last-learned regularity is available. Furthermore, our findings reconcile the previously observed generalized suppression 44,45 with the presumed context-dependency of selection history effects 5 and the observed context-dependency in the reward learning 31 , contextual cueing 35 , and broader statistical learning literature 41,42 . However, our findings do not imply that the learning of multiple distractor regularities is always context-dependent. It remains to be seen whether weaker context manipulations such as background images can also instantiate context-dependent effects when applied in a blocked design. Context switches may simply need to be sufficiently spaced in time for context-dependent learning to occur, because otherwise two contexts would need to be updated more or less in parallel. It is also possible that in mixed designs 44,45 context-dependent learning did occur but was obscured in the data by lingering suppression effects.
Besides context-dependent learning we also observed lingering suppression from the directly preceding context. As described, testing blocks yielded a revival of earlier suppression matching the context of the testing block, but there was also suppression that mismatched with the testing block's context (although the BF was inconclusive in the main experiment). In other words, both the high-probability distractor locations were suppressed in the testing blocks. This was a puzzling finding, because if learning was context-dependent, we expected that suppression would only be active for the location matching the current context. If a location-context association has been learned, it seems a waste of energy to suppress more than what this association dictates. We interpret the mismatching suppression in the testing blocks as lingering suppression from the directly preceding training block. This raises another question: Why would the suppression from the previous training block linger in the testing blocks, but not (or much weaker) in the training blocks? The simplest answer appears to be that a different high-probability location is introduced in the training blocks, whereas the testing blocks contain no high-probability locations whatsoever. Thus, we explain the data as follows: Participants can implicitly learn to associate high-probability distractor locations with their respective contexts, but suppression for a previous high-probability location continues to linger unless a new high-probability location is introduced 13 . It seems unlikely that suppression will linger indefinitely, so we assume that it gradually decreases, but the 144 trials in our testing blocks were not enough to show this.
Lingering suppression could result from a slow neural mechanism that requires time to adapt to a new situation. The neural underpinnings of implicitly learned suppression are still poorly understood 27,60-63 so that many options remain open. In addition to or instead of neural factors, lingering suppression likely reflects the inherent slowness of statistical learning, where repetitions are crucial to filter out the noise in an environment that is riddled with randomness. However, neither of these explanations appear entirely satisfactory in this case, because they cannot explain the stark difference between the strong presence of lingering suppression in the testing blocks, and the absence or near absence of lingering suppression in the training blocks. The low trial counts in crucial conditions unfortunately make it impossible to investigate the time course of learning within blocks, but the averaged results per block suggest that the 72 trials + 20 practice trials of the training blocks were enough to remove most or all lingering suppression, whereas the 144 trials of the testing blocks did very little (if anything) to decrease it. Thus, it appears that the priority map can be quickly updated when a new high-probability location is introduced 64 , but updates slowly when all spatial imbalances are removed 44,65,66 . Part of the answer might lie in the trial counts at the mismatch location specifically (where lingering suppression shows up), because those are only 2 trials per training block and 12 per testing block. The low trial count at the mismatch location in the training blocks could have made it difficult to let go of previous suppression, or it could simply mean that the response time for that specific condition was unreliable. In general, findings on the time course of learning and unlearning often do not converge, suggesting that many factors play a role 11,44,[66][67][68][69] .
In sum, the present findings show that statistical learning of spatial suppression is context dependent, at least so long as the contexts are dissimilar enough and presented in blocks. However, we also observed lingering suppression for locations mismatching with the context. A control experiment with no contextual differentiation whatsoever confirmed that learning in the main experiment must have been context-dependent. The long-lasting lingering suppression is a finding that requires further explanation, and we have made several suggestions for avenues of further investigation. Our findings substantially increase the real-life relevance of implicitly learned attentional suppression. Naturally, the present results are limited in that they only consider distractor suppression as opposed to target enhancement, and implicit as opposed to explicit learning. Furthermore, we have operationalized context by using two similar but distinct tasks, and it remains to be seen whether context-manipulations that are not task-relevant can also be learned in a context-dependent way 70,71 , and whether context-manipulations that are even stronger might result in less lingering suppression. In this sense, the current study raises a series of interesting questions about the learning and lingering of context-dependent regularities in visual search as well as in other domains of cognition, which could be explored in future research.