Investigating interactions between types of order in categorization

This study simultaneously manipulates within-category (rule-based vs. similarity-based), between-category (blocked vs. interleaved), and across-blocks (constant vs. variable) orders to investigate how different types of presentation order interact with one another. With regard to within-category orders, stimuli were presented either in a “rule plus exceptions” fashion (in the rule-based order) or by maximizing the similarity between contiguous examples (in the similarity-based order). As for the between-category manipulation, categories were either blocked (in the blocked order) or alternated (in the interleaved order). Finally, the sequence of stimuli was either repeated (in the constant order) or varied (in the variable order) across blocks. This research offers a novel approach through both an individual and concurrent analysis of the studied factors, with the investigation of across-blocks manipulations being unprecedented. We found a significant interaction between within-category and across-blocks orders, as well as between between-category and across-blocks orders. In particular, the combination similarity-based + variable orders was the most detrimental, whereas the combination blocked + constant was the most beneficial. We also found a main effect of across-blocks manipulation, with faster learning in the constant order as compared to the variable one. With regard to the classification of novel stimuli, learners in the rule-based and interleaved orders showed generalization patterns that were more consistent with a specific rule-based strategy, as compared to learners in the similarity-based and blocked orders, respectively. This study shows that different types of order can interact in a subtle fashion and thus should not be considered in isolation.

Table 1. Combination of the three manipulations (within-category, between-category, and across-blocks) that generates the eight conditions of the experiment. The last column offers an example of a two-block sequence for each condition. Stimuli can belong to either Category A (letters a, a′, A, and A′) or Category B (letters b, b′, B, and B′). These categories were kept as simple as possible to make the different types of order easier to illustrate. Note that this category structure is not the one used in our experiment. Stimuli associated with the main rule are in capital letters, while the exceptions are in lowercase letters.

Index | Within-category | Between-category | Across-blocks | Two-block sequence

Scientific Reports | (2022) 12:21625 | https://doi.org/10.1038/s41598-022-25776-0 | www.nature.com/scientificreports/

Because we tested the effect of new factors manipulating presentation orders and combinations of these factors, we decided to use the widely employed 5-4 category structure from Medin and Schaffer [50] to generate the stimuli and categories, and to study the strategies engaged by participants. A detailed description of the 5-4 category set can be found in Sect. 2.2. This structure has been analyzed in numerous studies and has influenced research in category learning for more than a quarter century [51-66]. Moreover, the artificial structure of this category set allows for the presence of stimuli without a category label. The advantage is that on these stimuli different classification strategies lead to distinctive response patterns, allowing us to study the mental representation of the categories (see details in Sect. 3.2). For these reasons, the 5-4 category set appeared to be a fruitful starting point for our investigation.

Method
Participants. Two hundred and sixteen participants contributed to this study. We initially recruited 218 participants, but two participants were excluded from the data set for failing to follow instructions. Among the 216 participants, 130 were sophomore or junior students from Université Côte d'Azur who received course credits in exchange for their participation. The remaining 86 participants were recruited on campus or by email on a voluntary basis. We used G*Power [67] to estimate the power of detecting a small-to-medium effect size (f = 0.2) for the interaction between the three types of order manipulation (2 × 2 × 2 = 8 between-subject groups) with a three-way ANCOVA model, considering 216 participants, 1 covariate (i.e., the block number), and α = 0.05. The power achieved was 83%. Note that the data set corresponding to the first 130 participants has already been used in [68] for testing categorization models. The experimental procedure was approved by the local ethics committee (CERNI #2020-74) of Université Côte d'Azur and the experiment was performed in accordance with relevant guidelines and regulations. Informed consent was obtained from all participants prior to participation.
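As a rough cross-check of the reported power, the calculation can be approximated with the Python standard library alone. This is only a sketch under stated assumptions: the noncentrality parameter is taken as λ = f² × N (the convention used for fixed-effects F tests), and the noncentral F distribution with 1 numerator degree of freedom is replaced by a large-sample normal approximation, so the result will differ slightly from an exact noncentral F computation. The function name `approx_power` is hypothetical.

```python
from statistics import NormalDist


def approx_power(f: float, n: int, alpha: float = 0.05) -> float:
    """Large-sample approximation to the power of a 1-df F test.

    Assumptions (not taken from the article):
    - noncentrality parameter lambda = f**2 * n;
    - the noncentral F(1, df2) is approximated by a noncentral chi-square
      with 1 df, i.e. the square of a normal variable shifted by sqrt(lambda).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = (f ** 2 * n) ** 0.5         # sqrt(lambda)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = approx_power(f=0.2, n=216)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the approximation lands close to the 83% reported above; the small discrepancy is expected because the finite denominator degrees of freedom are ignored.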
Categories. Each participant was administered a single 5-4 category set [50]. This structure is composed of 16 stimuli, varying on four different binary-valued dimensions (see Fig. 1, on the top). In this category set, five stimuli belong to category A, four belong to category B, and the remaining seven are transfer stimuli. These categories are more structured than random (i.e., a clear rule-plus-exceptions pattern emerges) and are linearly separable. The 5 + 4 = 9 stimuli characterized by a category label were presented in both the learning and transfer phases, whereas the 7 transfer stimuli were presented in the transfer phase exclusively.
Stimuli. Stimuli varied along four Boolean dimensions (Color, Shape, Size, and Filling pattern). The colors were either blue or red; shapes were either square or circle; sizes were either small or big, and filling patterns were either plain or striped. The combination of these options formed 2^4 = 16 items (see Fig. 1, on the bottom). Color distinguished the objects at the front of the hypercube from those at the back, Shape distinguished the objects in the left cube from those in the right cube, Size distinguished the right and left objects within the cubes, and Filling pattern distinguished the objects at the top of the hypercube from those at the bottom. Each dimension was instantiated by the same physical features and the same category structure was applied to these features across participants.
Phases. A learning phase in which participants were instructed to learn the classification of 5 + 4 = 9 learning stimuli was followed by a transfer phase in which participants were tested upon presentation of 7 novel stimuli (plus the 9 stimuli previously acquired). In the learning phase, both feedback and no-feedback training were used. In particular, two blocks of feedback training (in which the order of the stimuli was manipulated) were followed by one block of no-feedback training (in which stimuli were randomly presented). This pattern was repeated until the end of the learning phase. Each training block (feedback and no-feedback) included nine trials, one for each stimulus. The use of random blocks with no-feedback allowed us to assess learning with neither order manipulation nor feedback interfering with the measure of performance. The unbalanced ratio of two blocks of feedback training followed by one block of no-feedback training aimed at increasing the influence of our manipulation, with the idea that the random block could still interfere with the learning process.
Participants had to correctly classify all stimuli in three no-feedback blocks of 5 + 4 = 9 stimuli (not necessarily consecutive) to complete the learning phase. The choice of three is arbitrary, but appeared to be a good trade-off between maximizing the memorization of the categories and minimizing the duration of the task (a fundamental point considering that the task was conducted online). Participants were given at most 200 blocks to reach the learning criterion. Once participants met the learning criterion, the transfer phase was initiated. Participants were informed that they had successfully completed the learning phase and that the transfer phase was about to start. The transfer phase was composed of five blocks of 16 stimuli (the 5 + 4 = 9 learning stimuli and the 7 novel stimuli), summing to 80 trials.
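The block schedule and learning criterion described above can be sketched as follows. This is an illustrative reading, not the experiment's actual code: it assumes that blocks are numbered from 1 across both block types, so that every third block is a random no-feedback block. All names are hypothetical.

```python
from typing import Optional, Sequence

MAX_BLOCKS = 200        # upper limit on the number of blocks
CRITERION = 3           # perfect no-feedback blocks required (not necessarily consecutive)
TRIALS_PER_BLOCK = 9    # one trial per learning stimulus


def block_type(block_number: int) -> str:
    """Two feedback blocks are followed by one no-feedback block (assumed numbering from 1)."""
    return "no-feedback" if block_number % 3 == 0 else "feedback"


def blocks_to_criterion(no_feedback_scores: Sequence[int]) -> Optional[int]:
    """Return the block number at which the criterion is met, or None if it never is.

    no_feedback_scores[k] is the number of correct responses (0 to 9) in the
    (k+1)-th no-feedback block, i.e. block number 3 * (k + 1).
    """
    perfect = 0
    for k, score in enumerate(no_feedback_scores):
        block_number = 3 * (k + 1)
        if block_number > MAX_BLOCKS:
            break
        if score == TRIALS_PER_BLOCK:
            perfect += 1
            if perfect == CRITERION:
                return block_number
    return None


# Hypothetical participant: perfect on the 2nd, 4th, and 5th no-feedback blocks.
print(blocks_to_criterion([7, 9, 8, 9, 9]))  # 15
```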
Ordering of stimuli. The experiment was characterized by a full factorial design. Three factors were used, each one having two levels: a within-category order manipulation (Rule-based vs. Similarity-based), a between-category order manipulation (Blocked vs. Interleaved), and a manipulation of order across blocks (Variable vs. Constant). The combination of these factors formed eight conditions (e.g., "Rule-based + Interleaved + Constant"). For simplicity, each condition is denoted using the first letter of each type of order. For instance, condition "Rule-based + Interleaved + Constant" is denoted R+I+C. As mentioned above, order was only manipulated in the blocks of the learning phase where feedback was provided. The number of participants assigned to each condition is given in Table 2.
Within-category order manipulation. In the rule-based order, stimuli were ordered following a "principal rule plus exceptions" structure, meaning that examples obeying the principal rule were presented strictly before the exceptions. The specific "principal rule plus exceptions" structure of our experiment was the following: all striped items belong to category A except for the small red square, while all plain items belong to category B except for the big red circle (see Fig. 1). Therefore, items A1, A2, A3, and A5 were strictly presented before item A4, and items B1, B2, and B4 were strictly presented before item B3. The items obeying the principal rule (whether belonging to category A or B) were presented in a random order. Presenting stimuli belonging to the dominant rule in a random order was thought to favor an abstraction process, given that other sequences would have increased the risk of temporarily inducing less informative rules, thus delaying learning. Note that instead of using a principal rule based on Filling pattern (plain vs. striped stimuli), we could have used a principal rule based on Shape (circles vs. squares). Indeed, both rules minimize the number of exceptions.
In the similarity-based order, members within a category were presented in a way that maximized the similarity between adjacent learning stimuli. The first stimulus was randomly chosen while subsequent stimuli were (randomly) chosen among those that were the most similar to the immediately previous item. Similarity between two items x and y was computed by counting the number of common features they shared:

$$S(x, y) = \sum_{i=1}^{4} \mathbb{1}[x_i = y_i],$$

where $x_i$ and $y_i$ are the feature values of stimuli x and y on dimension i, and $\mathbb{1}[\cdot]$ equals 1 when its argument is true and 0 otherwise. For instance, the small plain blue circle and the small striped red square have one single feature in common (small), thus their similarity is 1.
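The similarity measure and the greedy ordering it drives can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: stimuli are represented as tuples of feature values, and the function names and the `rng` parameter are assumptions introduced here.

```python
import random
from itertools import product

# Feature values for Color, Shape, Size, and Filling pattern.
VALUES = (("blue", "red"), ("square", "circle"), ("small", "big"), ("plain", "striped"))

# All 2**4 = 16 stimuli as (color, shape, size, filling) tuples.
STIMULI = list(product(*VALUES))


def similarity(x, y):
    """Number of dimensions on which x and y share the same value."""
    return sum(xi == yi for xi, yi in zip(x, y))


def similarity_order(items, rng=random):
    """Greedy ordering: random first item, then a random pick among the
    remaining items that are most similar to the previous one."""
    remaining = list(items)
    current = remaining.pop(rng.randrange(len(remaining)))
    ordered = [current]
    while remaining:
        best = max(similarity(current, it) for it in remaining)
        candidates = [it for it in remaining if similarity(current, it) == best]
        current = rng.choice(candidates)
        remaining.remove(current)
        ordered.append(current)
    return ordered


# Example from the text: only Size (small) is shared, so the similarity is 1.
a = ("blue", "circle", "small", "plain")
b = ("red", "square", "small", "striped")
print(similarity(a, b))  # 1
```

In the experiment this ordering would be applied separately to the members of each category; the sketch accepts any list of items for that reason.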
Between-category order manipulation. In the blocked study, categories were strictly blocked (AAAABBBB or BBBBAAAA), while in the interleaved study categories were strictly alternated (ABABABAB). Because of the regularity of both patterns, the introduction of random blocks during learning was necessary: without them, participants could have guessed the correct classification without paying attention to the stimuli. The ratio between blocked (or interleaved) blocks and random blocks is 2:1, as for the feedback/no-feedback blocks. Therefore, a random block with no-feedback always follows two blocks in which categories are blocked (or interleaved) and feedback is provided. Note that in random blocks feedback was never provided, whereas in blocked/interleaved blocks feedback was always provided.

Table 2. Number of participants assigned to each of the eight conditions of the experiment. The table also includes the order manipulation for the two participants (assigned to conditions R+B+C and R+B+V) who were excluded from the study for not following instructions.
Across-blocks order manipulation. In the constant manipulation across blocks, the same sequence of stimuli (but obeying the constraints of the between- and within-category orders) was presented in all feedback blocks, while in the variable manipulation across blocks the sequence of stimuli varied from one feedback block to another (again, obeying the constraints of the between- and within-category orders).
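To make the combination of the three manipulations concrete, the sketch below builds feedback-block sequences for the rule-based within-category order only. It is a hypothetical illustration: the item labels follow the text (A4 and B3 are the exceptions), the starting category in the blocked order is assumed to be chosen at random, and with five A items and four B items the interleaved pattern is assumed to be ABABABABA. In the variable order, each block is simply regenerated at random, so two consecutive blocks could coincide by chance.

```python
import random

RULE_A, EXCEPTION_A = ["A1", "A2", "A3", "A5"], ["A4"]
RULE_B, EXCEPTION_B = ["B1", "B2", "B4"], ["B3"]


def rule_based(rule_items, exceptions, rng):
    """Rule-following items in random order, then the exception(s)."""
    return rng.sample(rule_items, len(rule_items)) + list(exceptions)


def blocked(seq_a, seq_b, rng):
    """All items of one category, then all items of the other."""
    first, second = (seq_a, seq_b) if rng.random() < 0.5 else (seq_b, seq_a)
    return first + second


def interleaved(seq_a, seq_b):
    """Strict alternation starting with category A (assumed: ABABABABA)."""
    out = []
    for i in range(max(len(seq_a), len(seq_b))):
        out.extend(seq_a[i:i + 1])
        out.extend(seq_b[i:i + 1])
    return out


def feedback_blocks(n_blocks, between, across, seed=0):
    """Generate n_blocks feedback-block sequences for one participant."""
    rng = random.Random(seed)

    def one_block():
        seq_a = rule_based(RULE_A, EXCEPTION_A, rng)
        seq_b = rule_based(RULE_B, EXCEPTION_B, rng)
        if between == "blocked":
            return blocked(seq_a, seq_b, rng)
        return interleaved(seq_a, seq_b)

    if across == "constant":
        block = one_block()
        return [list(block) for _ in range(n_blocks)]   # same sequence every time
    return [one_block() for _ in range(n_blocks)]       # fresh sequence per block


for block in feedback_blocks(2, between="interleaved", across="variable"):
    print(" ".join(block))
```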
Procedure. The categorization task was computer-driven and was conducted online. Participants received instructions before the task began. In both phases, stimuli were presented one at a time for 3 s at the center of the computer screen. Category A was associated with the up key, while category B was associated with the down key. Participants had to classify the stimulus in one of the two categories (A and B) using these two response keys. Once a key was pressed, feedback indicating the correctness of the participant's classification appeared for 1 s at the bottom of the screen (this was the case only in blocks where feedback was provided). If no key was pressed, the text 'too late' appeared for 1 s at the bottom of the screen. In order to encourage learning, a percentage of correct responses was calculated at the end of each no-feedback block, based on performance on the last no-feedback block only. This percentage was displayed for 1 s right after each no-feedback block.

Results
Learning phase. Two of our main questions of interest are (i) whether the across-blocks manipulation (constant vs. variable) affects the speed at which the concept is learned, and (ii) whether there are interactions between the types of order we manipulated. To answer these questions, we analyzed the time needed by participants to complete the learning phase as a function of the type of order (in Sect. 3.1.1), and we performed a three-way ANCOVA with and without interactions (in Sect. 3.1.2). Two additional analyses can be found in Supplementary material A and B. The first analysis examines the number of individuals who did not reach the learning criterion, and shows no significant difference across types of order. The second analysis examines the percentage of correct responses given by participants over the course of the learning phase, and finds faster learning curves in the rule-based order than in the similarity-based order. None of the 216 participants were excluded from the analyses of the learning phase.
Analysis of the learning times. Figure 2 shows the average number of blocks required for participants to meet the learning criterion as a function of the experimental conditions, taken separately (Fig. 2A) and combined (Fig. 2B). Visually, the rule-based order appears more beneficial than the similarity-based order, the blocked order appears somewhat more beneficial than the interleaved one, and the constant condition appears more beneficial than the variable condition. Note that only participants who reached the learning criterion (amounting to 198) were plotted in Fig. 2. To determine which condition led to the fastest learning while accounting for "unsuccessful participants" (i.e., individuals who did not meet the learning criterion), we used two survival analysis techniques: the Kaplan-Meier survival curves and the Cox proportional-hazards model.

Kaplan-Meier survival curves.
We used the Kaplan-Meier estimator [69] to estimate the expected duration of time until the successful completion of the learning phase, considering data from participants who did not complete the task as censored. Figure 3 shows the survival probability as a function of block number for each type of order, taken separately (Fig. 3A) and combined (Fig. 3B). The survival probability estimates how likely participants assigned to a given condition are to continue the task (i.e., to not meet the learning criterion). The log-rank test was performed to evaluate the difference between survival curves, and significance values were corrected for multiple comparisons using the Benjamini-Hochberg method of false discovery rate control (FDR ≤ .05). Note that it is sufficient to compare the adjusted p-values to 0.05 to determine if they are significant for an FDR ≤ .05. The adjusted p-values were significant for the within-category and across-blocks orders (p = .047 for rule-based vs. similarity-based, and p = .03 for constant vs. variable), but not for the between-category order (p = .26). This shows that learning was faster in the rule-based and constant orders as compared to the similarity-based and variable orders, respectively.
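For readers unfamiliar with the estimator, the product-limit computation can be sketched in plain Python. The data below are hypothetical and the function is an illustration of the standard Kaplan-Meier formula, not the analysis code used in the study: the "event" is meeting the learning criterion, and participants who never met it are treated as censored.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time with at least one event.

    durations[i]: block number at which participant i stopped.
    observed[i]:  True if the learning criterion was met (event),
                  False if the participant is censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)          # events and censorings per time point
    at_risk = len(durations)
    survival, curve = 1.0, []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1 - d / at_risk  # product-limit update
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: five participants, one censored at block 200.
durations = [12, 21, 21, 45, 200]
observed = [True, True, True, True, False]
for t, s in kaplan_meier(durations, observed):
    print(t, round(s, 2))
```

With these toy data the estimated probability of still being in the learning phase drops to 0.8 after block 12, 0.4 after block 21, and 0.2 after block 45; the censored participant never triggers a drop.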
Cox proportional-hazards model. Like the Kaplan-Meier estimator, the Cox model [70] allows us to treat failures to complete the task as censored data, which avoids removing unsuccessful participants. This model is particularly advantageous because of its ability to simultaneously account for multiple variables. Therefore, we used it to simultaneously analyze the influence of the three types of order (within-category, between-category, and across-blocks orders) on survival probability. Figure 4A shows the result of the Cox model as a function of our three variables. The graphs show that the similarity-based order, the interleaved study, and the variable manipulation across blocks reduced participants' hazard ratio as compared to their respective reference condition (i.e., rule-based order, blocked study, and constant manipulation across blocks). This means that these types of order were found to reduce the speed at which participants met the learning criterion. However, only the impact of across-blocks manipulations was significant (p = .065 for within-category orders, p = .195 for between-category orders, and p = .038 for across-blocks orders).
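For reference, the Cox model in its standard form can be written for three binary predictors as follows. The indicator coding is an assumption made here for illustration (each indicator equals 1 for the non-reference level: similarity-based, interleaved, and variable, respectively) and is not taken from the article.

```latex
% h_0(t): unspecified baseline hazard (all indicators at their reference level)
% x_W, x_B, x_X: assumed 0/1 indicators for the within-category, between-category,
%                and across-blocks orders
h(t \mid x_W, x_B, x_X) = h_0(t)\,\exp\!\left(\beta_W x_W + \beta_B x_B + \beta_X x_X\right),
\qquad
\mathrm{HR}_k = e^{\beta_k}, \quad k \in \{W, B, X\}.
```

Under this coding, a hazard ratio below 1 means that, at any given block, participants in the non-reference condition are less likely to meet the learning criterion than those in the reference condition, which is how the reduced hazard ratios above translate into slower learning.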

Analysis of the interactions between the types of order.
With the previous survival analyses, we only investigated main effects and potentially ignored any subtle interaction between the types of order. In order to assess whether the manipulations interact in a nuanced fashion, we performed a three-way ANCOVA (2 × 2 × 2 with interactions) with within-category order (rule-based vs. similarity-based), between-category order (blocked vs. interleaved), and across-blocks manipulation (constant vs. variable) as between-subject factors. The number of correct responses per block was the dependent variable and block number was the only covariate. To ensure an equal contribution from each participant, we completed participants' responses until block number 63. Since 80% of the participants ended (successfully or not) the learning phase before block number 63, this choice allowed us to ensure an equal number of observations for each participant, while limiting the number of observations that were removed or added. There were 170 participants who ended the learning phase before block number 63. Five of them dropped out of the experiment, while the remaining 165 met the learning criterion. Among the 165 participants who successfully finished the learning phase (and therefore had 100% accuracy in their last no-feedback block), 116 (70%) made no mistake, 30 (18%) made one mistake, and 19 (12%) made more than one mistake in their penultimate no-feedback block. Since the majority of the participants who met the learning criterion reached a stable optimal strategy (88% of them made one or no mistake in the last two no-feedback blocks), it is reasonable to think that they would have continued to perform optimally after completing the learning phase. Therefore, participants' data were filled in by iterating their last no-feedback block until block number 63. Note that with this fill-in method, data from participants who met the learning criterion were completed with 100% accuracy.
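The fill-in step can be sketched as below. This is a hypothetical illustration: it assumes that every third block is a no-feedback block, so that 21 no-feedback blocks fall at or before block 63, and it pads each participant's series by repeating the score of the last observed no-feedback block (series that run past block 63 are truncated).

```python
LAST_BLOCK = 63
N_NO_FEEDBACK = LAST_BLOCK // 3   # 21 no-feedback blocks up to block 63 (assumed schedule)


def fill_in(scores, n_blocks=N_NO_FEEDBACK):
    """Truncate or pad a participant's no-feedback scores to n_blocks entries.

    Padding repeats the score of the last observed no-feedback block.
    """
    if not scores:
        raise ValueError("at least one observed block is required")
    scores = list(scores)[:n_blocks]
    return scores + [scores[-1]] * (n_blocks - len(scores))


# Hypothetical participant who met the criterion at their 4th no-feedback block.
completed = fill_in([7, 9, 9, 9])
print(len(completed), completed[-1])  # 21 9
```

Under this assumed schedule, 216 participants times 21 no-feedback blocks gives 4536 observations.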
In Supplementary material C, we implement an alternative way of filling in the data by iterating participants' average performance across the last two no-feedback blocks. The results of this analysis are qualitatively the same as those described below, with the exception of the interaction between the three types of order, which turned out significant when using the alternative fill-in method. A probit transformation was applied to the dependent variable in order to meet the assumption of normality. Twenty observations among the 4536 available were found to be multivariate outliers and were excluded from the analysis (here, by observation we mean the performance obtained by a specific participant in a specific block). Note that the ANCOVA was only run on no-feedback blocks. Block number was a significant predictor of participants' performance (F(1, 5877) = 1886.13, p < .0001, ηp² = .30). After controlling for block number, the main effect of within-category order was significant (F(1, 246) [...]

To further investigate the significant interactions, we conducted an analysis of simple main effects, again applying the Benjamini-Hochberg correction. The simple main effect of within-category order was significant in both the constant (p = .002) and variable (p < .0001) across-blocks groups. Similarly, the simple main effect of across-blocks order was significant in both the rule-based (p = .013) and similarity-based (p < .0001) within-category groups. The simple main effect of across-blocks order was significant in both the blocked (p < .0001) and interleaved (p = .006) between-category groups, whereas the simple main effect of between-category order was significant in the constant order (p < .0001), but not in the variable one (p = .6). Finally, the simple main effect of within-category order was significant in both the blocked (p < .0001) and interleaved (p < .0001) between-category groups, whereas the simple main effect of between-category order was significant in the similarity-based order (p < .0001), but not in the rule-based one (p = .09). These effects can be visualized in Fig. 5.
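The Benjamini-Hochberg adjustment applied to these comparisons can be sketched as follows. The raw p-values in the example are hypothetical (the article reports adjusted values only), and the function is a plain implementation of the standard step-up adjustment rather than the authors' code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.010, 0.040, 0.030, 0.200]        # hypothetical raw p-values
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]    # compare adjusted values to .05
print([round(p, 3) for p in adj], significant)
```

Comparing each adjusted p-value to .05 is what controls the false discovery rate at .05; in this toy example only the first comparison survives the correction.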
To complete the analysis, we assessed the difference in performance between the three-way ANCOVA with and without interactions. The F-test was significant (p < .0001), showing that the model with interactions was significantly better than the model without interactions. To assess the robustness of our results, we applied the three-way ANCOVA multiple times by varying the block number until which the responses were completed (i.e., block numbers 48, 51, 55, and 81, corresponding to the 65%, 70%, 75%, and 85% quantiles at which participants ended the learning phase). The results were qualitatively the same, although the interaction between within-category and between-category orders was no longer significant when responses were completed until block number 48 (p = .1) and block number 51 (p = .08). For consistency, our discussion will only focus on the two interactions that were found significant in every case (i.e., the interaction between within-category and across-blocks orders, and the one between between-category and across-blocks orders).
Transfer phase. Our next aim was to determine whether the types of order affected (i) performance on learning stimuli and (ii) generalization patterns on transfer stimuli during the transfer phase. Because we were interested in studying performance and generalization patterns in participants who learned the studied categories, the 18 participants who did not meet the learning criterion were excluded from the analyses. Data from 14 participants were additionally excluded because they were considered outliers. Eleven of them pressed the key associated with Category B on fewer than 10% of the trials, and the remaining three pressed no key on more than 15% of the trials. To recap, analyses of the transfer phase were performed on 184 participants. The distribution among the experimental conditions of the participants who were excluded is shown in Table 3. Figure 6 shows the percentage of correct responses for the learning stimuli presented during transfer, as a function of the types of order (taken separately). The percentage of correct responses was first computed for each participant and then averaged across participants. The two-sided Wilcoxon-Mann-Whitney test was performed to assess the difference in performance between the two conditions within each type of manipulation. Only the difference between constant and variable across-blocks manipulations was significant (p = .628 for rule-based vs. similarity-based, p = .673 for blocked vs. interleaved, p = .035 for constant vs. variable), indicating that participants in the constant condition learned the concept better than those in the variable condition. With regard to the within-category and between-category orders, no evidence was found suggesting that performance on learning stimuli was significantly impacted by the type of order.

Analysis of generalization patterns on transfer stimuli.
Figure 7 shows the average classification of the transfer items over the course of the transfer phase as a function of type of order (taken separately). Quantity p(A) is the observed proportion of trials on which each transfer item was classified into category A during transfer. To determine whether participants in different conditions applied different strategies for the classification of novel stimuli, we computed the distance of the observed generalization patterns to four specific strategies (distances were computed using the L1 metric and were normalized).
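The quantity p(A) and the normalized L1 distance mentioned above can be sketched as follows. The exact normalization used in the study is not specified here, so dividing by the number of transfer items (which bounds the distance between 0 and 1) is an assumption; the item labels and responses are hypothetical.

```python
def proportion_a(responses):
    """responses: dict mapping transfer item -> list of 'A'/'B' responses.

    Returns p(A), the proportion of 'A' responses, for each item.
    """
    return {item: sum(r == "A" for r in rs) / len(rs) for item, rs in responses.items()}


def normalized_l1(observed, predicted):
    """L1 distance between two p(A) patterns, divided by the number of items
    (assumed normalization)."""
    items = sorted(observed)
    return sum(abs(observed[i] - predicted[i]) for i in items) / len(items)


# Hypothetical participant: five transfer blocks, three transfer items shown.
responses = {
    "T1": ["A", "A", "B", "A", "A"],
    "T2": ["B", "B", "B", "B", "B"],
    "T3": ["A", "B", "A", "B", "B"],
}
observed = proportion_a(responses)
random_strategy = {item: 0.5 for item in observed}   # 50% chance of responding A
print(round(normalized_l1(observed, random_strategy), 3))  # 0.3
```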
We considered the following strategies: a rule-based strategy that uses Filling pattern (plain vs. striped stimuli) as main rule, a rule-based strategy that uses Shape (circles vs. squares) as main rule, a similarity-based strategy, and a random strategy. Participants adopting a rule-based strategy would classify new stimuli on the basis of the main rule (Filling pattern or Shape, depending on the chosen main rule), whereas participants adopting a similarity-based strategy would classify new stimuli on the basis of their similarity to the closest stored items. In the random strategy, novel stimuli would be randomly classified (with a 50% chance of being classified into category A). The rule-based strategy that uses Shape as main rule was included because Shape (like Filling pattern) allows participants to minimize the number of exceptions when used as the diagnostic dimension (see Sect. 2.5). For simplicity, the two rule-based strategies are called the filling pattern and shape rules, whereas the similarity-based strategy is called the similarity strategy. The putative classification of the transfer stimuli for each of the above-mentioned strategies is shown in Fig. 8. Figure 9A shows the average distance of the observed generalization patterns to the above-mentioned strategies. The closest strategies to the observed generalization patterns are the similarity strategy and the filling pattern rule, followed by the random strategy, and finally by the shape rule. The two-sided Wilcoxon-Mann-Whitney test was performed to assess the difference in distribution between the different strategies. We found a significant difference between the shape rule and the random strategy (p < .0001), and between the random strategy and the filling pattern rule (p < .0001), but not between the filling pattern rule and the similarity strategy (p = .16). Significance values were adjusted using the Benjamini-Hochberg correction with an FDR ≤ .05.
Since the shape rule and random strategy were the farthest from the observed patterns, they were excluded from the following analysis. Figures 9B, 9C, and 9D show the average distance of participants' generalization patterns to two specific strategies (the filling pattern rule and the similarity strategy), as a function of the type of order within each manipulation (within-category in Fig. 9B, between-category in Fig. 9C, and across-blocks in Fig. 9D). While distances to the similarity strategy were similar across types of order within the same manipulation, distances to the filling pattern rule varied widely. More specifically, generalization patterns of participants in the rule-based order were significantly closer to the filling pattern rule than those of participants in the similarity-based order (p = .034). The same applies to participants in the interleaved study as compared to those in the blocked study (p = .004). Participants in the constant order were not significantly closer or farther from the filling pattern rule than those in the variable [...]

To conduct a more thorough analysis of the transfer patterns, we partitioned the participants assigned to each type of order into groups, based on which strategy they best fit. For the partition, we considered all four strategies, and the best fit was determined by computing the L1 distance between the strategies and participants' generalization patterns. The strategy associated with the minimum distance was the best fit. When there were multiple best fits, all of them were taken into account (nine participants had two best fits, and two participants had three). Figure 10 shows the result of this partition in terms of number and percentage of participants, for within-category (Fig. 10A), between-category (Fig. 10B), and across-blocks (Fig. 10C) orders. Participants in the rule-based order most often used the filling pattern rule (42%), followed by the similarity strategy (30%), and the two remaining strategies (14% each). Conversely, participants in the similarity-based order most often used the similarity strategy (40%), followed by the filling pattern rule (30%), the shape rule (16%), and the random strategy (14%). Participants in the blocked order most often used the filling pattern rule and the similarity strategy (30% each), followed by the random strategy (20%), and the shape rule (18%). In the interleaved order, only a small percentage of participants used either the shape rule (10%) or the random strategy (8%), while the remaining subjects used either the filling pattern rule (44%) or the similarity strategy (38%). With regard to the across-blocks manipulations, the partition was similar across both types of order. The two-sided Fisher's exact test of independence at level 0.05 found a significant difference in the partition of subjects between the blocked and interleaved orders (p = .006), but not between the rule-based and similarity-based orders (p = .39), nor between the constant and variable orders (p = .93).
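The best-fit partition, including the handling of ties, can be illustrated with a short sketch. The strategy vectors below are hypothetical placeholders over seven transfer items (the actual putative classifications are those of Fig. 8 and are not reproduced here), and the tolerance used to detect ties is an assumption.

```python
def best_fits(pattern, strategies, tol=1e-9):
    """Return every strategy at minimum L1 distance from a participant's p(A) pattern."""
    dist = {
        name: sum(abs(p - q) for p, q in zip(pattern, predicted))
        for name, predicted in strategies.items()
    }
    d_min = min(dist.values())
    return sorted(name for name, d in dist.items() if d - d_min <= tol)


# Hypothetical predictions of p(A) for seven transfer items (NOT taken from Fig. 8).
strategies = {
    "filling_pattern_rule": [1, 1, 0, 0, 1, 0, 1],
    "shape_rule":           [1, 0, 0, 1, 1, 0, 1],
    "similarity":           [0, 1, 1, 0, 0, 1, 1],
    "random":               [0.5] * 7,
}

clear_case = [1, 1, 0, 0, 1, 0, 1]
tied_case = [1, 0.5, 0, 0.5, 1, 0, 1]   # equidistant from the two rule vectors
print(best_fits(clear_case, strategies))
print(best_fits(tied_case, strategies))
```

A participant with several best fits is returned under each tied strategy, mirroring the choice described above to take all best fits into account.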

Discussion
Previous studies on category learning have shown that the sequence in which stimuli are encountered can profoundly influence learning speed and category formation 16,33,45 . However, the totality of these studies have only focused on a subset of the presentation orders (either within-category or between-category manipulations), ignoring potential interactions between different types of manipulations. Here, we manipulated within-category, between-category, and across-blocks order manipulations within a single task to study their separate and combined impact on learning speed and generalization patterns. Our main questions of interest are (i) whether manipulation of stimulus order across blocks influences how fast categories are learned, (ii) whether this type of manipulation interacts with other manipulations of order such as interleaving, blocking, grouping by rule, or grouping by similarity, and (iii) whether these experimental conditions facilitate the use of specific strategies on novel stimuli after the categories are learned. Table 3. Frequency and presentation order of the 32 participants who were excluded from the analyses of the transfer phase. Eighteen participants did not reach the learning criterion, plus 14 participants were excluded because considered as outliers (among the 14 participants, 11 pressed the key associated with Category B less than 10% of the trials, and three participants pressed no key on more than 15% of the trials.

[Table 3 column headers: Rule-based, Similarity-based]

To address the first question, we used two survival analysis techniques to analyze the time required by participants to reach the learning criterion, as a function of the conditions within each order manipulation. Both the Kaplan-Meier survival curves and the Cox proportional-hazards model showed that the pace at which categories are learned is influenced by the across-blocks order. More specifically, the constant order was found to lead to faster learning than the variable order. Moreover, the analysis of subjects' performance on learned stimuli during transfer showed better retention in the constant order as compared to the variable one.
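To make the role of censoring concrete, here is a minimal sketch of the Kaplan-Meier (product-limit) estimator in plain Python. In this setting the "event" is reaching the learning criterion, so the estimated curve gives the proportion of participants who have not yet reached it after each block; participants who never reach it are kept as censored observations rather than discarded. The function name, the toy block counts, and the eight-block cap are hypothetical and do not come from the experiment.

```python
def kaplan_meier(times, reached):
    """Product-limit estimate of P(criterion not yet reached) after each event time.

    times   -- block at which the criterion was reached, or the last block
               completed for participants who never reached it
    reached -- True if the criterion was reached (event), False if censored
    """
    event_times = sorted({t for t, r in zip(times, reached) if r})
    survival = 1.0
    curve = []
    for t in event_times:
        # Everyone still observed at block t is at risk, censored or not.
        at_risk = sum(1 for u in times if u >= t)
        events = sum(1 for u, r in zip(times, reached) if r and u == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: six participants, task capped at eight blocks.
# The last two never reached the criterion and are censored at block 8.
times = [2, 3, 3, 5, 8, 8]
reached = [True, True, True, True, False, False]

curve = kaplan_meier(times, reached)
for block, prob in curve:
    print(f"block {block}: {prob:.3f} not yet at criterion")
```

A curve that drops faster corresponds to faster learning, which is how curves for two orders (for instance constant versus variable) would be compared visually before a formal test.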
To investigate whether the types of order interact in a more subtle fashion than what was revealed by the survival analyses, we performed a three-way ANCOVA with and without interactions. The between-subjects ANCOVA was used to analyze the interaction between within-category, between-category, and across-blocks orders on block-by-block performance after controlling for block number. The data were significantly better explained by the model with interactions, which revealed two significant interactions: one between within-category and across-blocks orders and the other between across-blocks and between-category orders. A further investigation of the interactions showed that the combination S+V (similarity-based + variable) is particularly detrimental for learning, whereas the combination B+C (blocked + constant) benefits learning. This nuanced pattern also appears (at least for the combination S+V) in the Kaplan-Meier and Cox analyses with the eight experimental conditions (Fig. 3B and Fig. 4B). The ANCOVA also reported significant main effects for the three main manipulations. The fact that the ANCOVA reported a higher number of main effects than the survival analyses might be due to differences in the number of observations available (one learning time per participant for the survival analyses vs. multiple block-by-block performance scores for the ANCOVA).
To summarize, orders interact in a nuanced fashion that cannot be simply explained by aggregating the negative/positive effects of each type of order. The combination S+V amplified the detrimental effect of the variable order, making it the slowest condition. On the other hand, the combination B+C amplified the beneficial effect of the constant order. These nuanced interactions appear on top of the main effect of across-blocks manipulation, in which the constant order yields faster learning and better retention than the variable one.
The superiority of the constant across-blocks presentation over the variable one might be attributed to the limited amount of information carried by the constant order. Limiting the variability of the sequences might have helped participants focus on diagnostic information, increasing the probability of either inducing the simplest rule or memorizing the category membership of the items. Another explanation is that memory can benefit from the repetition of the same sequences 48,49, in particular to group items. Such a grouping process could have benefited the formation of rules.

Figure 9. Average distance of participants' generalization patterns on novel stimuli to specific strategies (rule-based strategy using Filling pattern, rule-based strategy using Shape, similarity-based strategy, and random strategy), for all participants (A) and for each type of order within the same manipulation (B, C, D). Distances were first computed for each participant before being averaged. The L1 norm was used and distances were normalized prior to averaging. Asterisks show the significance of the two-sided Wilcoxon-Mann-Whitney test.
In this study, we used the 5-4 category set, which is often considered 'rule-based' in itself because of its use of discrete features. Therefore, our results most probably reflect the nature of this specific category structure. However, it is likely that more nuanced interactions would appear for other structures. For instance, an opposite effect could be hypothesized for information-integration category structures, where a rule based on binary decisions along dimensions is sub-optimal. For this type of structure, we might expect that the combination R+C would be the most detrimental, whereas the combination I+V would be the most beneficial. The bottom line is that order effects cannot be analyzed only in isolation, regardless of the category structure used in the task.
Our last set of analyses aimed at determining the influence of order manipulation on the classification of novel stimuli by analyzing the distance of the observed patterns to a set of main strategies. The rule-based strategy that uses Filling pattern as the main rule and the similarity-based strategy were found to be the closest to participants' generalization patterns. Moreover, distance to the filling pattern rule was significantly higher for participants in the similarity-based order (as compared to those in the rule-based order) and for participants in the blocked order (as compared to those in the interleaved order). These results show that both within-category and between-category orders affect how learning is transferred to novel stimuli, with the filling pattern rule being preferred more often in the rule-based and interleaved conditions than in the similarity-based and blocked conditions, respectively.

The fact that the rule-based condition more clearly produces rule-based classification than the similarity-based condition has already been shown in 45. Our results not only go in the same direction as those of Mathy and Feldman, but also extend their findings to between-category orders. Moreover, our analysis relies on more nuanced transfer patterns than those used in this previous work. Indeed, instead of using whichever response was given more often in the five transfer blocks, we used the classification probability for each transfer stimulus.

Figure 10. Partition of the participants in each main order type based on which strategy they most resemble. Graph (A) shows the within-category orders, graph (B) the between-category orders, and graph (C) the across-blocks manipulations. When there were multiple closest strategies, all of them were taken into account. The closest strategy was computed using the L1 norm.
Following the interpretation of Mathy and Feldman, the difference in strategy preference between rule-based and similarity-based learners might be attributed to the logic upon which the rule-based order is grounded. Presenting items following a "main rule (Filling pattern) plus exceptions" structure might have helped participants infer the simplest rule, encouraging them to classify new items using the same inferred strategy. In turn, the difference in preference between blocked and interleaved learners might find an explanation in the Sequential Attention Theory 72. Alternating stimuli from different categories (interleaved study) leads to an attentional focus on properties that discriminate the categories, which might have promoted a rule-based transfer of the knowledge. By contrast, when stimuli from the same category are presented sequentially (blocked study), the encoding of the similarities among items of the same category is stronger, which might have promoted the use of a similarity-based strategy. In addition, the structure of the 5-4 category set (in particular the fact that the within-category similarity exceeds the between-category similarity) might also have played a role.
Finally, for a more thorough exploration of the transfer patterns, participants in each condition were partitioned based on the strategy that their generalization pattern most resembled. Participants in the rule-based order preferred the filling pattern rule to the similarity strategy, whereas participants in the similarity-based order preferred the similarity strategy to the filling pattern rule. The blocked order promoted an almost equal use of the studied strategies, whereas the interleaved order almost exclusively promoted the use of the two most used strategies (i.e., the filling pattern rule and the similarity strategy).
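The partition described above can be sketched as follows, assuming that each participant's generalization pattern is a vector of classification probabilities (one per transfer stimulus) and that each strategy is a vector of predicted probabilities. The use of the L1 norm and the retention of all tied strategies follow the Figure 10 caption; the normalization (dividing by the number of stimuli), the function names, and all toy vectors are assumptions made for illustration and are not the predictions used in the study.

```python
def l1_distance(observed, predicted):
    """Mean absolute difference between two probability vectors (range 0 to 1)."""
    if len(observed) != len(predicted):
        raise ValueError("vectors must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def closest_strategies(observed, strategies, tol=1e-9):
    """Return (names of the closest strategies, all distances); ties are all kept."""
    distances = {name: l1_distance(observed, pred) for name, pred in strategies.items()}
    best = min(distances.values())
    closest = [name for name, d in distances.items() if d - best <= tol]
    return closest, distances


# Hypothetical predicted P(Category A) for four toy transfer stimuli.
strategies = {
    "filling_rule": [1.0, 1.0, 0.0, 0.0],
    "shape_rule": [1.0, 0.0, 1.0, 0.0],
    "similarity": [1.0, 1.0, 1.0, 0.0],
    "random": [0.5, 0.5, 0.5, 0.5],
}

# One hypothetical participant: proportion of "A" responses per stimulus.
participant = [0.8, 1.0, 0.2, 0.0]
closest, distances = closest_strategies(participant, strategies)
print(closest, distances)
```

Counting, within each order, how many participants fall closest to each strategy then yields a partition of the kind summarized in the text, with a participant contributing to several strategies when distances are tied.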
An additional contribution of the present study is the promotion of underemployed statistical tools. A common practice in psychology is to remove participants who did not fulfill the objective of the task 45,73,74 . Nevertheless, unsuccessful participants can carry useful information. In the present study, we made use of two survival analysis techniques (the Kaplan-Meier survival curves and the Cox model) that allow us to account for individuals who did not complete the task. We advise the use of similar statistical tools when the conditions allow them.
Study limitations and perspectives. Although Medin and Shaffer's 5-4 category set has multiple benefits (see Introduction), it has the disadvantage of presenting a clear "rule plus exceptions" structure, which most likely benefited orders that facilitate rule acquisition. Future research should thus attempt to derive analogous results for other category structures. For instance, one avenue is to manipulate the same types of order in information-integration tasks to determine whether opposite effects hold.
Another limitation of this category structure is that it allows the co-existence of two different main rules that minimize the number of exceptions (Filling pattern and Shape). Stimuli in the rule-based order were ordered following the "Filling pattern rule plus exceptions" structure. However, the dimension Shape could have equivalently been used instead of the dimension Filling pattern. In the same spirit, dimensions could have been instantiated by different features. For instance, Color could have distinguished the right and left objects within the cubes, Shape the objects at the front of the hypercube from those at the back, Size the objects at the top of the hypercube from those at the bottom, and Filling pattern the objects in the left cube from those in the right cube. We felt that enlarging our sample to cover all the above-mentioned variations would have been too costly.
An additional perspective might involve using established computational models of category learning (such as SUSTAIN 75 ) and recently developed ordinal models (such as the SAT-M 76 or the OGCM 68 ) to further explore the impact of various order combinations. Moreover, numerical simulations in the spirit of 77 could be used as well to search for optimal order combinations.