Disentangling locus of perceptual learning in the visual hierarchy of motion processing

Visual perceptual learning (VPL) can lead to long-lasting perceptual improvements. One of the central topics in VPL studies is the locus of plasticity in the visual processing hierarchy. Here, we tackled this question in the context of motion processing. We took advantage of an established transition from component-dependent representations at the earliest level to pattern-dependent representations at the middle level of cortical motion processing. Two groups of participants were trained on the same motion direction identification task using either grating or plaid stimuli. A set of pre- and post-training tests was used to determine the degree of learning specificity and generalizability. This approach allowed us to disentangle the contributions of different processing stages to the behavioral improvements. We observed complete bidirectional transfer of learning between component and pattern stimuli that moved in the same directions, indicating learning-induced plasticity associated with intermediate levels of motion processing. Moreover, we found that motion VPL is specific to the trained stimulus direction, speed, size, and contrast, diminishing the possibility of non-sensory, decision-level enhancements. Taken together, these results indicate that, at least for the type of stimuli and the task used here, motion VPL most likely alters visual computations associated with signals at the middle stage of motion processing.

A large body of evidence has shown that the human visual system can gain long-lasting perceptual improvements following several sessions of perceptual training. This phenomenon, called visual perceptual learning (VPL), has been an active area of research because VPL is a remarkable demonstration that human vision can remain plastic even in adulthood 1,2. Numerous studies have revealed training-induced perceptual improvements on a wide range of visual tasks, including low-level contrast and orientation discrimination tasks 3,4,5,6, mid-level motion and form tasks 7,8,9, and even high-level object and face recognition tasks 10,11.

While the robustness of learning effects is well established, debate persists with respect to the mechanisms underlying VPL. Early psychophysical work found that learning effects are usually confined to the trained parameters 6,12. Such strong specificity suggests that VPL most likely takes place within low-level visual areas (e.g., V1 or V2), since neurons therein exhibit narrow ranges of spatial and feature selectivity (e.g., orientation, motion direction). Recent evidence, however, challenges this idea by revealing an increasing number of cases where transfer of VPL to novel stimulus conditions and tasks is viable 13,14. This is consistent with an involvement of higher-level visual areas, wherein neurons usually respond to larger spatial areas and more complex stimulus features. Some studies even suggest contributions from brain areas that process non-sensory attributes. For instance, perceptual learning might manifest as a change of decision variables encoded in the prefrontal cortex 15. Alternatively, perceptual learning might facilitate encoding of abstract concepts representing basic visual features (e.g., orientation and contrast) 16 or lead to a better set of task-specific rules 17.
Given that these theories postulate changes beyond canonical sensory mechanisms, we refer to them as 'non-sensory' learning processes.

The task of linking VPL to specific brain areas is complicated by the complex functional specialization of the brain. The brain includes multiple regions that are organized into a coarse, but richly interconnected, hierarchy 18,19. Even a simple perceptual choice likely arises from the interplay among multiple brain regions. One strategy is to take advantage of visual processes where links between behavior and neural structures are well established. Here, we focus on VPL in the context of motion perception, a perceptual domain where we have a relatively advanced understanding of the different processing stages 20. In primates, neurons selective for motion direction first occur in the earliest cortical areas, V1 and V2 21. However, conscious motion perception is most closely linked to intermediate visual areas, such as MT and V3A. These areas contain a large proportion of neurons showing strong preferential responses to different motion directions 22,23,24,25. In addition, perceptual decisions based on motion stimuli have been linked to several higher-level brain areas (e.g., the lateral intraparietal cortex (LIP) and prefrontal cortex). These areas are often described as "evidence accumulators" that integrate sensory information provided by the upstream motion processing units in order to form perceptual decisions and guide visual behaviors 26,27 (but see ref. 28). Finally, non-sensory attributes, such as task rules and decision strategies, encoded in high-level cognitive areas, can also mediate performance in motion perception tasks 29. This complex hierarchy can be operationalized as a symbolic three-layer network (Figure 1).
This network consists of a low-level (e.g., V1/V2), a middle-level (e.g., MT/V3A), and a high-level (e.g., LIP, prefrontal cortex) processing stage.

In contrast with the established understanding of visual motion processing stages, their role in human VPL is largely unknown. To address this question, we took advantage […] and non-sensory high-level cognitive processes.

Participants and apparatus
Fourteen undergraduate students from the University of Rochester (18 to 22 years old, 5 males and 9 females) took part in this study. All participants had normal or corrected-to-normal vision. […]

[Figure 2 caption, partial: …and post-test conditions. Participants viewed a moving stimulus that was either a grating or a plaid (arrows are for illustration purposes only). Stimulus duration varied on each trial, as determined by two interleaved staircases. Participants indicated the perceived stimulus direction via button press (e.g., left vs. right in this case).]

Stimulus and task settings
Participants were randomly assigned to two groups: one group trained on component motion stimuli (grating; N = 8) and the other trained on pattern motion stimuli (plaid; N = 6). All participants were tested and trained on a two-alternative forced-choice motion direction identification task (Figure 2), reporting the perceived stimulus motion direction via key press. Auditory feedback was provided after each trial during the training phase but not at pre-/post-test (to minimize learning effects in the pre-/post-tests). To facilitate fixation, we used the following fixation sequence (Figure 2): a fixation circle (0.8° radius) appeared after each key press response; the circle shrank to 0.13° over 200 ms, remained at that size for 360 ms, and then disappeared 360 ms before stimulus onset. We found in our previous work that this dynamic fixation sequence was very effective in guiding eye gaze to the center of the screen before stimulus onset 34. The inter-trial interval was 1000 ms.
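The adaptive procedure behind the duration thresholds can be sketched in code. The paper specifies two interleaved staircases controlling stimulus duration but not the exact rule, so the 3-down/1-up rule below (converging near ~79% correct), the step size, and the duration floor are all illustrative assumptions, not the authors' settings.

```python
# Hypothetical sketch of one interleaved-staircase scheme for adapting
# stimulus duration. The 3-down/1-up rule, step size, and floor are
# assumptions for illustration; the paper does not specify them.

class Staircase:
    """One adaptive staircase over stimulus duration (ms)."""

    def __init__(self, start_ms, step_ms, floor_ms=10.0):
        self.duration = start_ms
        self.step = step_ms
        self.floor = floor_ms
        self.n_correct = 0  # consecutive correct responses

    def update(self, correct):
        """Update the duration after a trial; return the next duration."""
        if correct:
            self.n_correct += 1
            if self.n_correct == 3:              # 3 down: make it harder
                self.duration = max(self.floor, self.duration - self.step)
                self.n_correct = 0
        else:                                    # 1 up: make it easier
            self.duration += self.step
            self.n_correct = 0
        return self.duration

# Two staircases interleaved across trials, as in the paper:
staircases = [Staircase(400.0, 20.0), Staircase(300.0, 20.0)]
next_duration = staircases[0].update(True)  # trial 1 uses staircase 0, etc.
```

Interleaving two staircases with different starting durations helps keep participants from anticipating the difficulty of the next trial.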

[Figure 2 trial timeline: dynamic fixation 560 ms, blank 360 ms, stimulus, response, ITI 1000 ms]

As detailed below, the two training groups used partially overlapping sets of pre- and post-test conditions. We selected this design to limit pre- and post-test sessions to only the most diagnostic test conditions for each group. This allowed us to test the bidirectional transfer between component and pattern motion, as well as the dependency of learning transfer on several key low-level stimulus features.

In the component-training group, the training stimulus was a grating (contrast = 50%, drift speed = 4°/s, radius = 8°, 2D raised cosine spatial envelope; spatial frequency […])

To estimate duration thresholds for each pre- and post-test condition, we fit Weibull psychometric functions to 160 trials of raw data using the maximum likelihood method, estimating thresholds at 82% correct. The amount of learning in each condition was estimated by computing the percent of improvement (PI):

PI = (threshold_pre − threshold_post) / threshold_pre × 100%    (1)
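The threshold-estimation and PI analysis can be sketched as follows. The Weibull form for a 2AFC task and Equation (1) come from the text; the fixed 50% guess rate, the absence of a lapse parameter, and the simple grid-search maximizer are our assumptions standing in for the authors' fitting routine.

```python
# Minimal sketch of the analysis: a 2AFC Weibull psychometric function fit
# by maximum likelihood, the 82%-correct duration threshold, and percent
# improvement per Eq. (1). Guess rate, lapse handling, and the grid-search
# optimizer are illustrative assumptions, not the authors' exact routine.
import math

def weibull_2afc(x, scale, shape):
    """Probability correct at duration x for a 2AFC Weibull function."""
    return 0.5 + 0.5 * (1.0 - math.exp(-((x / scale) ** shape)))

def threshold_at(p_target, scale, shape):
    """Invert the Weibull to get the duration yielding p_target correct."""
    q = 1.0 - 2.0 * (p_target - 0.5)          # remaining exp(...) term
    return scale * (-math.log(q)) ** (1.0 / shape)

def fit_weibull_ml(durations, correct, scales, shapes):
    """Grid-search maximum-likelihood fit over candidate (scale, shape)."""
    best, best_ll = None, -math.inf
    for s in scales:
        for k in shapes:
            ll = 0.0
            for x, c in zip(durations, correct):
                p = min(max(weibull_2afc(x, s, k), 1e-6), 1.0 - 1e-6)
                ll += math.log(p) if c else math.log(1.0 - p)
            if ll > best_ll:
                best, best_ll = (s, k), ll
    return best

def percent_improvement(thr_pre, thr_post):
    """Equation (1): PI = (pre - post) / pre * 100%."""
    return (thr_pre - thr_post) / thr_pre * 100.0
```

For example, a pre-test threshold of 100 ms dropping to 60 ms at post-test gives PI = 40%.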

where threshold_pre and threshold_post indicate duration thresholds for the corresponding pre- and post-test stimulus conditions. We used paired t-tests for comparisons of pre- and post-test thresholds and for comparisons of PI across stimulus conditions. One-sample t-tests were used for assessing the statistical significance of PI against the null hypothesis of no improvement (PI = 0). […]

The main focus of this paper is to examine the transfer of perceptual learning to a range of diagnostic stimulus conditions. A two-stage criterion was used to assess transfer of learning. First, we concluded that learning transfers to a stimulus condition if the pre-/post-test difference on this condition was statistically significant. If a stimulus condition passed this first test, then we compared its PI to that of the corresponding trained condition (i.e., either the trained component or the trained plaid). If the transfer PI was significantly smaller than the trained PI, the result was described as a "partial transfer". Alternatively, if the PI for a transfer condition was not statistically smaller than the PI for its corresponding trained condition, we referred to it as "complete transfer", following an established convention in VPL research 13,16,17.

The key aim of this study was to determine whether perceptual training leads to […]
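The two-stage transfer criterion above can be written out as a small decision function. The function name, signature, and the convention of passing in pre-computed t-test p-values are our own; only the decision logic follows the paper.

```python
# Sketch of the two-stage transfer criterion. Inputs are pre-computed
# statistics (p-values from the paired t-tests and the two PI values);
# the function name and signature are illustrative, not from the paper.

def classify_transfer(p_prepost, pi_transfer, pi_trained, p_pi_diff,
                      alpha=0.05):
    """Stage 1: was the pre-/post-test difference significant?
    Stage 2: was the transfer PI significantly smaller than the
    trained condition's PI?"""
    if p_prepost >= alpha:
        return "no significant transfer"
    if p_pi_diff < alpha and pi_transfer < pi_trained:
        return "partial transfer"
    return "complete transfer"

# e.g., a condition that improved significantly and whose PI is not
# statistically smaller than the trained condition's PI:
label = classify_transfer(p_prepost=0.01, pi_transfer=35.0,
                          pi_trained=40.0, p_pi_diff=0.30)
```

Note that "complete transfer" here is a failure to find a significant PI difference, which is why the paper flags it as a convention rather than positive evidence of equality.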

Specificities to direction, speed, size, and contrast
We have thus far focused on experimentally disentangling component-dependent from pattern-dependent VPL, with the results arguing against low-level component-dependent VPL. What remains unclear, however, is whether the perceptual training led to enhancements in the processing of sensory features or of high-level non-sensory attributes. For instance, participants might learn motion directions as abstract concepts 16 or become more familiar with the general task statistics (e.g., stimulus timing, stimulus-response association 17). In that case, plasticity takes place higher in the brain hierarchy, independent of sensory processing. To further delineate plasticity in the sensory (Figure 6A-B) versus the non-sensory processing (Figure 6C-D), we examined the tolerance of our training effects across several other forms of stimulus variation, i.e., direction, speed, size, and contrast. The prediction is that if the plasticity is largely limited to sensory processing, learning should be confined to the trained stimuli; otherwise, learning effects will transfer irrespective of variations in other stimulus features.

The results indicated a notable specificity to stimulus variations. In the component-training group, we did not find significant transfer for trained and test stimuli that differed in motion direction (Figure 6E left panel, pre-/post-test, t(7) = 1.886, p = 0.101; Figure 6F left panel, PI, t(7) = 2.016, p = 0.084). We also found no significant transfer to test stimuli that had smaller size (Figure 6E left panel, pre-/post-test, t(7) = 1.308, p = 0.232; Figure 6F left panel, PI, t(7) = 1.376, p = 0.211) or lower contrast (Figure 6E left panel, pre-/post-test, t(7) = 2.187, p = 0.065; Figure 6F […]).

Taken together, we find that motion VPL is specific to stimulus direction, speed, size, and contrast.
These results demonstrate that our training effects are strongly susceptible to variations in basic visual features. Such strong dependencies indicate that a broadly tuned non-sensory learning mechanism is unlikely to play an important role in the observed learning, because such a mechanism predicts broad transfer over variations in low-level stimulus features. Note that we cannot completely eliminate the possibility of changes in sensory readout mechanisms since, theoretically, a refined readout mechanism can be sensitive to changes […] and argue against plasticity in high-level brain areas that represent non-sensory cognitive factors, such as general task statistics and decision rules 15.

[…] LIP, but minimal changes in neural activities in area MT. This study advocates a mechanism beyond the sensory-representation level, where training results in a more efficient extraction of useful sensory information rather than in an enhancement of sensory representations per se. In contrast, recent fMRI studies found that motion VPL refines the cortical tuning of human MT, emphasizing the pivotal role of enhancement at the sensory-representation level 48,49. Notably, the mechanistic role of high-level cognitive influences in sensory processing is still largely unknown. Previous studies have suggested at least two broad categories: mechanisms that are sensory (e.g., selective readout) and those that are non-sensory (e.g., conceptual learning, rule-based learning). While disentangling these higher-level processes is beyond the scope of this paper, the observed specificity to basic stimulus features argues against non-sensory cognitive factors.
What are the possible neural underpinnings of the empirical findings in the present work? We surmise that several mechanisms may coexist and interact. First, because training on a plaid motion stimulus does not fully transfer to its two components (Figure 5E), we conclude that a significant part of the relevant plasticity occurs downstream from the low-level motion mechanisms. Given the evidence that MT neurons analyze pattern motion by selectively integrating inputs from a population of V1 neurons 38, one possible mechanism is that learning improves information transmission from the low-level to the middle-level motion processing. Such a mechanism is consistent with findings of a recent study in which attention was shown to improve the amount of information transferred from V1 to hMT+ 50. Moreover, learning effects in our study are specific to direction, speed, contrast, and size, indicating critical roles of neuronal tuning to these low-level visual features. For example, stimulus contrast and size have strong influences on neural responses in motion processing 51. This is also in line with our previous findings showing that motion perception is strongly modulated by stimulus contrast and size 52,53, behavioral findings that have been linked to mechanisms within area MT 54,55.

In summary, our study provides evidence for training-induced plasticity at the intermediate stage of motion processing, and highlights the significance of basic motion-related visual attributes in mediating the transfer of motion VPL.