Introduction

Human perception is not a sequence of snapshots of the outer world but a constructive process to cope with the ill-posed problems of vision. For example, we perceive a 3-D world but our retinae provide only 2-D representations. Hence, the third dimension has to be reconstructed.

fMRI studies have shown that activity in the primary visual cortex V1 is not just a retinotopic copy of the retinal stimulus but also reflects subjective perception. For example, bistable percepts lead to different V1 activities depending on the percept even though the stimulus is identical1. In intensity filling-in, a static object fades and disappears into the background. Activity in V1 decreases at the retinotopic location of the object during periods of subjective fading2. Another classical example of the subjective and constructive nature of vision is apparent motion. For example, a square is presented successively at two distinct spatial locations (figure 1A). As described by Korte's laws, if timing and distance are in a certain range, observers perceive one square moving smoothly from the first to the second location along the shortest path between these locations3,4. Hence, the square is perceived at intermediate locations where it was never physically displayed. Where in the human brain is the apparent motion component “added”? Recent fMRI studies have shown that the human middle temporal complex (hMT+) activates neurons in V1 corresponding to unstimulated locations on the trajectory of the “moving” square5,6,7.

Figure 1

The 6 experimental conditions.

(A, lAM) linear apparent motion: a square is first presented for 150 ms at +5.9° on the x-axis and at +5.9° on the y-axis with respect to fixation (upper position). An individual ISI follows and then the square is presented again for 150 ms at +5.9° on the x-axis and −5.9° on the y-axis (lower position). (B, lRM) linear real motion: a square moves smoothly between the upper and lower position following a linear trajectory (black arrows on the figure depict smooth trajectories and were not shown on the stimulus). (C, Fli) flickering squares: as in the lAM condition, except that the two squares are presented simultaneously. (D, pgAM) path-guided apparent motion: the squares are presented as in lAM and in addition a faint cue is flashed during the ISI, resulting in the percept of curved motion. (E, cRM) curved real motion: a square moves smoothly between the upper and lower positions following a curved trajectory similar to that in pgAM (black arrows on the figure depict smooth trajectories and were not shown on the stimulus). (F, Cue) cue: only the faint cue is presented. Real motion conditions were used to assess an upper activity level. The flickering squares condition served as a no-motion baseline.

How does the visual system select a particular apparent motion trajectory from infinitely many possible paths that link the two inducing flashes? A common observation about apparent motion is that its trajectory usually consists of the shortest path between two elements, such as the straight line between two squares5,7 or a curved line between oblique inducers6. This observation led to the hypothesis that the human brain imposes a “minimal energy” constraint in determining the interpolated trajectory of apparent motion. One possibility is that the minimal energy constraint is realized by (hard) neural wiring in V1, e.g., by strong connections between neurons corresponding to shortest paths.

Notwithstanding the common observation of shortest-path trajectories, apparent motion is rather “flexible” in that it can also follow non-minimal energy paths. In the path-guided apparent motion paradigm8, flexible motion paths can be created by presenting a short, almost invisible cue in addition to, for example, the squares. To test the generality of the relationship between V1 activity and perceived motion trajectories, we chose a curved cue so as to generate a curved path, rather than the minimal linear path, for the perceived trajectory of apparent motion. If V1 activity is truly the basis for subjective apparent motion, rather than an epiphenomenon of shortest-path connectivity, we expect that neural activity should also reflect filling-in of this curved non-minimal path.

Using fMRI, we compared activity within two retinotopically mapped regions of interest (ROIs), corresponding to two different possible apparent motion trajectories: linear and curved. Six different experimental conditions were presented: 1) linear apparent motion, 2) linear real motion, 3) flickering object, 4) curved apparent motion, 5) curved real motion and 6) flickering cue (details in Methods section).

Results

Beta estimates as a function of stimulus type and ROI are shown in Fig. 2I–K. A two-way repeated-measures ANOVA, with stimulus type and ROI as factors, shows a significant main effect of stimulus type (F(5,40) = 3.36, p = 0.012) and there is a significant interaction (F(10,80) = 10.15, p < 0.001). Although the cortical volumes of the ROIs were significantly different as shown in Fig. 2H (one-way repeated-measures ANOVA, main effect: F(2,16) = 14.5, p < 0.001), there was no overall difference in activity between the different ROIs (F(2,16) = 2.92, p = 0.08). Therefore, the presented results cannot be explained by the differences in eccentricity of the mapped ROIs.
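The degrees of freedom reported above follow directly from the design: six stimulus types, three ROIs and nine participants. The following is a minimal sketch of that bookkeeping, assuming a fully within-subject two-way design with no missing cells. The function name `rm_anova_dfs` is illustrative and is not part of the original analysis.

```python
def rm_anova_dfs(levels_a, levels_b, n_subjects):
    """Return (effect df, error df) for each term of a two-way
    repeated-measures ANOVA in which both factors are within-subject."""
    df_a = levels_a - 1
    df_b = levels_b - 1
    df_s = n_subjects - 1
    return {
        "A": (df_a, df_a * df_s),
        "B": (df_b, df_b * df_s),
        "AxB": (df_a * df_b, df_a * df_b * df_s),
    }


# Six stimulus types (A), three ROIs (B), nine participants.
dfs = rm_anova_dfs(levels_a=6, levels_b=3, n_subjects=9)
print(dfs)  # {'A': (5, 40), 'B': (2, 16), 'AxB': (10, 80)}
```

The same function with two levels per factor and nine participants gives (1, 8) for every term, which matches the degrees of freedom of a 2 × 2 design with this sample size.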

Figure 2

Mapping stimuli, Regions of interest (ROIs) and Results.

Three ROIs were defined using flickering checkerboards corresponding to the path of curved motion (A) and the path of linear motion (B). In addition, we defined a control ROI as the mirror image of the curved path (C). The retinotopic maps produced by the retinotopic mapping procedure are shown in D–E. The cortical representations of the three ROIs are shown on the flattened cortex of subject GJ (F–G). The ROI shown in red corresponds to the path-guided motion, the ROI shown in blue corresponds to the linear motion and the ROI shown in green corresponds to the control region. The average volume of each ROI is shown in H. (I) Beta estimates within the ROI of the linear path: lAM activity is significantly higher than in the Flicker condition (t(8) = 2.42, p = 0.042). (J) Beta estimates within the ROI of path-guided motion: pgAM activity is significantly higher than Flicker activity (t(8) = 3.84, p = 0.005), Cue activity (t(8) = 2.36, p = 0.046) and lAM activity (t(8) = 2.94, p = 0.019). (K) Changes in BOLD signal within the control ROI: activity is not significant in any condition. Vertical bars represent the standard error of the mean and stars highlight beta estimates significantly different from 0.

First, our results show minimal energy path filling-in in the apparent motion condition lAM, replicating previous findings5,6,7. Activity within the linear path ROI was significantly higher in the lAM than in the Flicker condition, as shown by a paired t-test (t(8) = 2.42, p = 0.042; figure 2I and table 1). In addition, activity along the linear path was very similar in the apparent motion (lAM) and real motion (lRM) conditions (t(8) = −0.13, p = 0.9; figure 2I and table 1).
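The paired comparisons reported here have eight degrees of freedom because each of the nine participants contributes one beta estimate per condition. The sketch below shows how such a paired t statistic is formed from per-subject differences. The data values are invented purely for illustration and are not the study's data, and `paired_t` is a hypothetical helper name.

```python
import math


def paired_t(x, y):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased sample variance of the per-subject differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("differences have zero variance")
    standard_error = math.sqrt(var_d / n)
    return mean_d / standard_error, n - 1


# Hypothetical per-subject mean betas (nine subjects), illustration only.
lam_betas = [0.9, 0.4, 0.7, 0.2, 0.5, 0.8, 0.3, 0.6, 0.1]
flicker_betas = [0.5, 0.3, 0.2, 0.3, 0.1, 0.6, 0.2, 0.1, 0.2]

t_value, df = paired_t(lam_betas, flicker_betas)
print(f"t({df}) = {t_value:.2f}")
```

Converting the statistic to a p-value additionally requires the cumulative distribution of Student's t, which is omitted here to keep the sketch free of external dependencies.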

Table 1 Pre-defined statistical comparisons.

Second, we found similar results for the path-guided and the linear conditions. Activity within the ROI corresponding to the curved path was significantly higher in the pgAM than in the Flicker condition (t(8) = 3.84, p = 0.005; figure 2J and table 1). This activity is not caused by the cue. Activity within the curved path ROI was higher in the pgAM than in the Cue condition (t(8) = 2.36, p = 0.046; figure 2J and table 1). Furthermore, activity within the curved path ROI was significantly higher in the pgAM condition than in the lAM condition (t(8) = 2.94, p = 0.019; figure 2J and table 1).

Comparing the linear path and curved path ROIs (Fig. 2I vs 2J), activity in the linear path ROI is less selective to the apparent motion stimulus than activity in the curved path ROI. For the linear path ROI, both apparent motion stimuli generate significant activity as shown by one-sample t-tests (lAM: p = 0.005; pgAM: p = 0.02), while in the curved path ROI only the corresponding pgAM condition elicits significant activity (lAM: p = 0.53; pgAM: p < 0.001). In addition, a two-way repeated-measures ANOVA was computed on the beta values within the linear path and curved path ROIs for conditions lAM and pgAM. There is no main effect of ROI (F(1,8) = 4.75, p = 0.06) or condition (F(1,8) = 1.15, p = 0.32), but there is a significant interaction between the two (F(1,8) = 28.1, p < 0.001), thus highlighting the double dissociation between the two apparent motion stimuli and their corresponding linear and curved paths of motion.

We also investigated the beta estimates for the different experimental conditions within a control ROI. We expected low activity for all conditions since there was no physical stimulation within this ROI. In addition, as reported by all subjects, no apparent motion was perceived in this region. The results are in accordance with our predictions (figure 2K).

Discussion

Percepts are not images of the outer world. Percepts can strongly differ from external world stimulation. A classic example is apparent motion where one element is presented successively at two different locations. Observers perceive a smooth and compelling motion between the two locations instead of two static flashes. Where are the neural correlates of this subjective experience in the human brain?

Previous work has shown that neurons in V1 are active when their receptive fields correspond to locations on the trajectory of apparent motion, even though there is no stimulation at these locations. When elements are flickered, instead of presented successively, there is no activation, ruling out the possibility that lateral spread of neural activity has caused this activity5,6,7. We reproduced these findings (figure 2I). Does V1 activity, hence, reflect subjective experience? If so, V1 activity should retinotopically represent all sorts of apparent motion trajectories in a flexible manner. This is indeed what we found.

We used a path-guided apparent motion paradigm8 in which a cue determines the perceived motion trajectory. We used a cue that creates the perception of a curved trajectory between two squares rather than the linear, minimal energy path. We found that V1 activity “followed” the curved path (figure 2J). In addition, activity in the apparent motion condition was very similar to activity when the square moved in real curved motion (figure 2I). Enhanced activity in the path-guided apparent motion condition is not caused by the faint cue, since activity within the corresponding ROI was very weak in the Cue condition and much weaker than in the path-guided apparent motion condition pgAM (figure 2J).

Subjective experience is commonly thought to occur at higher stages of visual processing9. In accordance with previous studies5,6,7, we found further evidence that subjective experience can correlate with activity as early as V1. Furthermore, it was shown that activity in V1 is modulated by the spatio-temporal context of illusory motion10, supporting the idea that subjective experience is reflected in early visual areas. However, we do not claim that these correlates constitute the causal basis for subjective experience. First, it might be that recurrent activity changes V1 patterns, which are not related to conscious experience. Effects of recurrent activity on V1 neural firing were found in figure-ground segmentation11,12, 3-D perception13 and in the aperture problem of motion perception14. Second and more importantly, our findings show that both the linear- and curved-path ROIs are activated by more than one stimulus configuration and that the linear-path ROI is less selective to stimulus configurations than the curved-path ROI. For instance, the pgAM stimulus generates the percept of motion along the curved path, yet it elicits activity in both the linear-path ROI and the curved-path ROI, though not in the control ROI. Hence, there is no one-to-one correspondence between the percept and the retinotopic activity. Thus, the activity in V1 may be viewed as representing a small subset of infinitely many possible solutions for ambiguous stimuli. From this perspective, the fact that the linear-path ROI was activated by more stimulus configurations than the curved-path ROI may represent the neural correlate of the primacy of the minimal path/energy constraint in apparent motion. If V1 activity indeed simultaneously represents a small subset of solutions to ambiguous stimuli, one can speculate that activities at higher cortical levels are the ones determining the prevailing percept.

Methods

Participants

Nine healthy participants (4 male) from the University of Houston participated in the experiments. All experiments were conducted according to a protocol approved by the University of Houston Committee for the Protection of Human Subjects, in accordance with the federal regulations, 45 CFR 46, the ethical principles established by the Belmont Report and the principles expressed in the Declaration of Helsinki. Participants provided written informed consent approved by the University of Houston Committee for the Protection of Human Subjects. Mean age was 31.1 years (±3.26) ranging from 22 to 53 years. All participants had normal or corrected to normal vision.

Visual presentation

Stimuli were designed with the Adobe Creative Suite 5 Design Premium. Stimuli were displayed with a NEC NP200 projector (brightness 2100 lumens, contrast ratio 1300:1, horizontal sync rate 31.35–68.7 kHz, vertical sync rate 56–85 Hz, resolution 1024 × 768) on a rear projection screen and viewed by subjects via an adjustable mirror fixed to the headcoil from a distance of 95 cm.
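Stimulus positions and sizes in this study are specified in degrees of visual angle, so reproducing the display requires converting them to physical lengths at the 95 cm viewing distance stated above. The sketch below assumes a flat screen perpendicular to the line of sight and treats the optical path via the mirror as a straight 95 cm. The function names are illustrative, and the 5.9° offset and 1.7° square size are taken from the stimulus description.

```python
import math

VIEWING_DISTANCE_CM = 95.0  # viewing distance reported in the text


def offset_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Distance on the screen between fixation and a point angle_deg away."""
    return distance_cm * math.tan(math.radians(angle_deg))


def size_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Extent of an object subtending angle_deg, centred on the line of sight.

    For an eccentric object this is only an approximation."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


print(round(offset_to_cm(5.9), 2))  # horizontal or vertical offset of the square
print(round(size_to_cm(1.7), 2))    # approximate side length of the square
```

Under these assumptions the 5.9° offsets correspond to roughly 9.8 cm on the screen and the 1.7° square to roughly 2.8 cm. Converting further to pixels would require the physical width of the projected image, which is not given.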

Similarly to Muckli et al.5, a white square of a size of 1.7° was presented on a grey background on the right side of the screen (the Weber contrast was 1.17). A blue fixation cross was presented centrally throughout stimulus presentation (the Weber contrast was 0.76). There were six experimental conditions (figure 1A–F):

  1. Linear apparent motion (lAM, figure 1A). First, the square was presented for 150 ms at 5.9° to the right and 5.9° above fixation. An inter stimulus interval (ISI) followed and then the square was presented at 5.9° to the right and 5.9° below fixation. The ISI was adjusted individually for each subject to optimize smooth motion sensation (for 6 subjects 70 ms, for 1 subject 50 ms and for 1 subject 90 ms). Presentation frequencies ranged from 2.08 to 2.78 Hz.

  2. Linear real motion (lRM, figure 1B). A square was presented smoothly moving vertically between the upper start and lower end points. For each frame, the square moved by 0.83° resulting in a constant speed of 42.5°/sec (figure 2B). The lRM condition is used as a baseline condition to obtain an upper V1 activity limit.

  3. Flickering squares (Flicker, figure 1C). The two squares appeared and disappeared synchronously at the upper and lower positions, respectively. The flicker rate ranged from 2.08 to 2.78 Hz corresponding to the individual ISIs of observers in the lAM condition. The Flicker condition served as a no-motion baseline condition.

  4. Path-guided apparent motion (pgAM, figure 1D). The squares were presented for 150 ms at the upper and lower positions as in the lAM condition. The individual ISI duration of each observer was used. In addition, a very dim grey cue was presented at half of the ISI for 10 ms with a Weber contrast of 0.0043. The cue was a curve with a radius of 9.1° ranging from the upper to the lower position (figure 1D). The cue induced an apparent motion percept along a curved trajectory.

  5. Curved real motion (cRM, figure 1E). The square moved from the upper to the lower position along a trajectory corresponding to the pgAM path with a radius of 9.1°.

  6. Flickering luminance cue (Cue, figure 1F). Only the cue of the pgAM condition was presented, i.e., no squares. The timing was identical to the pgAM condition. This condition was used to determine cue induced activity, if any.

In all conditions, stimuli were presented for 14 s. For example, in the lRM condition, the square moved 29 times back and forth. In the lAM condition the square was presented 29 to 35 times at the upper and lower positions according to individual ISIs. Subjects were instructed to keep fixation on the blue cross.
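The relation between the individual ISI, the presentation frequency and the number of presentations per 14 s block can be checked with a few lines of arithmetic. The sketch below assumes that one full apparent motion cycle consists of the upper flash, an ISI, the lower flash and a second ISI. That cycle structure is an assumption made for illustration, as are the function names.

```python
FLASH_MS = 150   # flash duration reported in the text
BLOCK_S = 14     # stimulation block duration reported in the text


def cycle_ms(isi_ms, flash_ms=FLASH_MS):
    """Duration of one assumed cycle: flash, ISI, flash, ISI."""
    return 2 * (flash_ms + isi_ms)


def frequency_hz(isi_ms):
    """Number of full cycles per second."""
    return 1000.0 / cycle_ms(isi_ms)


def cycles_per_block(isi_ms, block_s=BLOCK_S):
    """Number of complete cycles that fit into one stimulation block."""
    return (block_s * 1000) // cycle_ms(isi_ms)


for isi in (50, 70, 90):
    print(isi, round(frequency_hz(isi), 2), cycles_per_block(isi))
# 50 2.5 35
# 70 2.27 31
# 90 2.08 29
```

Under this assumption, the 90 ms ISI gives the reported lower frequency bound of 2.08 Hz, and the 50 ms and 90 ms ISIs give 35 and 29 presentations per block, matching the stated range of 29 to 35.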

After the individual ISI was set and prior to data acquisition, all conditions were presented once to the participants to ensure that the correct percepts were induced (linear motion for lRM and lAM, curved motion for cRM and pgAM, no motion for Fli and Cue). To ensure that attention to stimuli was reliably maintained, a detection task was implemented. The blue fixation cross turned red for 250 ms on various occasions and subjects were instructed to report the color change by pressing a remote button with their right hand. All subjects could reliably detect the color change.

Data acquisition

Functional images were collected with a SIEMENS I-class MAGNETOM 3T Trio machine at the Human Neuroimaging Laboratory of the Baylor College of Medicine, Houston. Each subject underwent a T1-weighted anatomical scan and a standard retinotopic mapping procedure15 to identify the borders between the early visual areas. Eight concentric rings with growing radius (0.45°, 0.85°, 1.43°, 2.3°, 3.4°, 4.9°, 6.9° and 10°), filled with flickering checkerboard patterns, were presented successively over 32 s; the sequence was repeated 4 times. Eccentricity maps were computed to localize the cortical representation of the fovea. A rotating wedge continuously moving in the clockwise direction was displayed for 4 complete cycles of 32 s each. Polar angle mapping was then used to identify the borders of V1. An example of the retinotopic maps for a representative subject is presented in figure 2D–E. Next, a mapping of regions of interest (ROIs) determined subregions of V1. Finally, functional images were collected during presentation of the 6 experimental conditions.

For the structural scan, a total of 192 slices were acquired with a voxel size of 1.0 × 1.0 × 1.0 mm³, a TR of 1200 ms, a TE of 2.66 ms, a flip angle of 12° and a 256 × 256 acquisition matrix.

For the functional scans, 37 slices per volume were acquired with a voxel size of 3.4 × 3.4 × 4.0 mm³, a TR of 2000 ms, a TE of 25 ms, a flip angle of 90° and a 64 × 64 acquisition matrix.

During the ROI mapping procedure, each stimulus was presented for 32 s with an interleaved resting period of 12 s. During the experimental conditions, two blocks of six conditions were presented. Each stimulus was displayed for 14 seconds with an interleaved resting period of 12 s.
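With a TR of 2000 ms, each 14 s stimulation period spans 7 volumes and each 12 s rest period spans 6 volumes. The sketch below builds one boxcar (on/off) time course per experimental condition from this timing. It assumes that each stimulation period is followed by its rest period, that the run starts with stimulation, and that conditions appear in a fixed order within each of the two blocks. The actual presentation order is not stated, so the order and all names used here are placeholders.

```python
TR_S = 2      # repetition time of the functional scans, in seconds
STIM_S = 14   # stimulation period, in seconds
REST_S = 12   # interleaved rest period, in seconds

# Placeholder order; the real order of conditions is not given in the text.
CONDITIONS = ["lAM", "lRM", "Fli", "pgAM", "cRM", "Cue"]


def boxcars(order, n_blocks=2):
    """Return {condition: list of 0/1 per volume} for the assumed timing."""
    stim_vols = STIM_S // TR_S
    rest_vols = REST_S // TR_S
    sequence = list(order) * n_blocks
    n_vols = len(sequence) * (stim_vols + rest_vols)
    regressors = {cond: [0] * n_vols for cond in order}
    start = 0
    for cond in sequence:
        for vol in range(start, start + stim_vols):
            regressors[cond][vol] = 1
        start += stim_vols + rest_vols
    return regressors


regs = boxcars(CONDITIONS)
print(len(regs["lAM"]), sum(regs["lAM"]))  # 156 14
```

In a GLM analysis such boxcars are convolved with a hemodynamic response function before model fitting; that step is left out here.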

Mapping of regions of interest

Three ROIs were mapped independently using a set of static stimuli filled with checkerboard patterns flickering at 5 Hz. The strategy was to identify which subregions of V1 represented the different possible paths of motion, according to its retinotopic organization. A region representing the path of linear motion (figure 2A) and a region representing the path of curved motion (figure 2B) were mapped. A control region corresponding to the mirror image of the curved path was also mapped (figure 2C).

The ROI mapping procedure identified clusters of voxels for each ROI. An example of the 3 ROIs identified for a representative subject is presented in figure 2F. Each patch is composed of neurons sensitive only to a given part of visual space. Spatial relations between the mapped elements are well preserved in V1.

Analysis

Data were analyzed with the SPM8 software (Wellcome Department of Cognitive Neurology, London, UK). Functional data were preprocessed using slice-timing correction, which accounts for differences in acquisition time between slices; realignment, which corrects for subjects' head motion (breathing, uncontrolled movements); and coregistration with the structural data.

A GLM analysis was used to identify ROIs, with regressors built according to the sequence of mapping stimuli and convolved with the hemodynamic response function. All ROIs were defined using a “winner takes all” approach. First, an F-contrast across all 3 mapping conditions was computed to identify all the voxels activated by the stimulation of at least one mapped region (V1 mask). Then, each voxel within the V1 mask was labeled as belonging to the ROI whose mapping stimulus produced the highest t-score for that voxel. ROIs were computed with an uncorrected p-value of 0.001 or less.
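The “winner takes all” labeling can be illustrated in a few lines. The sketch below assumes that the per-voxel t-scores for the three mapping stimuli and the thresholded F-contrast mask are already available as flat Python lists. The numbers are invented for illustration, and `winner_takes_all` is a hypothetical helper, not an SPM8 function.

```python
def winner_takes_all(t_maps, mask):
    """Assign each masked voxel to the ROI with the highest t-score.

    t_maps: dict mapping ROI name -> list of t-scores, one per voxel.
    mask:   list of booleans, True where the F-contrast was significant.
    Returns a list with an ROI name per voxel, or None outside the mask.
    Ties go to the ROI listed first in t_maps.
    """
    names = list(t_maps)
    labels = []
    for voxel, inside in enumerate(mask):
        if not inside:
            labels.append(None)
            continue
        labels.append(max(names, key=lambda name: t_maps[name][voxel]))
    return labels


# Invented t-scores for four voxels, illustration only.
t_maps = {
    "linear": [4.1, 0.3, 2.0, 5.0],
    "curved": [1.2, 0.5, 3.5, 4.9],
    "control": [0.2, 0.1, 0.4, 6.2],
}
mask = [True, False, True, True]

print(winner_takes_all(t_maps, mask))
# ['linear', None, 'curved', 'control']
```

Because every masked voxel receives exactly one label, the three resulting ROIs cannot overlap, which keeps the per-ROI averages independent of one another at the voxel level.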

A second GLM analysis was carried out on the experimental data. For each subject, the beta estimates were averaged across the voxels of each mapped ROI. Finally, the mean beta estimates were averaged across all subjects.
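The two averaging steps, first over the voxels of an ROI within each subject and then over subjects, can be sketched as follows. The helper names and the example numbers are hypothetical. The standard error of the mean is included because the result figures show it as error bars.

```python
import math


def mean(values):
    return sum(values) / len(values)


def roi_mean_beta(betas, labels, roi):
    """Average one subject's voxel-wise betas over the voxels labeled `roi`."""
    selected = [b for b, label in zip(betas, labels) if label == roi]
    if not selected:
        raise ValueError(f"no voxels labeled {roi!r}")
    return mean(selected)


def group_mean_and_sem(subject_means):
    """Mean across subjects and its standard error (sample SD / sqrt(n))."""
    n = len(subject_means)
    if n < 2:
        raise ValueError("need at least two subjects")
    m = mean(subject_means)
    sd = math.sqrt(sum((x - m) ** 2 for x in subject_means) / (n - 1))
    return m, sd / math.sqrt(n)


# Invented data for one subject with four voxels, illustration only.
betas = [1.0, 2.0, 3.0, 4.0]
labels = ["curved", "linear", "curved", None]
print(roi_mean_beta(betas, labels, "curved"))  # 2.0

# Invented per-subject ROI means, illustration only.
group_mean, group_sem = group_mean_and_sem([1.0, 2.0, 3.0])
print(group_mean, round(group_sem, 3))  # 2.0 0.577
```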

For illustration purposes, the data from a representative subject were analyzed with BrainVoyager QX using a similar processing pipeline.