Main

The brain is not a video camera; it does not provide an exact replay of experienced events. Rather, memories for specific events are constructed at the time of retrieval, which makes the process of remembering prone to errors. The study of such errors can provide insights into the inner workings of memory. Particularly instructive are reality-monitoring errors, in which an event that was only imagined is remembered as if it had actually happened1. One way to raise the probability that a reality-monitoring error will occur is to increase the perceptual vividness of an imagined event. Memories with vivid perceptual information tend to be attributed to perceptual experiences, whereas memories that are perceptually vague or dim are more likely to be attributed to imagined experiences. In one experiment, for example, greater imagery of a concocted childhood event increased the likelihood that participants would later believe the event had actually happened2. In other experiments, encoding words with an emphasis on imagery increased the number of errors when people later tried to remember whether particular items were seen as words or pictures3,4.

A plausible account of the neural events responsible for these reality-monitoring errors is emerging from neuroimaging studies. First, visual imagery activates many of the same brain areas active during visual perception5,6,7, which suggests that images and percepts can have overlapping memory representations in the cerebral cortex. Particularly vivid visual imagery of an event might lead to the formation of a memory trace that is indistinguishable from the trace that would have been formed from actual perception of the event. This idea is consistent with accounts of episodic memory that claim that memory traces consist of distributed networks in cortical regions involved in initially encoding an event, and that these networks are re-activated when the memory is retrieved8,9,10.

Several studies have found differences in the neural correlates of true and false memories. These experiments used a procedure in which participants studied lists of strong semantic associates of nonpresented words termed 'critical lures'11,12. When tested in this manner, subjects falsely recall or recognize many of the critical lures as study-phase items. Differential neural activity associated with true versus false memory in this procedure has been observed with positron emission tomography (PET)13, functional magnetic resonance imaging (fMRI)14 and event-related potentials (ERPs)15,16. The most common effects were found only when true and false memories occurred in a blocked testing format, and these effects probably reflect post-retrieval monitoring processes subserved by prefrontal cortex. However, retrieval itself may not be experienced in an identical way for true and false memories, given that veridical memories tend to be rated as having more perceptual detail17,18. Yet, it is unclear which brain activity is responsible for phenomenological differences between true and false memories.

To investigate how perceptual aspects of true and false memories may differ, reality-monitoring protocols may prove more useful than protocols based on lists of strong semantic associates. The latter are effective because critical lures receive semantic activation from their associates before the test phase. Reality-monitoring errors are a more prevalent form of memory distortion in everyday life, and they arise as a consequence of the similarity between the encoding of imagined and perceived events. Accordingly, we developed a protocol for generating reality-monitoring errors, and used the ERP method to provide a moment-by-moment record of relevant neurophysiological activity at encoding and retrieval19. Furthermore, we built on our previous observation of an ERP correlate of visual imagery during memory retrieval20. Participants in this previous experiment studied sets of spoken words either with or without using visual imagery, and they later recognized words encoded with visual imagery better than those encoded without imagery. ERPs from occipital scalp locations differed for these two types of words, and these differences were interpreted as reflections of the enhanced retrieval of visual object representations for words encoded using visual imagery.

We reasoned that the same sort of occipital ERPs could provide markers of relevant retrieval events in the present experiment. Participants viewed words during a study phase (Fig. 1) and mentally generated a visual image of the object corresponding to each word. For half the words, a color photograph of the object was also shown. During a recognition test, participants listened to spoken words and, for each one, decided whether it corresponded to an object seen during the study phase. This procedure induced participants to confuse memories for imagined pictures with memories for perceived pictures. Because visual images were generated for all studied words, multiple perceptual and cognitive factors were held constant between the two study conditions (word-only and word plus picture). Using spoken words during testing was advantageous because the absence of visual processing requirements freed up resources for visual imagery20,21. We hypothesized that accurate memories for pictures would be associated with more perceptual detail than would false picture memories, and that neural activity recorded from the scalp could implicate such memory-related imagery.

Figure 1: Experimental procedure.

In the study phase, words, pictures and filler rectangles appeared for 300 ms each, at a constant rate. Four trials are shown: word plus picture, word only, word only and word plus picture. In the test phase, words were presented at a constant rate, but the duration of each spoken word varied from 240–690 ms (mean, 475 ms).

In addition, ERPs recorded during the study phase were computed according to whether or not items were later recognized. Such effects, often termed 'Dm effects' as a short-hand for ERP 'differences' based on later 'memory' performance, have been observed in numerous studies22,23,24,25,26,27,28. Typically, study-phase items that are accurately remembered elicit a more positive ERP than do forgotten items. Here we expected that vivid visual images for word-only trials in the study phase would tend to cause the corresponding items to be falsely remembered as pictures during the memory test. Accordingly, occipital ERPs were also used to index visual imagery at encoding.
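The sorting logic behind a Dm effect can be sketched as follows: study-phase epochs are grouped by the response each item later received at test, averaged within each group, and subtracted. This is a minimal illustration only. The function names, the outcome labels and the toy voltages are assumptions introduced here and are not taken from the study.

```python
from statistics import mean

def average_erp(epochs):
    """Point-by-point average of equal-length epochs (lists of microvolt samples)."""
    return [mean(samples) for samples in zip(*epochs)]

def dm_difference(study_epochs, later_outcomes, label_a, label_b):
    """Sort study-phase epochs by later test outcome and return the
    difference wave: average(label_a) minus average(label_b)."""
    group_a = [e for e, o in zip(study_epochs, later_outcomes) if o == label_a]
    group_b = [e for e, o in zip(study_epochs, later_outcomes) if o == label_b]
    avg_a, avg_b = average_erp(group_a), average_erp(group_b)
    return [a - b for a, b in zip(avg_a, avg_b)]

# Toy values for illustration only (hypothetical, not data from the study).
epochs = [[0.0, 1.0, 3.0], [0.0, 2.0, 5.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]]
outcomes = ["false_memory", "false_memory",
            "correct_rejection", "correct_rejection"]
dm = dm_difference(epochs, outcomes, "false_memory", "correct_rejection")
print(dm)
```

A positive value in the difference wave corresponds to a more positive ERP for the first category, which is the direction expected here for word-only items later falsely remembered as pictures.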

Results

Behavior

The mean response time (RT) to words in the study phase was 1133 ± 43 ms (mean ± s.e.m.), and 97% of these responses were registered before the subsequent appearance of the picture (word plus picture trials) or rectangle (word-only trials). Behavioral results from the test phase (Table 1) showed that false memories were far more frequent for word-only items than for new items; there were significantly more 'yes' responses for old word-only items than for new items (t(11) = 8.73, p < 0.001).

Table 1: Behavioral data in the memory test.

ERPs

Subjects' responses to word-only items in the study phase were classified as 'later false memories' if subjects later incorrectly remembered having seen an item as a picture, or 'later correct rejections' if subjects correctly remembered only hearing the word. The resulting ERPs (Fig. 2a) differed at the latency of a positive deflection at parietal and occipital electrode sites (Fig. 3a). To test the reliability of this effect, we measured mean amplitudes in the two conditions at midline parietal and occipital sites from 600–900 ms and compared them in a repeated-measures analysis of variance (ANOVA), with condition and electrode as factors. ERPs were significantly more positive for later false memories than for later correct rejections (F(1,10) = 7.13, p = 0.024), with differences averaging 0.91 μV (occipital) and 1.16 μV (parietal). The condition by electrode interaction was not significant.
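The mean-amplitude measurement described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the function names are hypothetical, the window is treated as including its start and excluding its end, and the voltages are toy values. The sketch also uses a general property of repeated-measures designs: when a factor has only two levels, its main-effect F with (1, n − 1) degrees of freedom equals the square of the paired t computed on each participant's values averaged over the other factor.

```python
import math
from statistics import mean, stdev

def mean_amplitude(erp, times_ms, start_ms, end_ms):
    """Mean voltage over samples with start_ms <= t < end_ms
    (the endpoint convention is an assumption)."""
    window = [v for v, t in zip(erp, times_ms) if start_ms <= t < end_ms]
    return mean(window)

def paired_t(cond_a, cond_b):
    """Paired t statistic on per-participant values (df = n - 1)."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Toy ERP: 1.0 microvolt inside the 600-900 ms window, 0.0 elsewhere.
times_ms = list(range(-100, 1940, 8))  # 8-ms steps, i.e. 125 samples per second
erp = [1.0 if 600 <= t < 900 else 0.0 for t in times_ms]
window_mean = mean_amplitude(erp, times_ms, 600, 900)

# Toy per-participant mean amplitudes (hypothetical, three participants).
later_false_memory = [2.0, 3.0, 4.0]
later_correct_rejection = [1.0, 1.0, 1.0]
t_value = paired_t(later_false_memory, later_correct_rejection)
f_value = t_value ** 2  # main-effect F(1, n - 1) for a two-level factor
print(window_mean, round(t_value, 3), round(f_value, 3))
```

Read in the other direction, the reported F(1,10) = 7.13 corresponds to a paired t of about 2.67 on ten degrees of freedom.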

Figure 2: ERPs for the two main comparisons.

Voltage is in microvolts, as a function of time in milliseconds (time 0, word onset). Electrode locations are the midline sites Fpz, Fz, Cz, Pz and Oz. Arrows, significant differences from primary analyses. (a) Study-phase ERPs in response to word-only items, averaged as a function of whether items were later falsely remembered as pictures or later correctly rejected (mean number of trials after artifact rejection, 91 and 39, respectively; n = 11). (b) Test-phase ERPs in response to accurate picture memories and false memories, matched for response time (mean number of trials after artifact rejection, 34 and 33, respectively; n = 12).

Figure 3: Topographic maps of the primary study-phase and test-phase effects.

Statistical comparisons of mean ERP amplitudes at each electrode site were used to generate t-maps with spherical spline interpolation on a schematic view of the head from above. (a) Study-phase topography showing t-values from the comparison of later false memories to later correct rejections from 600–900 ms. Positive t-values > 2.23 (uncorrected) indicate regions where ERPs in response to later false memories were significantly more positive than ERPs in response to later correct rejections. (b) Test-phase topography showing t-values from the comparison of accurate picture memories to false memories from 900–1200 ms, RT-matched. Positive t-values > 2.20 (uncorrected) indicate regions where ERPs to accurate picture memories were significantly more positive than ERPs to false memories.
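The two thresholds in this legend appear to correspond to two-tailed critical values of Student's t at an alpha level of 0.05, with 10 degrees of freedom for the study-phase map (n = 11) and 11 degrees of freedom for the test-phase map (n = 12). That reading is an inference from the reported sample sizes, not a statement in the text. It can be checked numerically with the sketch below, which integrates the t density by Simpson's rule and finds the critical value by bisection. The function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def area_zero_to(x, df, steps=2000):
    """P(0 <= T <= x), computed with Simpson's rule (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return total * h / 3

def critical_t(df, alpha=0.05):
    """Two-tailed critical value t* such that P(|T| > t*) = alpha."""
    target = 0.5 - alpha / 2  # area between 0 and t*
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection on a monotonically increasing area
        mid = (lo + hi) / 2
        if area_zero_to(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

crit_study = critical_t(10)  # n = 11 participants
crit_test = critical_t(11)   # n = 12 participants
print(round(crit_study, 3), round(crit_test, 3))
```

The computed values agree with standard t tables (about 2.228 and 2.201), which round to the 2.23 and 2.20 given in the legend.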

For word plus picture study-phase trials, ERPs tended to be more positive for items remembered later (1.15 μV, occipital; 1.01 μV, parietal), although these differences were not statistically significant. In contrast, ERPs in response to pictures were significantly more positive for pictures that were remembered (1.44 μV, occipital; 2.38 μV, parietal; F(1,10) = 17.61, p = 0.002). This pattern of results is understandable given that the memory test was for the pictures, not the words.

For the test phase, we compared ERPs in response to accurate picture memories to ERPs in response to false memories matched for response time (Methods; Fig. 2b). Accurate picture memories showed an enhanced positivity relative to the response for false memories, beginning late in the recording interval at posterior electrode sites (Fig. 3b). To quantify this effect, mean amplitude measurements were taken over two time intervals, 900–1200 ms and 1200–1500 ms, at midline parietal and occipital electrode sites. ANOVA yielded only a significant main effect of condition for the 900–1200 ms interval (F(1,11) = 4.79, p = 0.05), indicating that accurate-memory ERPs were more positive than false-memory ERPs.

We also analyzed results from all lateral electrode sites for four consecutive time intervals (Table 2). Study-phase results confirmed a greater posterior positivity for later false memories than for later correct rejections (word-only trials). Test-phase results likewise confirmed the enhanced posterior positivity at 900–1200 ms for accurate picture memories. A significant ERP difference between accurate and false memories was also observed at central locations from 300–600 ms, but only in the RT-matched data.

Table 2: Mean amplitude differences between conditions at lateral electrodes (μV).

Additional comparisons between test-phase conditions showed that occipital responses for accurate picture memories exhibited the largest positive amplitudes (Fig. 4). In contrast, ERP amplitudes from 600–1200 ms were significantly smaller than for accurate picture memories in two conditions in which corresponding pictures had not been seen: 'new, correct rejections' (when subjects correctly responded that they had not seen a picture of a new item; t(11) = 2.89, p = 0.015) and 'old word-only, correct rejections' (when subjects correctly responded for an item presented in the study phase as a word only; t(11) = 3.86, p = 0.003). ERPs for false memories were also smaller than those for accurate picture memories (t(11) = 2.86, p = 0.015). A similar difference between accurate picture memories and 'forgotten pictures' (when subjects responded that they had not seen a picture of a word plus picture item) was not significant (t = 1.48), as fewer trials were available for forgotten pictures, on average. Too few trials were available to analyze results for 'new, false alarms' (when subjects responded that they had seen a picture of a new item). No other pairwise comparison was significant.

Figure 4: Midline occipital ERPs for multiple test-phase response categories.

All trials (not just RT-matched trials) were included for accurate picture memories. Mean ERP amplitudes from 600–1200 ms were 2.05 μV (accurate picture memories), 0.85 μV (new, correct rejections), 0.53 μV (old word-only, correct rejections), 1.03 μV (false memories) and 1.09 μV (forgotten pictures).

Discussion

Our experimental procedure led people to occasionally misattribute their memory of an imagined object to a memory of actually viewing a picture of that object. A corresponding electrophysiological difference was found in the study phase; posterior ERPs on word-only trials were more positive for those later triggering a false memory. Also, test-phase ERPs included an enhanced posterior positivity for accurate picture memories relative to false memories. We were thus able to use brain potentials to study neural processes related to the occurrence of false memories, both at encoding and at retrieval.

In the test phase, we compared trials in which participants accurately remembered seeing an object earlier to trials in which participants erroneously claimed to have seen an object that they had only imagined earlier. Despite identical button-press responses in the two conditions, recollective experience may not have been the same. Indeed, neural responses were not the same. Specifically, we interpret late posterior ERP differences as a reflection of the difference in the amount of perceptual detail accompanying true and false memories. Subjective differences between true and false memories have been found in behavioral experiments in which participants reported experiencing more perceptual detail for true compared to false memories, despite responding in the same way by making an 'old' response in a recognition test17,18. In our experiment, late posterior ERPs may reflect the retrieval of detailed visual information, which was greater for true memories because of the nature of the visual information stored at encoding. In addition, posterior ERPs for accurate picture memories were more positive than in other control conditions in which pictures had not been seen. Our results are thus consistent with the notion that the relative amount of perceptual detail associated with memory retrieval can be indexed by posterior ERPs.

This interpretation builds on our earlier findings of ERPs associated with visual imagery20. Furthermore, neuroimaging investigations of memory-related visual imagery reported activation of a medial parietal lobe area called the precuneus29,30. Precuneus activations are proposed to reflect the reactivation of stored visual engrams, or to reflect the activity of a visual buffer, where retrieved visual information is maintained for processing after retrieval29,30,31,32,33. Although our present results do not allow us to ascribe ERP differences to activity in specific brain regions, our chief conclusions do not depend on doing so. However, we speculate that the positivity at posterior regions may be a reflection of greater precuneus activation for accurate picture memories than for false memories, because more visual detail is available for accurate memories.

We also observed a difference in study-phase ERPs between word-only items later falsely remembered as pictures and those later correctly rejected. This finding indicates that neural processes engaged at encoding predicted whether or not a particular item would later be falsely remembered. The effect mirrors many earlier observations in which items later remembered elicited a more positive ERP at encoding than did items later forgotten. Accounts of these Dm effects have emphasized they are not unitary, but rather encompass a collection of effects that can vary with the particular tasks, stimuli and strategies used27,28. In the current experiment, ERPs that varied with later memory performance were recorded while participants were generating visual images in response to names of objects, and these ERP differences were largest at parietal and occipital scalp locations. Given that positive amplitudes of late posterior ERPs in similar experiments were enhanced by visual imagery20,34, the current finding of larger ERPs for later false memories can be interpreted as more robust or vivid visual imagery for those items. These results thus support speculations that visual imagery has a role in the creation of false memories2 and reality-monitoring errors3. In the present experiment, especially vivid visual imagery during the study phase, as indexed by posterior ERPs, increased the likelihood that participants would later incorrectly claim that they had seen a picture of the imagined object.

The two principal electrophysiological findings in this experiment provided complementary evidence concerning the involvement of visual imagery in false memories. First, posterior ERPs in response to words at encoding were more positive for items erroneously remembered in the test phase as pictures. Because such potentials also vary with direct manipulations of imagery20,34, we conclude that visual imagery was more vivid for those word-only study items that were later falsely remembered. Second, posterior ERPs associated with true and false memories differed late in the recording epoch, which we take as a reflection of greater perceptual detail during accurate memories. In other words, strong visual imagery at encoding can promote a subsequent reality-monitoring error, but the imagery retrieved for those items during a recognition test is weaker than imagery associated with veridical visual memories.

Methods

From the Northwestern University community, we recruited right-handed, native English speakers, 18–26 years old, with no history of epilepsy or other neurological disease and no recent use of psychoactive medications. Data from 13 participants were excluded from analyses due to an insufficient number of artifact-free trials (< 25) in the critical experimental conditions. Accordingly, data from 12 participants (3 men and 9 women) were used for all analyses except for the study-phase ERP analysis, for which data from one additional participant were excluded. Experimental procedures were approved by the Northwestern University Institutional Review Board.

The stimuli were 525 spoken words, 350 printed words and 350 color photographs. The words were concrete nouns from an online psycholinguistic database35. Spoken words were recorded digitally at a sampling rate of 22 kHz.

After electrodes were attached (see below), each participant was led to a sound-attenuating chamber and seated in a chair about 140 cm from a computer monitor and pair of speakers. To reduce ERP artifacts, participants were instructed to relax muscles, to blink as infrequently as possible during experimental runs, and to minimize body and eye movement. Once acceptable bioelectric signals were observed, participants received instructions for the experimental tasks.

The experiment consisted of two parts, a study phase and a test phase, each with seven runs. Participants were told that we were recording brain responses during visual imagery, but were not told that there would be a memory test. Responses were registered with two buttons, one held in each hand.

In the study phase, words were presented visually, and participants were required to visualize the referent of each word and decide whether the object was larger or smaller than the computer monitor in front of them. For half of the words, a picture of the object was presented after the word; for the other half, a blank rectangle was presented instead. The assignment of words to the picture condition was counterbalanced across participants, and these 'word plus picture' trials were randomly intermixed with 'word-only' trials (Fig. 1). Participants were told to focus on the imagery task for the words and that no response was required for the pictures or rectangles. Fifty words were presented in each of the seven runs.

The test phase began immediately after the study phase. In each test run, participants heard 75 spoken words. For each word, participants were told to respond with one hand if a picture of that object had been seen in the study phase and with the other hand if a picture had not been seen (that is, unstudied words or words presented without a picture). The assignment of left or right hand to each response was counterbalanced across participants. The word list was composed of 175 words that had been seen as words and pictures, 175 words that had been seen as words only and 175 unstudied words, all randomly intermixed (Fig. 1).
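The trial counts given in the Methods are mutually consistent, as the short arithmetic check below shows. The variable names are illustrative; every number comes from the text.

```python
RUNS = 7                  # runs per phase
STUDY_WORDS_PER_RUN = 50  # printed words per study run
TEST_WORDS_PER_RUN = 75   # spoken words per test run

study_total = RUNS * STUDY_WORDS_PER_RUN       # printed words studied
word_plus_picture = study_total // 2           # half followed by a picture
word_only = study_total - word_plus_picture    # half followed by a rectangle
test_total = RUNS * TEST_WORDS_PER_RUN         # spoken words at test
unstudied = test_total - study_total           # new items at test

print(study_total, word_plus_picture, word_only, test_total, unstudied)
```

The totals match the stimulus set of 525 spoken words and 350 printed words. A stock of 350 photographs, although each participant saw only 175, is what counterbalancing requires: every studied word needs a picture available for the participants who receive it as a word plus picture item.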

The electroencephalogram (EEG) was recorded using tin electrodes embedded in an elastic cap. The following 21 scalp electrodes from the International 10–20 system were used: Fpz, Fz, Cz, Pz, Oz, Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5 and T6. Recordings were referenced initially to left mastoid and digitally re-referenced offline to the average of left and right mastoid. The electro-oculogram (EOG) was recorded using an electrode placed beneath the right eye referenced to left mastoid, and electrodes placed lateral to each eye referenced to one another. Signals were amplified with a 0.1–100 Hz bandpass filter and digitized at a rate of 250 samples per second continuously. For averaging, data were resampled at a rate of 125 samples per second with a 50-Hz low-pass filter. Each trial consisted of a 1940-ms epoch preceded by a 100-ms prestimulus baseline. Data were examined for EOG and muscle artifacts, and trials with such artifacts were removed before averaging. (An average of 28.6% of the trials were rejected.) Hypothesis-driven ERP analyses focused on midline occipital and parietal electrode sites (alpha level, 0.05), based on ERP findings from a related protocol20 and on our hypotheses concerning the effects of visual imagery.
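Two of the preprocessing steps above, offline re-referencing and baseline correction, can be sketched as follows. The sketch assumes that the right mastoid was recorded as a channel referenced to the left mastoid, in which case subtracting half of that channel from each scalp channel yields data referenced to the average of the two mastoids. It also assumes that the baseline is the mean of all prestimulus samples. Function names and voltages are illustrative and not taken from the study.

```python
from statistics import mean

def rereference_to_average_mastoids(channel, right_mastoid):
    """Both inputs are referenced to the left mastoid (LM).
    V - (LM + RM)/2 == (V - LM) - (RM - LM)/2."""
    return [v - rm / 2 for v, rm in zip(channel, right_mastoid)]

def baseline_correct(epoch, times_ms):
    """Subtract the mean of the prestimulus samples (t < 0) from every sample."""
    baseline = mean([v for v, t in zip(epoch, times_ms) if t < 0])
    return [v - baseline for v in epoch]

# Epoch length after resampling: 100 ms baseline + 1940 ms at 125 samples/s.
samples_per_epoch = round((100 + 1940) * 125 / 1000)

# Toy values for illustration only (hypothetical).
channel = [4.0, 4.0, 6.0]        # microvolts, referenced to left mastoid
right_mastoid = [2.0, 2.0, 2.0]  # microvolts, referenced to left mastoid
times_ms = [-8, -4, 0]
rereferenced = rereference_to_average_mastoids(channel, right_mastoid)
corrected = baseline_correct(rereferenced, times_ms)
print(samples_per_epoch, rereferenced, corrected)
```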

Before the analyses, we implemented an RT-based matching procedure on the test-phase ERP data. The mean RT was 146 ± 22 ms slower for false memories ('yes' responses to word-only items) than for accurate picture memories; this difference was significant (t(11) = 6.75, p < 0.001). This RT difference is potentially problematic because it could confound the central ERP comparison. To circumvent this problem, the RT distributions of the two conditions were closely matched by selecting an accurate picture memory trial for each false memory trial based on RT, a procedure that was possible because the original RT distributions overlapped considerably (Fig. 5). The resulting ERPs were thus based on an equivalent number of trials and an equivalent RT distribution for the two conditions.
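One way to implement such trial-by-trial matching is a greedy nearest-neighbor selection without replacement, sketched below. The text does not specify the selection rule, so the greedy strategy, the tie-breaking by trial index and the function name are all assumptions; the RTs are toy values. In practice the matching would presumably be run separately for each participant.

```python
def match_by_rt(false_memory_rts, accurate_rts):
    """For each false-memory trial, select the unused accurate-memory trial
    with the closest RT. Returns (false_index, accurate_index) pairs."""
    available = set(range(len(accurate_rts)))
    pairs = []
    for i, rt in enumerate(false_memory_rts):
        if not available:
            break  # no accurate-memory trials left to assign
        # Closest RT wins; ties are broken by the lower trial index.
        j = min(available, key=lambda k: (abs(accurate_rts[k] - rt), k))
        available.remove(j)
        pairs.append((i, j))
    return pairs

# Toy RTs in milliseconds (hypothetical, not data from the study).
false_memory_rts = [1500, 1300]
accurate_rts = [1200, 1320, 1480, 1900]
pairs = match_by_rt(false_memory_rts, accurate_rts)
matched_accurate = [accurate_rts[j] for _, j in pairs]
print(pairs, matched_accurate)
```

Each false-memory trial contributes exactly one accurate-memory trial, so the two averaged ERPs rest on equal trial counts and closely similar RT distributions. Accurate-memory trials that are not selected are left out of the matched comparison.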

Figure 5: Response time distributions for accurate picture memory responses and false memory responses before matching.

Histogram of response times from all 12 participants in the experiment.