Building a complete representation of a visual scene requires information to be remembered across separate glances and over time1,2, but it has been suggested that visual details are forgotten soon after they are viewed3,4,5,6. Here I show that cumulative memory build-up allows the same number of objects to be recalled, irrespective of whether these were seen in a series of short, separate presentations several minutes apart, or as one continuous presentation of the same total duration. I find that the build-up of visual memory over time is much better than has been widely thought and may underlie the successful performance of real-world visual and cognitive tasks that require people to keep track of objects in the immediate environment.

Scenes were computer-generated 'rooms' containing 12 unrelated objects (Fig. 1). They were presented for a cumulative viewing duration of 1, 2, 3 or 4 seconds, either all at once (continuous-presentation trials) or as brief views of 0.25, 1 or 2 seconds in duration and separated by as many as eight other trials (re-test trials). Participants (n = 6; vision was normal or corrected to normal) did not know whether, or when, a particular scene would be re-tested, or whether a trial would contain an old or a new scene. The long intervals between re-tests (0.5–4.0 min), the presence of intervening presentations, and the random occurrence of re-tests precluded effective rehearsal.

Figure 1: Visual memory for objects in a scene.

Representative computer-generated display of 12 objects (1–2° of visual angle) in a room (10 × 8° of visual angle). The blue line shows the scanning path followed by one subject. Objects were presented for a cumulative viewing duration either as a continuous presentation or as brief separate views summing to the same duration. The number of items recalled was similar in both cases.

Memory improved with re-testing. The number of items recalled after a series of separate brief views was almost identical to the number recalled after a single continuous view of the same total duration (Fig. 2). For example, the number of items recalled after the last of four separate 1-second trials was equal to the number of items recalled after 4 seconds of continuous presentation. Memory of each scene continued to accumulate over repeated viewings as though the scene had never been out of sight.

Figure 2: Number of items recalled as a function of total viewing time by two representative participants (S.E. and B.S.S.).

Solid lines, continuous trials of 1, 2 or 4 s; dashed lines, performance after 1, 2, 3 or 4 re-test trials of 1 s each; squares, 4 trials of 250 ms each; triangles, 2 trials of 2 s each. Error bars represent standard error. Four other participants generated similar results (data not shown).
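The comparison in Fig. 2 rests on simple arithmetic: each series of brief views sums to the same cumulative viewing time as one of the continuous presentations. The following Python sketch tabulates that equivalence for the conditions listed in the legend. The condition labels and the function names (`cumulative_duration`, `group_by_total`) are illustrative assumptions rather than part of the original experimental software, and no recall data are included.

```python
from fractions import Fraction

# Presentation conditions from the Fig. 2 legend, written as
# (number of views, seconds per view). Labels are illustrative only.
CONDITIONS = {
    "continuous 1 s": (1, Fraction(1)),
    "continuous 2 s": (1, Fraction(2)),
    "continuous 4 s": (1, Fraction(4)),
    "1 x 1 s": (1, Fraction(1)),
    "2 x 1 s": (2, Fraction(1)),
    "3 x 1 s": (3, Fraction(1)),
    "4 x 1 s": (4, Fraction(1)),
    "4 x 250 ms": (4, Fraction(1, 4)),
    "2 x 2 s": (2, Fraction(2)),
}


def cumulative_duration(n_views, seconds_per_view):
    """Total viewing time, in seconds, summed over all views of one scene."""
    return n_views * seconds_per_view


def group_by_total(conditions):
    """Group condition labels by their cumulative viewing time."""
    groups = {}
    for label, (n_views, seconds_per_view) in conditions.items():
        total = cumulative_duration(n_views, seconds_per_view)
        groups.setdefault(total, []).append(label)
    return groups


if __name__ == "__main__":
    # Print the conditions that share each cumulative viewing time.
    for total, labels in sorted(group_by_total(CONDITIONS).items()):
        print(f"{float(total):.2f} s: {', '.join(labels)}")
```

Conditions that fall into the same group are the ones whose recall scores are compared directly, for example four separate 1-second views against 4 seconds of continuous presentation.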

I investigated the nature of memory accumulation by presenting new objects on previously viewed backgrounds. If the entire scene (objects plus background) is encoded in memory, then a familiar background should bring to mind the original set of objects and interfere with memory for the new set of objects7. This is what I found.

Memory was poorer for new objects seen against old backgrounds than for completely new scenes (P < 0.05; 6 subjects). When object names, such as 'apple', were presented in place of a picture of the object, the memory accumulation and the effect of repeating previously viewed backgrounds were both virtually abolished. These results suggest that a visuo-spatial representation of the whole scene was remembered across re-tests, rather than simply a verbal list of object names.

Earlier studies of visual memory produced diverse estimates of capacity4,5,6, with memory being either inferred from performance of a concurrent task or assessed using displays containing semantic cues that may have influenced encoding8,9,10 or guessing11 strategies. The experiments I describe here remove semantic cues and evaluate memory capacity by using measures of recall rather than by a secondary task.

The visual memory reported here is unusual in that it does not resemble traditional short- or long-term memory. Short-term memory was not involved because the time between re-tests of the same display exceeded the temporal limits of information summation12 and short-term memory13. Also, the presence of intervening scenes during the intervals between re-tests precluded rehearsal as a means of retaining the items in short-term storage. Traditional long-term memory was not involved either, because there was no build-up across days. The memory studied here may therefore be best described as 'medium-term' or 'disposable'.

Medium-term memory may underlie the ability to keep in mind the identity and location of objects while performing visuo-motor tasks that last for a few minutes within an unchanging visual environment2,14,15. A medium-term visual memory could be instrumental in quickly directing the eye or arm to selected objects without the continual need for expensive or time-consuming visual searching.