Abstract
Much of what we remember is not the result of intentional selection but simply a by-product of perceiving. This raises a foundational question about the architecture of the mind: how does perception interface with and influence memory? Here, inspired by a classic proposal relating perceptual processing to memory durability, the levels-of-processing theory, we present a sparse coding model for compressing feature embeddings of images and show that the reconstruction residuals from this model predict how well images are encoded into memory. In an open memorability dataset of scene images, we show that reconstruction error explains not only memory accuracy but also response latencies during retrieval, in the latter case subsuming all of the variance explained by powerful vision-only models. We also confirm a prediction of this account with ‘model-driven psychophysics’. This work establishes reconstruction error as an important signal interfacing perception and memory, possibly through adaptive modulation of perceptual processing.
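As a rough illustration of the modelling idea summarized above (and not the authors' released pipeline, which is available in the ReconMem repository linked below), the following Python sketch learns a sparse dictionary over image feature embeddings and correlates each image's reconstruction residual with its memorability score. The synthetic embeddings, hypothetical memorability scores, dictionary size and sparsity penalty are all illustrative assumptions; in the paper, the compressed embeddings are deep-network features of scene images.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Stand-ins for deep feature embeddings of scene images (in the paper these
# come from a vision network) and hypothetical per-image memorability scores.
n_images, n_features = 500, 128
X = rng.standard_normal((n_images, n_features))
memorability = rng.uniform(0.0, 1.0, size=n_images)

# Learn a sparse dictionary over the embeddings; the dictionary size and
# sparsity penalty (alpha) are illustrative choices, not the paper's settings.
coder = DictionaryLearning(n_components=64, alpha=1.0,
                           transform_algorithm="lasso_lars",
                           max_iter=200, random_state=0)
codes = coder.fit_transform(X)          # sparse codes: (n_images, n_components)
X_hat = codes @ coder.components_       # reconstructions from the sparse codes

# Per-image reconstruction error: the putative memory-relevant signal.
recon_error = np.linalg.norm(X - X_hat, axis=1)

# Does reconstruction error track memorability? With real embeddings and
# behavioural scores, the paper reports that harder-to-reconstruct images
# are better remembered; random data here will yield a null correlation.
rho, p = spearmanr(recon_error, memorability)
print(f"Spearman rho = {rho:.3f}, p = {p:.3g}")
```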
Data availability
Data used in Studies 1 and 2 are from a publicly available dataset from Isola et al.12 (https://web.mit.edu/phillipi/Public/MemorabilityPAMI/index.html). De-identified data collected for Study 3 have been deposited on GitHub (https://github.com/CNCLgithub/ReconMem)68.
Code availability
Code has been deposited on GitHub (https://github.com/CNCLgithub/ReconMem)68.
References
Wagner, A. D. et al. Building memories: remembering and forgetting of verbal experiences as predicted by brain activity. Science 281, 1188–1191 (1998).
Xue, G. The neural representations underlying human episodic memory. Trends Cogn. Sci. 22, 544–561 (2018).
Craik, F. I. & Lockhart, R. S. Levels of processing: a framework for memory research. J. Verbal Learning Verbal Behav. 11, 671–684 (1972).
Schurgin, M. W., Wixted, J. T. & Brady, T. F. Psychophysical scaling reveals a unified theory of visual memory strength. Nat. Hum. Behav. 4, 1156–1172 (2020).
Chun, M. M. & Johnson, M. K. Memory: enduring traces of perceptual and reflective attention. Neuron 72, 520–535 (2011).
Kurby, C. A. & Zacks, J. M. Segmentation in the perception and memory of events. Trends Cogn. Sci. 12, 72–79 (2008).
Favila, S. E., Lee, H. & Kuhl, B. A. Transforming the concept of memory reactivation. Trends Neurosci. 43, 939–950 (2020).
Liu, J. et al. Transformative neural representations support long-term episodic memory. Sci. Adv. 7, eabg9715 (2021).
Libby, A. & Buschman, T. J. Rotational dynamics reduce interference between sensory and memory representations. Nat. Neurosci. 24, 715–726 (2021).
Serences, J. T. Neural mechanisms of information storage in visual short-term memory. Vision Res. 128, 53–67 (2016).
Xu, Y. Reevaluating the sensory account of visual working memory storage. Trends Cogn. Sci. 21, 794–815 (2017).
Isola, P., Xiao, J., Parikh, D., Torralba, A. & Oliva, A. What makes a photograph memorable? IEEE Trans. Pattern Anal. Mach. Intell. 36, 1469–1482 (2014).
Bainbridge, W. A., Isola, P. & Oliva, A. The intrinsic memorability of face photographs. J. Exp. Psychol. Gen. 142, 1323–1334 (2013).
Jaegle, A. et al. Population response magnitude variation in inferotemporal cortex predicts image memorability. eLife 8, e47596 (2019).
Khosla, A., Raju, A. S., Torralba, A. & Oliva, A. Understanding and predicting image memorability at a large scale. In Proc. IEEE International Conference on Computer Vision, 2390–2398 (2015).
Lin, Q., Yousif, S. R., Scholl, B. & Chun, M. M. Image memorability is driven by visual and conceptual distinctiveness. J. Vis. 19, 290c (2019).
Kramer, M. A., Hebart, M. N., Baker, C. I. & Bainbridge, W. A. The features underlying the memorability of objects. Sci. Adv. 9, eadd2981 (2023).
Baddeley, A. D. The trouble with levels: a reexamination of Craik and Lockhart’s framework for memory research. Psychol. Rev. 85, 139–152 (1978).
Treisman, A. in Levels of Processing in Human Memory (eds Cermak, L. S. & Craik, F. I. M.) 301–330 (Psychology Press, 2014).
Craik, F. I. Remembering: an activity of mind and brain. Annu. Rev. Psychol. 71, 1–24 (2020).
Cermak, L. S. & Craik, F. I. M. Levels of Processing in Human Memory (Psychology Press, 2014).
Bainbridge, W. A. The resiliency of image memorability: a predictor of memory separate from attention and priming. Neuropsychologia 141, 107408 (2020).
Bates, C. J. & Jacobs, R. A. Efficient data compression in perception and perceptual memory. Psychol. Rev. 127, 891–917 (2020).
Schacter, D. L. Adaptive constructive processes and the future of memory. Am. Psychol. 67, 603–613 (2012).
Hemmer, P. & Steyvers, M. A Bayesian account of reconstructive memory. Top. Cogn. Sci. 1, 189–202 (2009).
Olshausen, B. A. & Field, D. J. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996).
Olshausen, B. A. & Field, D. J. Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Res. 37, 3311–3325 (1997).
Benna, M. K. & Fusi, S. Place cells may simply be memory cells: memory compression leads to spatial tuning and history dependence. Proc. Natl Acad. Sci. USA 118, e2018422118 (2021).
Lewicki, M. S. Efficient coding of natural sounds. Nat. Neurosci. 5, 356–363 (2002).
Zemel, R. & Hinton, G. E. Developing population codes by minimizing description length. Adv. Neural Info. Process. Syst. 6, 11–18 (1993).
Rozell, C. J., Johnson, D. H., Baraniuk, R. G. & Olshausen, B. A. Sparse coding via thresholding and local competition in neural circuits. Neural Comput. 20, 2526–2563 (2008).
Rumelhart, D., Hinton, G. & Williams, R. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
Gregor, K. & LeCun, Y. Learning fast approximations of sparse coding. In Proc. 27th International Conference on Machine Learning, 399–406 (2010).
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A. & Torralba, A. Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40, 1452–1464 (2017).
Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proc. 3rd International Conference on Learning Representations 1–14 (ICLR, 2015).
Berger, T. Rate Distortion Theory: A Mathematical Basis for Data Compression (Prentice-Hall, 1971).
Cover, T. M. & Thomas, J. A. Elements of Information Theory (Wiley, 1991).
MacKay, D. J. Information Theory, Inference, and Learning Algorithms (Cambridge Univ. Press, 2003).
Hinton, G. E., Dayan, P., Frey, B. J. & Neal, R. M. The ‘wake–sleep’ algorithm for unsupervised neural networks. Science 268, 1158–1161 (1995).
Kahana, M. & Loftus, G. in The Nature of Cognition (ed. Sternberg, R. J.) 322–384 (MIT Press, 1999).
Bylinskii, Z., Isola, P., Bainbridge, C., Torralba, A. & Oliva, A. Intrinsic and extrinsic effects on image memorability. Vision Res. 116, 165–178 (2015).
Vincent, A., Craik, F. I. & Furedy, J. J. Relations among memory performance, mental workload and cardiovascular responses. Int. J. Psychophysiol. 23, 181–198 (1996).
Ragland, J. D. et al. Levels-of-processing effect on word recognition in schizophrenia. Biol. Psychiatry 54, 1154–1161 (2003).
Broers, N., Potter, M. C. & Nieuwenstein, M. R. Enhanced recognition of memorable pictures in ultra-fast RSVP. Psychon. Bull. Rev. 25, 1080–1086 (2018).
Craik, F. I. Levels of processing: past, present… and future? Memory 10, 305–318 (2002).
Friston, K. & Kiebel, S. Predictive coding under the free-energy principle. Philos. Trans. R. Soc. B 364, 1211–1221 (2009).
Rosenbaum, R. On the relationship between predictive coding and backpropagation. PLoS ONE 17, e0266102 (2022).
Barrow, H. G. & Tenenbaum, J. M. in Computer Vision Systems (eds Hanson, A. & Riseman, E. M.) 3–26 (Academic Press, 1978).
Olshausen, B. A., Mangun, G. & Gazzaniga, M. Perception as an Inference Problem (MIT Press, 2014).
Yuille, A. & Kersten, D. Vision as Bayesian inference: analysis by synthesis? Trends Cogn. Sci. 10, 301–308 (2006).
Mumford, D. in Large-Scale Neuronal Theories of the Brain (eds Koch, C. & Davis, J. L.) 125–152 (MIT Press, 1994).
Brewer, J. B., Zhao, Z., Desmond, J. E., Glover, G. H. & Gabrieli, J. D. Making memories: brain activity that predicts how well visual experience will be remembered. Science 281, 1185–1187 (1998).
Paller, K. A. & Wagner, A. D. Observing the transformation of experience into memory. Trends Cogn. Sci. 6, 93–102 (2002).
Kim, H. Neural activity that predicts subsequent memory and forgetting: a meta-analysis of 74 fMRI studies. Neuroimage 54, 2446–2461 (2011).
Xue, G. et al. Greater neural pattern similarity across repetitions is associated with better memory. Science 330, 97–101 (2010).
Ward, E. J., Chun, M. M. & Kuhl, B. A. Repetition suppression and multi-voxel pattern similarity differentially track implicit and explicit visual memory. J. Neurosci. 33, 14749–14757 (2013).
Voss, J. L., Bridge, D. J., Cohen, N. J. & Walker, J. A. A closer look at the hippocampus and memory. Trends Cogn. Sci. 21, 577–588 (2017).
Ryan, J. D., Shen, K. & Liu, Z.-X. The intersection between the oculomotor and hippocampal memory systems: empirical developments and clinical implications. Ann. N Y Acad. Sci. 1464, 115–141 (2020).
Kragel, J. E. & Voss, J. L. Looking for the neural basis of memory. Trends Cogn. Sci. 26, 53–65 (2022).
Lyu, M. et al. Overt attentional correlates of memorability of scene images and their relationships to scene semantics. J. Vis. 20, 1–17 (2020).
Cohendet, R., Demarty, C.-H., Duong, N. Q. & Engilberge, M. VideoMem: constructing, analyzing, predicting short-term and long-term video memorability. In Proc. IEEE/CVF International Conference on Computer Vision, 2531–2540 (2019).
Xu, Q., Fang, F., Molino, A., Subbaraju, V. & Lim, J.-H. Predicting event memorability from contextual visual semantics. Adv. Neural Info. Process. Syst. 34, 22431–22442 (2021).
Lau, M. C., Goh, W. D. & Yap, M. J. An item-level analysis of lexical-semantic effects in free recall and recognition memory using the megastudy approach. Q. J. Exp. Psychol. (Hove) 71, 2207–2222 (2018).
Majumdar, A. et al. Where are we in the search for an artificial visual cortex for embodied intelligence? Adv. Neural Info. Process. Syst. 36, 1–23 (2024).
Radford, A. et al. Language models are unsupervised multitask learners. OpenAI Blog 1, 9 (2019).
Stahl, A. E. & Feigenson, L. Observing the unexpected enhances infants’ learning and exploration. Science 348, 91–94 (2015).
Chollet, F. et al. Keras. https://keras.io (2015).
Lin, Q., Li, Z., Lafferty, J. & Yildirim, I. From seeing to remembering: Images with harder-to-reconstruct representations leave stronger memory traces. GitHub https://github.com/CNCLgithub/ReconMem (2023).
Acknowledgements
This project was funded by an Air Force Office of Scientific Research (AFOSR) award #FA9550-22-1-0041 (to I.Y.). The funder had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank the Yale Center for Research Computing for maintaining HPC resources for computation. We also thank R. Jacobs, B. Scholl and members of the Yale Cognitive & Neural Computation Lab for comments on an earlier version of this manuscript.
Author information
Authors and Affiliations
Contributions
Q.L., J.L. and I.Y. conceived the study. Q.L., Z.L., J.L. and I.Y. developed the methodology. Q.L. and Z.L. developed the software. Q.L. collected the data. Q.L. and Z.L. formally analysed the data. Q.L., Z.L. and I.Y. wrote the original draft. Q.L., Z.L., J.L. and I.Y. wrote and edited the manuscript. Q.L. visualized the data. J.L. and I.Y. supervised the study. I.Y. acquired the funding.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Human Behaviour thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, Q., Li, Z., Lafferty, J. et al. Images with harder-to-reconstruct visual representations leave stronger memory traces. Nat Hum Behav 8, 1309–1320 (2024). https://doi.org/10.1038/s41562-024-01870-3