Re-expression of CA1 and entorhinal activity patterns preserves temporal context memory at long timescales

Converging, cross-species evidence indicates that memory for time is supported by hippocampal area CA1 and entorhinal cortex. However, limited evidence characterizes how these regions preserve temporal memories over long timescales (e.g., months). At long timescales, memoranda may be encountered in multiple temporal contexts, potentially creating interference. Here, using 7T fMRI, we measured CA1 and entorhinal activity patterns as human participants viewed thousands of natural scene images distributed, and repeated, across many months. We show that memory for an image’s original temporal context was predicted by the degree to which CA1/entorhinal activity patterns from the first encounter with an image were re-expressed during re-encounters occurring minutes to months later. Critically, temporal memory signals were dissociable from predictors of recognition confidence, which were carried by distinct medial temporal lobe expressions. These findings suggest that CA1 and entorhinal cortex preserve temporal memories across long timescales by coding for and reinstating temporal context information.


INTRODUCTION
Episodic memory fundamentally involves the ability to remember not only what happened in the past, but when it happened [1]. Indeed, placing memories in time critically enables experiences to be organized into personal narratives that span weeks, months, and years [2]. Yet, the majority of cognitive neuroscience studies of human memory only consider memory across relatively short timescales (overwhelmingly, within a single experimental session/day). At longer timescales, one of the particular challenges to retaining precise temporal memories is that previously-encoded information is likely to be 're-encountered' in new temporal contexts [3]. For example, remembering precisely when you first saw a particular movie may be complicated by re-watching that movie at a later date. Understanding how memories of specific temporal contexts are preserved when experiences are repeated over long timescales (days, weeks, months) requires identifying not only the neural structures that are involved, but the mechanistic contributions that these structures support.

Broadly, the medial temporal lobe (MTL) system is known to critically support episodic memory [4-6]. However, within the MTL system, hippocampal subfield CA1 and entorhinal cortex (ERC) have emerged as being particularly important for processing and remembering temporal information [7-11]. For example, so-called "time cells" in CA1 and ERC have been shown to code for elapsed time in rodents [12-15], with similar effects recently observed in the human hippocampus and ERC [16,17]. Putatively, time cells in CA1 and ERC provide the basis for temporal context representations that allow individual memories to be 'placed' in time [18].
While human fMRI studies have provided important evidence that activation levels in the hippocampus and ERC are associated with the precision of temporal memory [19,20], measures of activation, alone, are not well suited to measuring temporal context representations. Rather, temporal context is thought to be reflected in distributed patterns of activity or ensemble representations [21,22].

Importantly, to the extent that CA1 and ERC do code for the temporal context in which events occur, there are multiple, and mechanistically distinct, ways in which these representations might preserve temporal memories. On the one hand, when a given stimulus is re-encountered in a new temporal context, CA1 and/or ERC may encode the new temporal context as distinct from the original context [23]. Forming distinct temporal context representations across repeated encounters is potentially beneficial to temporal memory by improving discriminability of these contexts [24]. On the other hand, when a stimulus is re-encountered in a new temporal context, this potentially creates an opportunity to reinstate a prior temporal context [25,26]. For example, when a familiar movie is on television, this might trigger recall of the original temporal context in which the movie was encountered. Reinstatement of the original temporal context may strengthen that context representation and thereby preserve memory for when the movie was first encountered. Critically, and in contrast to a context distinctiveness account, a context reinstatement account makes the prediction that, when a stimulus is re-encountered, memory for the original temporal context will be preserved to the extent that activity patterns in CA1 and/or ERC are similar to (or reinstate) the activity patterns expressed when the stimulus was first encountered.
Here we sought to characterize the neural mechanisms that preserve temporal context memory when events are re-encountered across long timescales (days to months). To address this, we describe a massive human fMRI experiment in which participants encountered thousands of natural scene images repeatedly during 30-40 scan sessions distributed over an 8-10 month window [27]. After all scans were completed, participants performed a temporal memory task in which a subset of images was presented and participants were asked to estimate when each image was first encountered (on a scale that ranged from days to months in the past). The focus of our analyses was to test whether temporal memory precision was predicted by the degree to which patterns of neural activity expressed when images were first encountered were re-expressed when these images were re-encountered (a potential marker of context reinstatement). By leveraging the ultra-high field strength (7T) and high spatial resolution (1.8-mm) of our imaging protocol, we interrogated subregions of the hippocampus (including CA1) and surrounding MTL structures (including ERC). This experimental design yielded an unprecedented ability to understand how temporally-precise memories are preserved over long timescales that are critical for real-world memories.

RESULTS
Precise temporal memory persists across months
Eight participants completed two experimental phases (Fig. 1a). The first phase consisted of a continuous recognition task conducted during fMRI scanning. The second phase consisted of a final memory test conducted outside of the scanner. During the continuous recognition phase, participants viewed 9,209-10,000 natural scene images across 30-40 fMRI sessions and indicated whether or not each image had previously been encountered at any point in the experiment (Fig. 1b). Each image was presented up to three times with these exposures pseudo-randomly distributed across the entire experiment (Fig. 1d). At least two days after completion of the last session of the continuous recognition phase, participants completed a final memory test on a subset of images (Fig. 1c). Each trial of the final memory test began with a recognition memory judgment on a 1-6 confidence scale (1: 'high confidence new', 6: 'high confidence old'). For images judged to be 'old', participants were also prompted to make frequency and temporal memory judgments. For the frequency judgment, participants were asked how many times they had seen the image during the continuous recognition phase (1, 2, 3, or 4 or more). For the temporal memory judgment, which is the primary focus of the present study, participants were …

(c) Final memory test. Each trial of the final memory test began with a recognition memory judgment in which participants made a recognition decision together with a confidence rating from 1-6 (1: 'high confidence new', 6: 'high confidence old'). For each image judged as 'old', a frequency test followed in which participants were asked how many times they had seen the image before (1, 2, 3, or 4 or more). Following that, participants were asked to indicate on a continuous timeline when the image in question was first encountered (temporal memory test; see Methods for more information). (d) Timeline of an example image. Each old image used in the final memory test was presented three times during the continuous recognition phase and associated with four temporal lags. The first fMRI scan session of the continuous recognition phase for each participant corresponds to Day 0. All temporal lags were quantified in seconds and transformed with the natural logarithm for further analyses. (e) Behavioral measure of temporal memory. Item-wise temporal memory error was quantified as the difference between the ranked actual and ranked estimated temporal positions.

All participants performed above chance on the recognition memory test (Fig. 2a; hit rate greater than false alarm rate: t(7) = 8.24, p < 0.001, two-tailed paired-sample t-test). Separating the data across three confidence levels (low, medium, and high) revealed that recognition memory accuracy (d') increased with levels of subjective confidence (Fig. 2b; F(2,14) = 16.66, p < 0.001, one-way repeated-measures ANOVA). Results for the frequency test are reported in Supplementary Fig. 1.

Of critical interest was the accuracy of temporal memory judgments, which required participants to recall the first time each scene was encountered over the course of the up to 10-month experiment. To reduce the effects of non-linearity in temporal memory judgments (e.g., response bias towards the center of the timeline, see Methods and Supplementary Fig. 2), we converted both the actual (objective) and the estimated (subjective) temporal positions to ranked positions for further analyses. Based on the ranks, we quantified item-wise temporal memory error as the distance between the actual and estimated ranked positions (Fig. 1e). To determine temporal accuracy across participants, we ran a mixed-effects linear regression model for estimated against actual temporal position with participants as a random effect. Results from this analysis indicated that participants were able to place images in their correct temporal contexts with above-chance accuracy (Fig. 2c; group-level β = 0.302, p < 0.001). We further evaluated temporal memory accuracy for each participant using a permutation test (see Methods). This analysis revealed that temporal memory performance was above chance for seven out of the eight participants (Fig. 2d; ps < 0.01; one participant: p = 0.083).
The relatively high accuracy of the temporal memory judgments is notable when considering that participants were not informed that they would be tested on temporal memory until after all of the continuous recognition sessions.
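The rank-based error measure and the per-participant permutation test described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the error is taken as the absolute rank difference, and the null distribution is built by shuffling the estimated positions across images (the exact procedure is given in the Methods, which may differ).

```python
import random

def ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def rank_errors(actual, estimated):
    """Item-wise error: absolute difference between ranked positions (assumed form)."""
    r_act, r_est = ranks(actual), ranks(estimated)
    return [abs(a - e) for a, e in zip(r_act, r_est)]

def permutation_p(actual, estimated, n_perm=2000, seed=0):
    """One-sided p-value: is the observed mean rank error smaller than chance?"""
    rng = random.Random(seed)
    observed = sum(rank_errors(actual, estimated)) / len(actual)
    shuffled = list(estimated)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the image-to-estimate mapping
        null = sum(rank_errors(actual, shuffled)) / len(actual)
        if null <= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Hypothetical data: actual and estimated first-exposure positions (in days).
errors = rank_errors([10, 50, 30, 90], [12, 40, 45, 80])
```

Working on ranks rather than raw timeline positions means that a participant who compresses all responses toward the center of the timeline is not penalized as long as the ordering of images is preserved.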

CA1 and entorhinal representational similarity across exposures predicts temporal memory precision
The primary goal of the present study was to investigate whether the similarity (or dissimilarity) of MTL representations across repeated stimulus encounters predicts the accuracy of temporal memory judgments across long timescales. Accordingly, we examined the representational similarity between exposures of each of the images that were subsequently probed in the temporal memory test. Given our a priori interest in MTL structures, we focused on two manually segmented subfields of the hippocampus (CA1 and CA2/3/dentate gyrus, hereafter CA2/3/DG), along with ERC, perirhinal cortex (PRC), and parahippocampal cortex (PHC) (Fig. 3a). For each region of interest (ROI), we correlated the activity patterns between each pair of exposures of the same image (i.e., r(E1, E2), r(E2, E3), and r(E1, E3)). As a first step, we averaged across these pairwise correlations to generate a single similarity metric (across exposures) for each image (Fig. 3b). We then compared these similarity metrics for images associated with high versus low temporal memory precision (based on a participant-specific median split). Statistical significance of the difference between high and low temporal memory precision was evaluated using a permutation test that shuffled the images' temporal memory precision labels within each participant. Among the set of MTL ROIs, CA1 and ERC exhibited significantly greater pattern similarity across repeated exposures for high-precision images relative to low-precision images (Fig. 3c; CA1: p = 0.004; ERC: p = 0.004; permutation tests). The fact that temporal memory precision was associated with greater pattern similarity across exposures in CA1 and ERC is consistent with a context reinstatement account, wherein the original temporal context is reinstated (and strengthened) during subsequent exposures.

We next performed several control analyses.
First, because temporal memory precision increased as a function of the session position in which the first exposure occurred (recency effect, see Supplementary Fig. 3), we repeated the analyses for CA1 and ERC while explicitly accounting for temporal lag information (Fig. 1d). Specifically, we ran a mixed-effects logistic regression model that predicted temporal memory precision from pattern similarity across exposures with temporal lags (lag 0-3) included as fixed effects and participant included as a random effect. This analysis confirmed that the relationship between pattern similarity in CA1/ERC and temporal memory precision remained significant when accounting for temporal lag information (…).

Second, we repeated the foregoing analyses for an early visual cortex ROI (V1) that would be sensitive to low-level visual information but would not be expected to code for temporal context. As expected, V1 pattern similarity across exposures did not differ for high- versus low-precision images (Fig. 3c; p = 0.25; permutation test) and was not a predictor of temporal memory precision (Fig. 3d; p = 0.376; logistic mixed-effects regression). Likewise, an additional, exploratory whole-brain analysis did not identify any cortical areas outside the MTL for which the relationship between pattern similarity and temporal memory was significant after correction for multiple comparisons (Supplementary Table 1).

Third, and critically, we tested whether the effects observed in CA1 and ERC were specific to temporal memory. To this end, we repeated the same mixed-effects regression model but now used recognition confidence as the dependent variable instead of temporal precision. Neither CA1 nor ERC exhibited significant relationships between pattern similarity and recognition confidence (ps > 0.10). In contrast, pattern similarity was a significant predictor of recognition confidence in PHC (Fig. 3e; β = 0.799, p < 0.001). A follow-up control analysis which included recognition confidence together with pattern similarity as fixed effects in a mixed-effects regression model confirmed that pattern similarity in CA1 and ERC predicted temporal memory precision when accounting for recognition confidence (ps < 0.001). These results provide important evidence that the relationships between CA1/ERC pattern similarity and temporal memory precision were not a secondary consequence of stronger overall memory for the images; rather, pattern similarity across exposures in CA1 and ERC specifically predicted better memory for when (in time) images were first encountered.
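The core pattern-similarity analysis of this section (pairwise correlations across exposures, a median split on temporal memory error, and a label-shuffling permutation test) can be sketched for a single participant and ROI as follows. This is an illustrative sketch in plain Python rather than the authors' pipeline: all function names are hypothetical, activity patterns are assumed to be given as equal-length lists of voxel values, ties at the median are assigned to the high-precision group, and the test is taken to be one-sided.

```python
import math
import random
from statistics import mean, median

def pearson(x, y):
    """Pearson correlation between two equal-length activity patterns."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def exposure_similarity(e1, e2, e3):
    """Mean of r(E1, E2), r(E2, E3), and r(E1, E3) for one image in one ROI."""
    return mean([pearson(e1, e2), pearson(e2, e3), pearson(e1, e3)])

def median_split(errors):
    """True marks high precision (error at or below the participant's median)."""
    cut = median(errors)
    return [e <= cut for e in errors]

def label_permutation_p(similarity, high, n_perm=5000, seed=0):
    """One-sided p for mean(similarity | high) - mean(similarity | low)."""
    def diff(labels):
        hi = [s for s, h in zip(similarity, labels) if h]
        lo = [s for s, h in zip(similarity, labels) if not h]
        return mean(hi) - mean(lo)

    rng = random.Random(seed)
    observed = diff(high)
    labels = list(high)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # shuffle precision labels across images
        if diff(labels) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

Shuffling labels within a participant keeps the number of high- and low-precision images fixed, so the null distribution reflects only the assignment of similarity values to precision groups.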

Similarity between first and second exposures uniquely predicts temporal memory
Having demonstrated that CA1 and ERC pattern similarity across repeated exposures predicts temporal memory for an image's first exposure, we next sought to determine which pair of image exposures was most predictive of temporal memory. From a context reinstatement perspective, similarity between the first exposure (E1) and the second exposure (E2) should be uniquely important because E2 provides the first opportunity to reinstate the temporal context from E1. To test this, we first compared pattern similarity for high- and low-precision images for each pair of image exposures (E1-E2, E2-E3, and E1-E3). Statistical significance of the difference between high- and low-precision images for each exposure pair was computed using a permutation analysis in which, for each participant and exposure pair, we randomly shuffled the images' temporal memory precision labels. For both CA1 and ERC, E1-E2 similarity was significantly greater for high- than low-precision images (Fig. 4a; CA1: p = 0.015; ERC: p = 0.007; permutation tests). However, both regions also exhibited similar effects for E2-E3 similarity (Fig. 4a; CA1: p = 0.025; ERC: p = 0.036, permutation tests). Neither region exhibited a significant effect for E1-E3 similarity (Fig. 4a; ps > 0.28).

To further explore this pattern of results, we performed three follow-up sets of analyses. First, in order to control for potential temporal lag effects (Supplementary Fig. 3) … Effects were marginally significant for E2-E3 similarity (ps < 0.10), and not significant for E1-E3 similarity (ps > 0.68).

Second, in order to more directly assess whether E1-E2 similarity contained predictive power above and beyond that of other exposure pairs, we compared the performance of several models that did or did not include various exposure pairs.
That is, we tested whether model performance was significantly improved when E1-E2 similarity was added to models that only included E2-E3 and E1-E3 similarity. For both CA1 and ERC, adding E1-E2 as a predictor significantly improved the model's performance (CA1: χ² = 6.147, p = 0.013; ERC: χ² = 5.315, p = 0.021). In contrast, adding E2-E3 and E1-E3 similarity as predictors to models with just E1-E2 similarity did not improve the model's performance (ps > 0.15). These results established that E1-E2 similarity was uniquely important for subsequent temporal memory judgments, as would be predicted by a context reinstatement account.

Third, it is possible that these patterns of results reflect the contribution of some overall facilitation to memory provided by E1-E2 similarity. However, although PHC pattern similarity across exposures was highly predictive of subsequent recognition memory confidence (Fig. 3e) …

Dots denote individual participants; ~p < 0.10; *p < 0.05; **p < 0.01.
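As a worked check on the nested-model comparison reported in this section, the statistics 6.147 and 5.315 can be related to their p-values under one assumption that the text does not state explicitly: that each comparison is a likelihood-ratio test with one degree of freedom (one added predictor). The helper names below are hypothetical.

```python
import math

def likelihood_ratio_stat(loglik_reduced, loglik_full):
    """Likelihood-ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)

def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X > s) = erfc(sqrt(s / 2)), so no external library is needed.
    """
    return math.erfc(math.sqrt(stat / 2.0))

p_ca1 = chi2_sf_df1(6.147)  # approximately 0.013
p_erc = chi2_sf_df1(5.315)  # approximately 0.021
```

Under that assumption the computed tail probabilities agree with the reported p-values for CA1 and ERC, which is consistent with a single added predictor (E1-E2 similarity) in each comparison.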

CA1 and ERC predict temporal memory via image-specific representations
While all of the preceding representational similarity analyses were performed by correlating activity patterns across repeated exposures of the same stimulus (i.e., image-specific correlations), these analyses do not guarantee that the information that predicted temporal memory precision was specific to individual images. Namely, it is possible that temporal memory precision benefited from generic memory processes or attentional states that generalized across images (e.g., states optimized for memory encoding [28]). While this possibility would still support a role for CA1 and ERC in encoding temporal information, a temporal context reinstatement account fundamentally predicts reinstatement of the specific temporal context in which an image was encoded.

To assess whether temporal memory was predicted by image-specific pattern similarity, we conducted two additional analyses (restricted to E1-E2 similarity). First, for all of the images tested in the temporal memory test, we permuted the E1-E2 mappings by shuffling the images' E2 patterns within each participant. We then calculated the resulting E1-E2 pattern similarity scores and a corresponding distribution of beta values reflecting the relationships with temporal memory (see Methods for details). Critically, for both CA1 and ERC, the relationship between 'intact' E1-E2 similarity and temporal memory was significantly stronger (higher beta values) than the permuted values (Fig. 5a; CA1: p = 0.019; ERC: p = 0.025). These data provide important evidence that temporal memory precision was predicted by image-specific pattern similarity in CA1 and ERC.

As a follow-up to the preceding analysis, we ran a final analysis to address whether apparent image-specific effects might be due to general memory states and/or differences in coarse temporal context information (i.e., session effects).
Thus, for each image included in the temporal memory test (a 'target'), we identified control images ('foils') such that the targets and foils shared the same E1 session number, but not scanning run (to avoid potential contamination from autocorrelation in the fMRI data), and the same E2 session number (but not run; Fig. 5b). To match recognition memory with targets, foils were only included in this analysis if they were correctly rejected at E1 and successfully recognized at E2 and E3 (see Methods for details). This allowed us to compute similarity between target E1 and target E2 (target similarity) and between target E1 and foil E2 (foil similarity). The difference between these measures (target similarity − foil similarity) was then used as a predictor of temporal memory precision. Indeed, this similarity difference score significantly predicted temporal memory precision for CA1 (Fig. 5c; β = 0.893, p = 0.028), with a similar but marginal effect for ERC (Fig. 5c; β = 1.240, p = 0.058). These findings lend further support to the idea that temporal memory precision was related to image-specific pattern similarity measures and specifically argue against potential confounds due to generic memory-related processes or session effects. The fact that these effects held when carefully controlling for session effects (albeit marginally in ERC) is notable because it provides evidence against the possibility that pattern similarity only captured coarse-level temporal context (session information). Rather, to the extent that the pattern similarity measure captured temporal context information, these findings suggest a relatively 'local' temporal context representation that differentiated between images within the same session (day).

DISCUSSION
The ability to remember when events occurred in time is fundamental to human experience.
However, retaining precise temporal memories is complicated by the fact that real-world episodic memories span long timescales (days, weeks, months and beyond) and by the fact that events may recur in multiple contexts over those long timescales (e.g., a movie you have viewed several times over the past year). To date, there is remarkably little evidence characterizing how the human brain preserves temporal memories in the face of these challenges. Here, we show that when events recur over long timescales (at lags up to several months), the re-expression of distributed, event-specific activity patterns in CA1 and ERC preserves memory for the original temporal context of an event (i.e., memory for when an event first occurred). These findings are consistent with and bridge between prior human and rodent studies implicating CA1 and ERC in temporal processing and temporal memory. However, our findings also go beyond existing evidence by providing a mechanistic account of how CA1 and ERC preserve temporal memories and demonstrating these relationships at uniquely long timescales.

While there is a rich history characterizing temporal memory in human behavioral and neuroimaging studies [29,30], it is striking how few of these studies have considered temporal memory across timescales that exceed a single experimental session. Indeed, our approach of testing temporal memory for images that were distributed across dozens of experimental sessions/scans spanning 8-10 months is unprecedented. Considering that the overwhelming majority of real-world episodic memories span days, weeks, months and years, it is imperative to understand the neural mechanisms that support temporal memory at these timescales.
Although it is intuitively obvious that humans can and do retain temporal memories over long timescales, it is nonetheless remarkable that participants in the current study were generally successful at recalling the initial temporal context for images presented at the final memory test given that (a) these images were drawn from a pool of tens of thousands of images, (b) the delay between the initial exposure and the final memory test ranged from days to almost a year, and (c) each image was presented in multiple temporal contexts, creating potential interference. Thus, by simulating the challenges that are inherent to real-world temporal memory, our experimental paradigm provides a unique opportunity to characterize the underlying neural mechanisms.

By leveraging representation-based analyses to track patterns of activity across repeated stimulus exposures and distinct temporal contexts, we were able to gain critical insight into the mechanisms through which CA1 and ERC contribute to temporal memory. In particular, our findings strongly align with a context reinstatement account. According to temporal context models [25,26], context representations, reflected in distributed patterns of neural activity, gradually change over time and are reinstated when an item is subsequently remembered [31-37]. From this perspective, our finding that greater pattern similarity across exposures preserved memory for an event's original temporal context can be explained in terms of the original context representation (elicited during E1) being reinstated during subsequent exposures (E2, E3). In fact, this account also readily explains our finding that similarity between the first and second exposure (E1, E2) was uniquely important for temporal memory. Namely, E2 represented the first potential 'reminder' of E1's temporal context.
Interestingly, although we tested for re-expression of E1's activity patterns by explicitly re-exposing participants to the same stimulus multiple times (E2, E3), our findings likely generalize to situations where stimuli are not explicitly re-exposed (or re-encountered …

An additional essential consideration in understanding neural mechanisms that specifically relate to temporal memory is to establish that any apparent effects related to temporal memory were not derived from more general effects of memory strength. Specifically, as memories decay over time, temporal judgments could potentially be inferred from the strength of memories themselves [50-52]. This is of particular concern given the very long timescales involved in the current study. However, several theoretical perspectives propose that memory for time is dissociable from memory strength [30,53,54]. Here, our final memory test separately measured recognition confidence (a proxy for overall memory strength) and temporal memory, allowing us to conduct several targeted analyses aimed at teasing apart these two expressions of memory. First, we found that the relationships between CA1/ERC and temporal memory precision remained significant in a regression model that included recognition confidence as a covariate. Second, consistent with prior arguments that distinct MTL subregions are involved in 'item-based' versus 'context-based' memory [55], we found that pattern similarity measures in PHC predicted recognition confidence but not temporal memory, whereas pattern similarity measures in CA1 and ERC predicted temporal memory but not recognition confidence.
Finally, when considering pattern similarity across specific pairs of image exposures, temporal memory (defined here as memory for when the first exposure occurred) was best predicted by pattern similarity between the first and second exposures, consistent with a context reinstatement account. In contrast, recognition confidence was best predicted by pattern similarity between the first and third exposures, potentially indicating that the last (third) exposure was relatively more influential to memory strength (also see Supplementary Fig. 3). Together, these data points provide important, converging evidence that temporal memory judgments in the current study were not derived from overall memory strength. More generally, our findings reinforce theoretical accounts that emphasize the distinction between memory for 'when' an event occurred versus 'whether' an event occurred [5,6,56,57].

In conclusion, here we show that memory for the temporal context in which an event initially occurred is preserved via the re-expression of activity patterns in human CA1 and ERC. Critically, we show that these dynamics operate across, and support memory at, long timescales (from days to months). These findings complement yet significantly advance existing evidence from rodents and humans implicating the hippocampal-entorhinal system in representing and remembering time. In particular, our findings suggest that distributed patterns of activity in CA1 and ERC encode and reinstate temporal context information, thereby preserving memory for when events occurred.

Information on the NSD dataset is available at http://naturalscenesdataset.org. The final memory data will be made publicly available upon manuscript publication. Custom analysis scripts for the current manuscript are available upon request to the corresponding author (F.Z.).

METHODS
Participants
Eight participants took part in the study (two males, six females; age range: 19-32). All participants were right-handed with no known cognitive deficits or color blindness and with normal or corrected-to-normal vision. Participants were naïve to the experimental manipulation and were not involved in the design or planning of the study. Informed written consent was obtained from all participants before the start of the study, and the experimental protocol was approved by the University of Minnesota Institutional Review Board.

Design and procedure
Data used in this study were collected as part of the Natural Scenes Dataset (NSD; http://naturalscenesdataset.org), and included two parts: a continuous recognition phase conducted in the fMRI scanner and a behavioral final memory phase (Fig. 1a).

Continuous recognition phase. A detailed description of the continuous recognition phase has been reported in a previous publication [27]. Briefly, for each participant, the continuous recognition phase was split across 40 scan sessions in which 10,000 distinct color natural scenes would be presented three times spaced pseudo-randomly over the course of all scan sessions. Each scan session consisted of 12 runs (750 trials). Distributions of image presentations were controlled such that both short-term and long-term re-exposures were probed (see Stimuli section below). Four of the participants completed the full set of 40 NSD scan sessions. Due to constraints on participant and scanner availability, each of the other four participants completed 30-32 scan sessions. In these collected data, each participant viewed 9,209-10,000 distinct images and participated in 22,500-30,000 trials. Each trial lasted 4 s, consisting of the presentation of an image for 3 s and a following 1-s gap. Participants were instructed to perform a continuous recognition task in which they reported whether the current image had been seen at any previous point in the experiment.

Final memory phase. At least two days (range: 2-7 days) after completion of the continuous recognition phase, a final memory test was administered outside of the scanner. Participants were not informed about the final memory test in advance. During the final memory phase, participants viewed a subset of old images (220 per participant) from the continuous recognition phase randomly intermixed with novel images (100 per participant) and completed different types of memory probes. The final memory phase consisted of 320 trials, with up to three judgements per trial.
Each trial began with a recognition test in which participants made an old or new judgment with a confidence rating on a scale of 1 to 6 (1: 'high confidence new', 2: 'medium confidence new', 3: 'low confidence new', 4: 'low confidence old', 5: 'medium confidence old', 6: 'high confidence old'). For images judged as "old", a frequency test followed in which participants were asked to indicate how many times they had seen each image (1, 2, 3, or 4 or more times). Following the frequency test, participants performed a temporal memory test using a timeline. In this test, participants were asked to indicate, on a continuous timeline with tick marks representing each session, when in the experiment they thought each image was first encountered (Fig. 1c, right). The length and labels of the timeline varied across participants, depending on how many sessions they completed in the continuous recognition phase. Participants were encouraged to use the full length of the scale, with the left endpoint representing the beginning of the continuous recognition phase and the right endpoint representing the end. Participants used a cone to mark the temporal location on the line and were instructed to indicate their confidence in the response by adjusting the size of the cone, with smaller cones representing higher confidence and bigger cones representing lower confidence (see Supplementary Video 1 for depiction of example trials). Given that the primary focus of the present study concerns temporal memory precision, we only analyzed the estimates of temporal location as illustrated in Fig. 1c. All tests in the final memory phase were self-paced with a timeout of 30 s.

Stimuli
All images used in this study were taken from the Microsoft Common Objects in Context (COCO) database 58.

Continuous recognition phase.
For the continuous recognition phase, a total of 73,000 images were prepared with the intention that each participant would view 10,000 distinct images (9,000 unique images and 1,000 images shared across participants) three times each over the course of 40 scan sessions. To prevent the recognition task from becoming too difficult (and risking loss of morale), each image was randomly placed three times on a circle according to a probability distribution created by mixing a relatively narrow von Mises distribution and a uniform distribution. Across all scan sessions, the mean number of distinct images shown once, twice, and all three times within a typical session was 437, 106, and 34, respectively.

Final memory phase. For the final memory phase, a total of 320 images were used for each participant, including 220 old images viewed in the continuous recognition phase and an additional 100 novel images from the COCO dataset. All old images used in the final memory phase were selected from the set of images that a given participant saw three times during fMRI scanning. There were two additional sets of criteria to select the old images. First, 120 out of the 220 old images were selected based on three main criteria: (1) Each image exposure received a correct response in the continuous recognition phase, that is, correct rejection, hit, and hit for the first, second, and third exposure, respectively.
(2) To promote overall temporal memory performance, images were first selected based on the session location of their first exposure, with approximately half of these images selected from the last eight scan sessions that each participant completed (this was adjusted to the last ten scan sessions for one participant to have enough trials given their performance in the continuous recognition phase), and the other half selected from the rest of the scan sessions.
(3) For each half, images were then selected based on the spacing between exposures, with one-third having all three exposures within one scan session, one-third having the last two exposures in the same session, and the rest having either the first two exposures in the same session or all three exposures in different sessions. Second, the remaining 100 old images were selected to maximally span semantic space (see the NSD data paper 27 for details). Briefly, this was done by computing shifted inverse frequency sentence embeddings for the sentence captions, and using a greedy approach to determine the subset of 100 images that maximizes the average distance between each image's embedding and its closest neighbor.

In order to equate prior memory outcome with the other images, only old images that received correct responses all three times in the continuous recognition phase were included in further analyses (143-170 images for each participant).
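
To illustrate the greedy selection of semantically diverse images described above, the following is a minimal sketch only. It uses farthest-point selection as a stand-in, since the exact greedy procedure is given in the NSD data paper and not here. The function names, the Euclidean distance, and the choice of starting image are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_maxmin_subset(embeddings, k, start=0):
    """Greedily pick k items so that each new item is far from those already chosen.

    Hypothetical helper: farthest-point selection, assuming distinct embeddings
    and k <= len(embeddings). Returns the selected indices in selection order.
    """
    selected = [start]
    # Distance from every item to its nearest already-selected item.
    nearest = [euclidean(e, embeddings[start]) for e in embeddings]
    while len(selected) < k:
        # Choose the item whose nearest selected neighbor is farthest away.
        best = max(range(len(embeddings)), key=lambda i: nearest[i])
        selected.append(best)
        for i, e in enumerate(embeddings):
            nearest[i] = min(nearest[i], euclidean(e, embeddings[best]))
    return selected
```

In practice, the embeddings would be the caption-based sentence embeddings and k would be 100; both the starting image and the distance metric could plausibly change which subset is returned.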

MRI data acquisition and preprocessing
The imaging data were collected as part of the NSD at the Center for Magnetic Resonance Research at the University of Minnesota. In brief, functional data and a few additional anatomical measures were collected using a 7T Siemens Magnetom passively-shielded scanner with a single-channel-transmit, 32-channel-receive RF head coil (Nova Medical, Wilmington, MA). Functional data were acquired using whole-brain gradient-echo echo-planar imaging (EPI) at 1.8-mm resolution and 1.6-s repetition time. In addition to the EPI scans, for the purposes of hippocampal segmentation, a high-resolution T2-weighted scan was acquired during one of the 7T scan sessions. T1- and T2-weighted structural scans were collected using a combination of a 3T Siemens Prisma scanner and a standard Siemens 32-channel RF head coil.

Functional data were pre-processed by performing one temporal resampling to correct for slice time differences and one spatial resampling to correct for head motion within and across scan sessions, EPI distortion, and gradient non-linearities. Two versions of the functional data were prepared: a 1.8-mm standard-resolution preparation (temporal resolution, 1.333 s) and an upsampled 1.0-mm high-resolution preparation (temporal resolution, 1.000 s). The latter preparation exploits the benefits of small head displacements and preserves as much spatial detail as possible 59. Analyses in the current paper used the 1.0-mm high-resolution preparation of the NSD data.

Parameter estimates (beta weights) reflecting fMRI response amplitudes evoked by each trial were estimated using a general linear model (GLM) approach as described in the NSD data paper. We used beta version 2 from NSD for all the analyses in the current paper.
Briefly, the pre-processed time-series data were fitted multiple times with a single-trial GLM, each time using a different hemodynamic response function (HRF) from a library of HRFs. For each voxel, we identified which HRF provided the best fit to the data and used for that voxel the single-trial betas associated with that HRF. Betas were then converted to units of percent BOLD signal change by dividing amplitudes by the mean signal intensity observed at each voxel and multiplying by 100.
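
The per-voxel HRF selection and the conversion to percent signal change can be sketched as follows. This is a schematic illustration, not the NSD pipeline itself: the function names are hypothetical, and the use of a generic goodness-of-fit score (for example, variance explained) to rank HRFs is an assumption.

```python
def best_hrf_index(fit_scores):
    """Return the index of the best-fitting HRF for one voxel.

    fit_scores: one goodness-of-fit value per HRF in the library
    (assumed here to be 'higher is better', e.g. variance explained).
    """
    return max(range(len(fit_scores)), key=lambda i: fit_scores[i])


def to_percent_signal_change(betas, mean_signal):
    """Convert raw single-trial betas for one voxel to percent BOLD signal change.

    Each amplitude is divided by the voxel's mean signal intensity
    and multiplied by 100, as described in the text.
    """
    return [100.0 * b / mean_signal for b in betas]


def voxel_betas_percent(betas_per_hrf, fit_scores, mean_signal):
    """Pick the betas from the best-fitting HRF, then rescale them."""
    chosen = betas_per_hrf[best_hrf_index(fit_scores)]
    return to_percent_signal_change(chosen, mean_signal)
```

For example, a raw amplitude of 5 at a voxel whose mean signal intensity is 500 corresponds to a 1% signal change.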

Regions of interest (ROIs)
The medial temporal lobe (MTL) ROIs were manually drawn on the high-resolution T2 images obtained for each participant, following a 7T protocol for segmentation of MTL subregions 60. Labels were defined on the raw high-resolution T2 volume, and were mapped via an affine transformation to subject-native anatomical space. The MTL ROIs included bilateral CA1, CA2/3/dentate gyrus, entorhinal cortex (ERC), perirhinal cortex (PRC), and parahippocampal cortex (PHC). Example MTL ROIs from one participant are depicted in Fig. 3a. We also included the primary visual cortex (V1) as a control region. The bilateral V1 ROI was manually drawn on cortical surfaces based on results of a population receptive field experiment from the NSD, and was then mapped to volumetric format. Cortical ROIs for the whole-brain parcel-level analysis were defined by a multi-modal cortical parcellation from the Human Connectome Project 61.

Behavioral data analyses
Overall performance for the temporal memory test was quantified by regressing each participant's subjective estimate of when an image was first encountered against the actual (objective) time (Fig. 2c). Note that there is a general response bias among participants toward the center of the timeline ("raw estimated position", see Supplementary Fig. 2). To account for this response bias and potential non-linearity, the estimated and actual temporal positions used in all analyses in the current paper were converted to ranks according to each individual's marked positions on the timeline and the actual temporal positions in the continuous recognition phase, respectively. To quantify item-wise temporal memory error, we calculated the absolute difference between the ranked estimated temporal position and the ranked actual position (Fig. 1e). To test whether each participant had above-chance temporal memory performance, we compared the observed temporal memory error against a permutation-based null distribution (1,000 iterations), in which the subjective estimates were randomly shuffled across trials for each participant and the temporal memory error was recomputed for each iteration. To facilitate subsequent analyses, for each participant we divided temporal memory trials into 'high-precision' and 'low-precision' based on the absolute temporal memory error (median split).

To control for temporal lag information and test for relationships between lag and subsequent memory performance (Supplementary Fig. 3), as illustrated in Fig. 1d, four temporal lags were calculated for each image: the lag between the beginning of the continuous recognition phase and the first exposure (lag 0), the lag between the first and second exposure (lag 1), the lag between the second and third exposure (lag 2), and the lag between the third exposure and the final memory phase (lag 3).
The first scan session of the continuous recognition phase for each participant corresponds to Day 0. Because memory is observed to abide by an exponential rule rather than linear time 62, all temporal lags were quantified by expressing time intervals in seconds and transforming these intervals with the natural logarithm. Lag effects were then tested using mixed-effects regression models with either recognition confidence or temporal memory precision as a dependent variable and with each temporal lag as a separate predictor.
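
The rank-based error measure, permutation test, median split, and log-transformed lags described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the analysis code used in the study: all function names are hypothetical, ties are given average ranks, trials at the median are counted as high-precision, and the p-value uses a (count + 1) / (iterations + 1) correction, none of which is specified in the text.

```python
import math
import random
import statistics


def average_ranks(values):
    """Rank values from 1..n; tied values share their average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def temporal_errors(estimated, actual):
    """Absolute difference between ranked estimated and ranked actual positions."""
    est_rank, act_rank = average_ranks(estimated), average_ranks(actual)
    return [abs(e - a) for e, a in zip(est_rank, act_rank)]


def permutation_p_value(estimated, actual, n_iter=1000, seed=0):
    """Proportion of shuffles whose mean error is at most the observed mean error."""
    rng = random.Random(seed)
    observed = statistics.mean(temporal_errors(estimated, actual))
    shuffled = list(estimated)
    as_small = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # break the estimate-to-image mapping
        if statistics.mean(temporal_errors(shuffled, actual)) <= observed:
            as_small += 1
    return (as_small + 1) / (n_iter + 1)


def median_split(errors):
    """Label each trial 'high' (error at or below the median) or 'low' precision."""
    cutoff = statistics.median(errors)
    return ["high" if e <= cutoff else "low" for e in errors]


def log_lag(seconds):
    """Natural-log transform of a lag expressed in seconds."""
    return math.log(seconds)
```

A small permutation p-value indicates that the observed temporal memory error is lower than would be expected if estimates were unrelated to the actual temporal positions.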

Representational similarity analyses
Representational similarity analyses were conducted on functional data (single-trial betas) from the continuous recognition phase, and were performed by assessing patterns of neural activity across voxels within each ROI evoked during single trials. Pattern similarity of all possible exposure pairings (Fig. 3b; r(E1, E2), r(E2, E3), and r(E1, E3)) for each image was computed using Pearson correlation. The resulting correlation coefficients were then Fisher-transformed for further analyses. To avoid potential contamination of similarity from scanner-induced autocorrelation of signals, only correlations between image exposures that occurred across runs were considered (range of the trials excluded for each participant: 12-35).
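
As a sketch of the pattern similarity computation described above, the following computes Fisher-transformed Pearson correlations for each exposure pairing of one image within one ROI, skipping pairings that fall within the same run. The data layout (dictionaries keyed by exposure label, with runs identified by a session and run number) and the small clipping constant that keeps the Fisher transform finite are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length voxel patterns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def fisher_z(r, eps=1e-12):
    """Fisher transform; r is clipped just inside (-1, 1) (an assumption)."""
    r = max(-1.0 + eps, min(1.0 - eps, r))
    return math.atanh(r)


def exposure_similarity(patterns, runs):
    """Fisher-z similarity for each across-run exposure pairing of one image.

    patterns: {'E1': [...], 'E2': [...], 'E3': [...]} single-trial betas per voxel.
    runs:     {'E1': (session, run), ...} identifying where each exposure occurred.
    """
    similarity = {}
    for a, b in [("E1", "E2"), ("E2", "E3"), ("E1", "E3")]:
        if runs[a] == runs[b]:
            continue  # same run: excluded to avoid autocorrelation effects
        similarity[(a, b)] = fisher_z(pearson(patterns[a], patterns[b]))
    return similarity
```

Pairings that are skipped are simply absent from the returned dictionary, so downstream models would need to treat them as missing rather than as zero similarity.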

Image-specificity analyses
We used two approaches to assess image-specificity in CA1 and entorhinal representations that predicted temporal memory.

Intact versus shuffled pattern similarity analysis. Our first analysis tested whether temporal memory precision was predicted by image-specific pattern similarity (restricted to E1-E2 similarity) in CA1 and ERC using images tested in the temporal memory test (which were a subset of the full image set). Specifically, we randomly shuffled the E1-E2 mappings within each participant, such that each image's E1 was paired with a different image's E2. We then computed the pattern similarity of these shuffled exposure pairs and the new corresponding temporal lags. The shuffled E1-E2 pattern similarity scores and temporal lag information were then submitted to a mixed-effects logistic regression model predicting temporal memory