## Main

We trained five rats on a 2-m-long linear maze with ten water delivery sites or wells (Fig. 1a, Extended Data Fig. 1). The rats were required to visit and lick two given wells alternately to obtain water rewards. The licking of the animal was detected by infrared sensors on individual wells, and water was delivered after the correct well was licked for a fixed amount of time (1 s, 1.5 s or 2 s, consistent across trials in a session). After at least six consecutive correct choices, a new pair of wells started to deliver water, enforcing the updating of goal locations. The rats learned this task over 2 weeks. We implanted a tetrode drive into the ventral and lateral parts of the OFC (Extended Data Fig. 2) and collected data from four rats across 18 sessions, each of which comprised 68–328 simultaneously recorded OFC neurons.

We found that most OFC neurons increased their spiking as the animal approached the goal well, discriminating its location by changing firing rates (Fig. 1b–d). These neurons, however, showed less position-specific firing to the wells that the animal ran over during navigation. These observations were confirmed by plotting firing rates along maze position conjunctively with navigation phase (defined as positional fraction of journey; Fig. 1e–g). As a population, 80.8% of OFC neurons (2,366 of 2,927) exhibited some degree of spatial tuning on the maze (z > 2.57 in spatial correlations compared to shuffled activity), but, in most them (86.9%, 2,056 of 2,366), the spatial tuning was also dependent on navigation phase (P < 0.05 in spatial correlations compared to activity shuffled across different navigation phases; Extended Data Fig. 3). We further found that, during a random foraging task in an open-field arena, OFC neurons conveyed significantly lower spatial information than neurons in area CA1 of the hippocampus (Extended Data Fig. 4). These results together suggest that most OFC neurons exhibit location-selective firing in conjunction with the demand and phase of goal-directed journey.

Next, we asked how accurately OFC neurons represent the well position. We trained a decoder based on linear discriminant analysis (LDA) using the population activity of OFC neurons in the time range from 0.5 s before to 3 s after lick onset. The trained decoder was then applied to predict the well at which the animal was present (‘current’ well). The decoder predicted the current well above the chance from 1.7 s before to 6 s after lick onset (Fig. 1h, Extended Data Fig. 5). When the same decoder was applied to the wells that were passed through by the animal, it showed significantly lower performance (Fig. 1h, Extended Data Fig. 6). This result was also supported by poor performance of a decoder trained on the instantaneous position of the animal during running (Extended Data Fig. 6). We confirmed that the decoding of the current well was possible irrespective of the approach direction of the animal or the starting well (Fig. 1d, i), suggesting that OFC neurons form a largely viewpoint-invariant coding of spatial positions that are approached as navigational goals.

## Persistent goal representation in the OFC

We then asked when the goal well representation develops in the OFC during navigation. We examined the firing rates of individual OFC neurons backwards in time towards the beginning of navigation and discovered that the rates at this time already differentiated the identity of the goal well (Fig. 2a). To confirm this observation at a neural population level, we projected the ensemble activity of OFC neurons on the dimension with maximal goal well separability calculated by LDA and found that the projected activity kept differentiating the next destination throughout navigation (Fig. 2b).

Our findings of the OFC’s goal representation during navigation, as well as its coding of the position of the animal during reward consumption (Fig. 1h), imply a transition of spatial representation in the OFC before navigation onset. To identify such a transition, we took a decoding approach by training a decoder for goal well identity based on neural activity concatenated between the time segments at the beginning and the end of navigation (Methods). In individual trials, the decoder that initially indicated the start well of the trial exhibited an abrupt change in representation to the next goal well around the time of motion onset, which was then largely maintained during the entire journey (Fig. 2c). On the trial average, the activity of OFC neurons represented the well at which the animal was present (current well) until 0.7 s before motion onset (Fig. 2d, left). However, in contrast to the decay of the current well representation, the activity representing the goal well became significant from 1.1 s before motion onset (Fig. 2d, e, Extended Data Fig. 7), reaching a steady peak 2.5 s after the beginning of navigation (Fig. 2d, left, Extended Data Fig. 5). The decoding probability plotted along the positional fraction of navigation confirmed that the goal well was persistently represented throughout navigation (Fig. 2d, right).

To confirm the decoder’s selective representation of goal well over others, we assessed the representations of wells that were passed through by the animal during navigation, particularly the wells that immediately followed the start or preceded the goal along the journey. We found that the decoding probabilities of these wells with the goal well decoder were significantly lower than that of the goal well throughout the course of navigation (Fig. 2d). Additional analyses further confirmed that goal decoding was not due to other task-associated factors (Fig. 2f) and that the transition of representation from the current well to the goal well did not involve sequential activation of neighbouring positions3,10,11 (Extended Data Fig. 8). These results together suggest that the activity of OFC neurons switches their spatial representation from the starting position of the animal to the next destination before navigation onset, subsequently maintaining it throughout navigation.

We further investigated the goal representation in error trials. We applied the goal well decoder trained on correct trials to neural activity at the beginning of error trials and found that it could decode the subsequent incorrect destination of the animal as accurately as goal wells in correct trials (Extended Data Fig. 5). The activity of OFC neurons thus represents the next target well of the animal irrespective of its correctness, reflecting the animal’s decision of goal destination.

## Goal coding is orthogonal to OFC dynamics

Although our decoding analysis indicates a persistent goal representation in the OFC, firing rates of individual neurons changed markedly during navigation (Fig. 1b), implying that the encoding of goal locations in the OFC is not through the convergence of neural activity towards a point attractor but likely by dynamic coding evolving over navigation (Fig. 3a). We implemented a principal component analysis (PCA) to obtain reduced dimensions of neural population activity in the OFC. We found that activity trajectories, averaged over trials based on subsequent goals, exhibited similar dynamics while maintaining separation between each other (Fig. 3b). To understand how the goals of individual trials are embedded in activity trajectories, we applied an LDA-based dimensionality reduction approach to obtain the best projections of population activity for goal well selectivity at different times of navigation (Methods, Extended Data Fig. 9). The goal well separation was largely maintained during navigation, albeit it transformed from a compact to a distributed configuration as the animal approached the destinations (Fig. 3c). We also found that the major axis of goal well separation was nearly orthogonal to the direction of activity trajectories (Fig. 3d), suggesting that goal locations are encoded largely independently of the evolution of dynamics. We, therefore, asked whether the dynamics of individual trials could be modelled independently of their goal selectivity. First, the destination-specific activity extracted by LDA was projected back to the original neuronal dimensions, forming time courses of goal-dependent dynamics by minimising activity irrelevant to goal coding (Fig. 3e, left). We then fitted a first-order linear dynamic model on neural activity trajectories of a 2.5-s duration from motion onset to capture the global trend of dynamics irrespective of destinations (Methods). Finally, the constructed model was fed with the neural activity at motion onset, generating simulated trajectories up to 2.5 s afterwards. We found that the simulated trajectories evolved in a similar manner as the original ones (Fig. 3e, right), which was confirmed by the improvement of goal well decoding over the time course of navigation (Fig. 3f). We further found that our first-order model, trained on correct trials, was also able to simulate the neural activities on error trials, in which the activity evolved to indicate the incorrectly visited destination of the animal (Fig. 3f). OFC neurons thus encode the destination of the animal orthogonally to the ensemble dynamics that evolve, at least in part, deterministically from navigation onset.

## OFC perturbation led to a navigational error

Finally, we asked whether the activity of OFC neurons causally influences the destination of the animal. We first confirmed that pharmacogenetic inactivation of OFC neurons resulted in a significant increase in the animal’s incorrect choices of destination (Extended Data Fig. 10). To address whether the effect of perturbation is specific to ongoing navigation, we injected adeno-associated virus (AAV) encoding the excitatory opsin bReaCh-ES-eYFP12 followed by implantation of optic fibres in the bilateral OFC in three rats (Fig. 4a). We chose the frequency and power of stimulation that elicited reliable spiking in OFC neurons without affecting the motion of the animal (Extended Data Fig. 10). When laser pulses were applied for a 40-s duration, the animals started making more errors immediately after the onset of perturbation (Fig. 4b, c), which gradually subsided after the termination of laser pulses.

We then asked whether the effect of perturbation is stronger during the development of goal representation in the OFC. To explore this possibility, we applied laser pulses of 6-s duration either at motion onset or at lick onset (Extended Data Fig. 10) and found that the pulses applied at motion onset caused more errors in the subsequent navigation of the animal (Fig. 4d). This deficit was largely recovered in the immediately succeeding trial, suggesting that the impairment is not due to general loss of goal memory. The activity of the OFC is thus crucial for determining the next destination of the animal, particularly at navigation onset when a goal representation develops in its ensemble activity.

## Discussion

In this study, we identified the OFC as a brain region that represents the subsequent destination of an animal throughout navigation. Neural activity correlated with the goal-directed trajectories of animals has been previously described in the rodent hippocampus in the form of brief sequential firing among place cells5,6. However, this activity encodes not only a particular location of interest but also its nearby positions due to the sequential nature of hippocampal spatial coding13, and several studies have cast doubt on its role in determining the destination of animals8,9. Studies in human subjects, by contrast, described the activity modulation in the hippocampus that depends on the next destination instructed by a given cue14,15. It is yet unclear whether this modulation represents the goal location itself or its associated instructive cue or how it can be integrated with a hippocampal spatial map to point to the exact goal position. Goal coding in the OFC is different. The activity of OFC neurons encodes accurate well positions on the maze and exhibits an abrupt transition of encoding position from the location of the animal to its destination without relying on external sensory cues. This goal representation, developed before navigation onset, is then maintained throughout navigation without representing nearby positions or trajectories.

The prefrontal cortex has been considered a key area for navigation16. Lesioning of its ventromedial region, including the OFC, has been reported to impair accurate targeting to a destination in humans and rats17,18. The activity of the OFC is modulated during goal-directed motion19,20 or navigational planning15, and we discovered, in this study, that it can form a representation of spatial goals. The decision of navigational goal requires a choice among available positions. This is consistent with a previously suggested role for the OFC in choice decisions based on prior history of choices and subsequent outcomes21,22,23,24. The representation of spatial goal, however, needs to satisfy additional cognitive demand for navigation. Although accurate coding of spatial position requires sensory and proprioceptive inputs, the emergence of goal representation, as well as its persistence during navigation, indicate suppression of these inputs along the goal-directed journey. This resistance of goal representation to incoming inputs appears to be achieved by dynamic coding. Unlike place cells1 or grid cells4, OFC neurons discriminate positions in a dynamic manner, whereby neural activity changes during navigation while optimising the separation of encoded destinations. Dynamic coding of behavioural variables has been described in many brain regions and species25,26,27,28 and is thought to minimize interference between orthogonal neural activity subspaces28,29. Goal coding in the OFC might then be used in downstream circuits to form goal-directed trajectories, enabling animals to navigate from one location to another by relying on a cognitive map.

## Methods

### Surgery, virus injection and drive implantation

Anaesthesia was induced by isoflurane (5% induction concentration, 0.5–2% maintenance adjusted according to physiological monitoring). For analgesia, Buprenovet (buprenorphine, 0.06 mg ml−1; WdT) was administered by subcutaneous injection, followed by local intracutaneous application of either bupivacain (bupivacain hydrochloride, 0.5 mg ml−1, Jenapharm) or ropivacain (ropivacain hydrochloride, 2 mg ml−1, Fresenius Kabi) into the scalp. Rats were subsequently placed in a Kopf stereotaxic frame, and an incision was made in the scalp to expose the skull. After horizontal alignment, several holes were drilled into the skull to place anchor screws, and craniotomies were made for microdrive implantation. The microdrive was fixed to the anchor screws with dental cement, and two screws above the cerebellum were connected to the electrode’s ground. All animals received analgesics (Metacam, 2 mg ml−1 meloxicam, Boehringer Ingelheim) and antibiotics (Baytril, 25 mg ml−1 enrofloxacin, Bayer) for at least 5 d after the surgery.

For tetrode recordings, rats were unilaterally implanted with a microdrive that contained individually adjustable tetrodes made from 17-mm polyimide-coated platinum–iridium (90–10%, California Fine Wire, plated with gold to impedances below 150 kΩ at 1 kHz). The tetrode bundle consisted of 30-gauge stainless steel cannulae, soldered together in circular or rectangular shapes. The drives were implanted in the OFC of the left hemisphere in four rats with the following coordinates and bundle designs: Rat 110 with a 14-tetrode rectangular bundle (anterior–posterior (AP): 2.75–4.5 mm, medial–lateral (ML): 1.5–2.5 mm alongside the anteroposterior axis); Rat 175 with a 28-tetrode rectangular bundle (AP: 2.75–4.9 mm, ML: 1.2–2.7 mm); Rat 182 with a 42-tetrode rectangular bundle (AP: 2.75–5 mm, ML: 1.2–3.0 mm); and Rat 284 with a 42-tetrode circular bundle (AP: 2.75–5.25 mm, ML: 1.0–3.5 mm). Tetrodes were implanted at an initial depth of 2 mm dorsoventral (DV) from the dura and progressively lowered to the final depths of 2.5–4.5 mm. For the recording from neurons in area CA1 of the hippocampus (Extended Data Fig. 4), a circular bundle of 14 independently movable tetrodes was implanted in the right hemisphere (AP: −3.5 mm, ML: 3.5 mm). For the recording from neurons in the OFC in a modified T-maze task (Extended Data Fig. 5d), a silicon probe was implanted in the right hemisphere (AP: 3.5 mm, ML: 2 mm). Experiments began at least 1 week after the surgery to allow the animals to recover.

For optogenetic perturbation of OFC neurons, AAV1-CamKII-bReaCh-ES-EYFP (a gift from K. Deisseroth)12 was injected into three sites in both hemispheres of the OFC (AP, ML and DV in mm: 3, 3, 4.5; 3.5, 2.8, 4.25; 4, 2.5, 4, respectively). The AAV was injected with an infusion rate of 100 nl min−1 using a 10-ml NanoFil syringe and a 33-gauge bevelled metal needle (World Precision Instruments). After injection was completed, the needle was left in place for 10 min. The volume of 500 nl was injected at each site. Two optic fibres (FP400URT, Thorlabs) were implanted with their tips positioned at approximately 500 µm above the OFC of both hemispheres (AP: 3.5 mm, ML: 2.8 mm and DV: 3.25 mm). The optic fibre in the left hemispheres had two tetrodes attached, with their positions advanced approximately 750 µm from the fibre tip to monitor the neural activity nearby. The virus injection and the optic fibre implantation were performed in the same surgery, and experiments started at least 4 weeks after the surgery.

### Behavioural methods

Rats were trained in the 2-m-long linear maze with ten reward wells distributed at an equal distance (20 cm) between each other. The training procedure consisted of three phases. In the first phase, 100 µl of liquid reward (0.3% saccharin) was manually delivered at two specific wells alternately. Most rats learned to lick wells within 2 d of training. In the second phase, rewards were delivered only after the rat licked the correct wells, but, here, a reward was delivered immediately after the animal’s correct lick. The training duration for this phase lasted for 1–7 d, depending on the individual rats. In the final phase, a transition rule was introduced. Once the rat made at least six consecutive correct trials, rewards were delivered in a new pair of wells, which was signalled by LEDs, positioned directly underneath all the ten wells on the maze, together with the delivery of water rewards at the new well pair. These LEDs thus did not give any position information, and the new goal wells were pre-filled with water before the animal’s approach. The LEDs turned off once the animal consumed these rewards. Furthermore, the animal was required to keep licking the correct well for a fixed amount of time, defined as lick threshold, for a reward to be delivered. Of all the 18 neural recording sessions, the lick threshold was set to 2 s for 12 sessions, 1.5 s for one session and 1 s for five sessions, respectively (Extended Data Fig. 1). The lick threshold was set to 1 s for all DREADD-mediated silencing, optogenetic perturbation and modified T-maze experiments. The licking of the animal was continuously monitored by infrared sensors (Turck) equipped on individual wells, and, once the duration of the animal’s licking exceeded a pre-defined threshold, a tone was generated, followed immediately by the delivery of water with a peristaltic pump (Cole-Parmer). The details of licking behaviours are shown in Extended Data Fig. 1f–h, and the difference of lick threshold did not affect the decoding performance significantly (Extended Data Fig. 1i).

The behavioural analyses (Extended Data Fig. 1) started from the first day of phase 3 training, and each session lasted for 30 min. Neural recording sessions were carried out after the animals reached steady levels of behavioural performance (with stable prior block error rates over a period of three consecutive days—usually achieved within 15 d of training). Trials during the transition to a new well pair were discarded from the analyses. Although one of the rewarded wells in one block could be rewarded again in the immediately succeeding block, this did not affect the learning rate of the animal compared to the blocks where both goal wells were changed (Extended Data Fig. 1j). The number of wells used in each recording session was as follows (out of all ten wells): ten wells in one session, eight wells in five sessions, seven wells in eight sessions and six wells in four sessions. The position and head direction of the animal were monitored with two-coloured LEDs on the head stage at the sampling rate of 25 Hz. All the recordings were performed under a minimum-light condition (no light source in the recording room, with only weak ambient light coming from the adjacent room from computer monitors).

For optogenetic experiments, laser pulses (15-ms width at 6 Hz) were generated from a 561-nm DPSS laser unit (Dragon Laser) for a fixed amount of duration of either 40 s or 6 s. The laser power at the fibre tip in each hemisphere was 1.5 mW. The onset of laser pulses was manually triggered based on the behaviour of the animal on the task, and the time stamps of the pulses were recorded. Perturbation experiments were performed after the animals reached steady levels of behavioural performance (observed as stable prior-block error rates over 3 d; Extended Data Fig. 10).

### Histological procedures

Once the experiments were completed, the animals were deeply anaesthetized by sodium pentobarbital and perfused intracardially with saline, followed by 10% formalin solution. The brains were extracted and fixed in formalin for at least 72 h at 4 ºC. Frozen coronal sections were cut (30 µm) and stained using cresyl violet and mounted on glass slides.

### Spike sorting and cell classification

All data processes and analyses were performed with MATLAB (MathWorks). Neural signals were acquired and amplified using two or four 64-channel RHD2164 headstages (Intan Technologies), combined with an OpenEphys acquisition system with the sampling rate at 15 kHz. The signals were band-pass filtered at 0.6–6 kHz, and spikes were detected and assigned to separate clusters using Kilosort31 (https://github.com/cortex-lab/KiloSort) under the parameter settings of the spike threshold at −4 and the number of filters at 2× the total channel number. Each tetrode was independently grouped with ‘kcoords’ parameters, and the noise parameter determining the fraction of noise templates spanning across all channel groups was set to 0.01. The obtained clusters were checked and adjusted manually based on autocorrelograms and waveform characteristics in principal component space, obtaining well-isolated single units by discarding multi-unit activity or noises. Neurons with firing rates less than 0.5 Hz were excluded. Spike times were converted into firing rates, except for the analyses for the open-field experiment (Extended Data Fig. 4) and the conjunctive coding of spatial location and navigation phase (Fig. 1e, Extended Data Fig. 3). The firing rate estimation was performed by convolving spike times by a Gaussian kernel with a bandwidth of 250 ms.

### Cell classification

#### Spatial selectivity

Firing rates of a neuron were assessed at individual spatial bins of 10 cm along the linear maze across trials. For each spatial bin, random sampling was performed 100 times at various epochs of the session, either when the animal was moving (running speed >10 cm s−1) or not moving (running speed <10 cm s−1), obtaining 200 samples of firing rates per spatial bin (Extended Data Fig. 3). By concatenating these samples across the bins, we created the firing rate distributions of 200 pseudotrials along the maze and evaluated the consistency of spatial tuning by computing pairwise dot products between them. The average of the dot products was considered as a representative value of spatial tuning of the cell. For the corresponding null hypothesis, we shuffled the neural activity between spatial bins for individual pseudotrials and calculated the average dot product between them. This entire process of generation of pseudotrials, as well as calculation of the average dot products for the real and shuffled data, was repeated 1,000 times. The difference between the two distributions was quantified as follows:

$$z=\frac{{\mu }_{{\rm{r}}{\rm{e}}{\rm{a}}{\rm{l}}}-{\mu }_{{\rm{s}}{\rm{h}}{\rm{u}}{\rm{f}}{\rm{f}}{\rm{l}}{\rm{e}}{\rm{d}}}}{\sqrt{{{\sigma }_{{\rm{r}}{\rm{e}}{\rm{a}}{\rm{l}}}}^{2}+{{\sigma }_{{\rm{s}}{\rm{h}}{\rm{u}}{\rm{f}}{\rm{f}}{\rm{l}}{\rm{e}}{\rm{d}}}}^{2}}}$$

where µ and $$\sigma$$ denote the mean and standard deviation, respectively. Neurons with z-scores exceeding 2.57 (corresponding to P < 0.01 in a two-tailed distribution) were categorized as spatially selective. To consider a possible directional tuning of a neuron on the maze, we restricted the analysis to the movement direction with a higher mean firing rate for each neuron.

Among cells categorized as spatially selective, we asked whether spatial tuning of these neurons also depends on the phase of navigation. To address this question, each navigation journey was discretized into eight equidistant positional fractions, and the firing rates at individual fractions or phases were assessed together with the absolute positions of the animalon the maze, by forming a firing rate matrix of phase and position (for example, Fig. 1e, Extended Data Fig. 3). To assess whether a neuron encodes phase and position conjunctively, the firing rate matrix was mean centred (the mean navigation-phase-dependent firing rate was subtracted from each column) and assessed for bias in firing rates relative to navigation phases. This bias was estimated by calculating the Frobenius norm of the mean centred matrix, which is defined as the square root of the sum of squared matrix elements. The statistical significance was assessed by calculating a distribution of Frobenius norms from 1,000 shuffled datasets among eight navigation phases. Neurons with the Frobenius norms exceeding the 95th percentile of the shuffled distribution were considered to encode position and navigation phase conjunctively.

### Two-dimensional firing rates and spatial information calculation

The arena (120 × 120 cm for OFC or 100 × 100 cm for CA1) was divided into 5 × 5-cm spatial bins, and the number of spikes and the overall time spent within individual bins during motion (>7.5 cm s−1) was calculated. The firing rate at each bin was estimated using an adaptive smoothing technique that optimizes the tradeoff between spatial resolution and sampling error32. In brief, for each spatial bin, an expanding circle was constructed until the following criterion was satisfied:

$$r > \frac{\alpha }{{n}_{{\rm{o}}{\rm{c}}{\rm{c}}}\sqrt{{n}_{{\rm{s}}{\rm{p}}{\rm{i}}{\rm{k}}{\rm{e}}{\rm{s}}}}}\,$$

where $$r$$ is the radius of the circle in bins, $${n}_{{\rm{o}}{\rm{c}}{\rm{c}}}$$ is the number of samples occupied within the radius $$r$$, $${n}_{{\rm{s}}{\rm{p}}{\rm{i}}{\rm{k}}{\rm{e}}{\rm{s}}}$$ is the number of spikes within the radius and $$\alpha$$ is a constant set to 200,000. Our positional sampling was interpolated to 1-ms resolution. Hence, $${n}_{{\rm{o}}{\rm{c}}{\rm{c}}}$$ was the number of milliseconds the animal spent within a circle of radius r centred at the bin. Firing rate (spikes per second) in a given bin was calculated as 1,000 × $${n}_{{\rm{s}}{\rm{p}}{\rm{i}}{\rm{k}}{\rm{e}}{\rm{s}}}$$/ $${n}_{{\rm{o}}{\rm{c}}{\rm{c}}}$$. Spatial information for individual neurons in the OFC and CA1 was obtained from the rate maps using the following formula52:

$${\rm{S}}{\rm{I}}=\mathop{\sum }\limits_{i=1}^{N}{p}_{i}\frac{{\lambda }_{i}}{\lambda }{{\rm{l}}{\rm{o}}{\rm{g}}}_{2}\frac{{\lambda }_{i}}{\lambda }$$

where $$N$$ is the total number of spatial bins, $${p}_{i}$$ is the probability of occupying the ith bin, $${\lambda }_{i}$$ is the firing rate in the ith bin and $$\lambda$$ is the overall average firing rate of the neuron. The same formula was used to calculate spatial information of OFC neurons on the linear track.

### Well selectivity

The neuron’s selectivity for goal well was assessed based on its firing rates for each of 100-ms bins in the time range of −0.5 s to 2 s relative to motion onset of navigation, whereas the selectivity for the animal’s licking well (or current well) was assessed from its firing rates in the time range of −0.5 s to 2 s relative to the animal’s lick onset. To account for potential confounds of direction-specific firing, we used a two-way ANOVA with the well identity and the direction of the animal’s approach as two independent variables and the firing rate as a dependent measure. We used the ‘anovan’ function of MATLAB and used the type-II sum of squares for individual variables. Based on the P values for the well identity across all time points, we assessed the neuron’s selectivity to goal well and current well independently (a neuron can be categorized as both goal well and current well selective).

For the decoding analysis in Figs. 1, 2, we pre-selected neurons for a decoder based on a criterion of P < 0.05 at least in one of the time points in the range of −0.5 s to 2 s relative to the onset of either motion or licking. This procedure excluded neurons that were non-selective for the well identity, reducing the number of uninformative dimensions. For the visualization of neural activity trajectories in PCA-based reduced dimensions in Fig. 3b, we used a more stringent criterion of P values less than 0.01 over at least five consecutive time bins (500 ms) for the goal well selectivity.

Although the well selectivity was separately assessed for the current well or the goal well, we found that 83.03 ± 1.37% of the goal-well-selective neurons (by the criterion of P < 0.05) were also current well selective, and 69.38 ± 2.55% of the current-well-selective neurons were also goal well selective, suggesting overlaps of the two populations (Extended Data Fig. 3).

### Decoding analysis

We applied a decoder based on LDA that assigns individual class probabilities by setting class boundaries between multivariate Gaussian distributions fitted to data. In brief, a dataset from each recording session was divided into a training dataset and a test dataset, and a decoder was constructed from the training dataset by employing multiclass one-versus-one LDA using the ‘fitcecoc’ function of MATLAB with a regularization factor of 0.5 to reduce overfitting. We used uniform priors for all decoders. Next, we used the ‘predict’ function of MATLAB to obtain decoding probabilities of individual wells from the test dataset. This function uses an algorithm described by Hastie and Tibshirani33 to compute posterior probabilities from the pairwise conditional probabilities obtained using multiclass one-versus-one decoders. The trials during transition phases to new well combinations were excluded, and only correct trials were used for the decoder’s training. The unvisited wells in each session were excluded in the calculations of both decoding performance and its corresponding chance level. A population of neurons used for a respective decoding analysis for current well or goal well were pre-selected based on their well selectivity (using the method described in the previous section) because this procedure improved a decoder’s performance with better generalization to test data (Extended Data Fig. 7c, h), which is likely due to the reduction of unnecessary dimensions from uninformative neurons. For cross-validation of decoding performance, the training data of a decoder comprised all trials except the trial tested with the decoder as well as the one prior to this trial (that is, leave-two-out cross-validation). Additional details specific to each analysis are described in the following sections.

### Current well decoding

In the decoding analysis of the animal’s licking well (Fig. 1h, i, Extended Data Fig. 5), the data used for the training of a decoder comprised firing rate vectors of neurons (pre-selected based on their current well selectivity) at individual 100-ms bins in the range of −0.5 s to 3 s relative to lick onset, resulting in 36 rate vectors for the class label of licking well. This relatively long range of data (−0.5 s to 3 s) was chosen for a better generalization of well decoding over licking time (Extended Data Fig. 5j). Then, by using this decoder, we obtained the decoding probabilities of individual wells for all the 100-ms bins from −3 s to 6 s relative to lick onset (Fig. 1h, left) or from the beginning (motion onset) to the end (lick onset) of navigation (Fig. 1h, middle). For computing the decoding probability of the well that was run over by the animal, we restricted the analysis on trials when the animal’s running speed at the well exceeded 20 cm s−1 in a 500-ms window.

As a control analysis of decoding (Fig. 1h), we tested whether the well decoding depends on the direction of the animal’s approach (Fig. 1i, left). We trained a decoder from the data in which particular wells were approached only from one side of the linear maze and then tested the decoding performance when the animal approached the same wells from the other direction. We ensured that the decoder was trained with more than ten trials in which the target well was approached from one direction.

As another control analysis (Fig. 1i, right), we tested the possibility that the well decoding might depend on its paired wells in individual trial blocks. For this aim, we assessed the decoding performance of the wells when they are approached from newly paired wells. We trained a decoder with the data that excluded a trial block of a particular well combination but included the blocks in which the same wells were approached from other paired wells. We then tested the decoding performance of the wells approached from the pairs not used in the decoder’s training. The motivation behind this analysis is that, if the well identity is encoded by OFC neurons based on its spatial location, it should be decoded irrespective of its paired wells (or the animal’s start positions). The decoding was performed only when the target well was approached by the animal more than ten times in the training dataset.

### Goal well decoding

For the decoding of the animal’s goal well, we constructed a decoder based on the assumption that the goal well should be represented with the same pattern of neural activity between the beginning and the end of navigation (Fig. 2b). We thus trained the decoder from the data concatenated across two time ranges around motion onset and lick onset. We found that a dimensionality reduction procedure of the neural activity by PCA improved the subsequent decoding performance (Extended Data Fig. 7), likely because this decoding strategy entailed the construction of high-dimensional hyperplanes by concatenating two different time phases of the neural activity, and a dimensionality reduction procedure helped to constrain the hyperplane in a small number of crucial dimensions, thereby improving generalization of the decoder. Before implementing PCA, we used a soft-normalization technique described by Churchland et al.34 to adjust the range of firing rates across the neural population that were pre-selected based on their goal well selectivity (with the method described in the previous section). We then selected PCA dimensions that explain 85% of the data variance across the entire time duration of a recording session, obtaining the neural population activity in reduced dimensions. For each trial, vectors of the population activity in 100-ms bins were concatenated in the time range of −0.5 s to 0.5 s relative to motion onset, together with that of −0.5 s to 1 s relative to the subsequent lick onset at the destination, forming 27 vectors with the class label of goal well. These time ranges were chosen to capture the neural dynamics from the beginning to the end of navigation (Extended Data Fig. 7f).

The decoding was performed on the test dataset in the time range of either from −2 s to 2.5 s relative to motion onset (Fig. 2d, left) or from 1 s before motion onset to the subsequent lick onset at the goal well (Fig. 2d, right). The trials in which the goal wells were immediately adjacent to the animal’s current wells were excluded from the analysis.

### Chance level calculation

We tested a possibility that the goal well decoding could be explained by the neural activity encoding a task-relevant parameter other than the spatial position of goal well. We calculated five chance levels for goal well decoding, each of which corresponds to a specific null hypothesis (Extended Data Fig. 5).

We first tested the possibility that the goal well was not decoded based on its own identity. This possibility was tested by assessing the decoding performance when the well identities were exchanged by shuffling the class labels of training datasets.

We next asked whether the observed goal decoding can be explained by the animal’s running direction, speed, acceleration or trajectory length. To test these null hypotheses, we divided the training dataset into multiple groups. For testing the effect of running direction, we split the trials into two groups, each containing trials with the same running direction on the linear maze. Similarly, for testing the effect of trajectory distance, we divided the trials into groups of different trajectory lengths measured in terms of the number of wells between animal’s current and goal location; for testing the effect of running speed or acceleration, the trials were categorized into two groups (split across the median; analysis with quartile splits was also performed in Extended Data Fig. 5) according to the animal’s running speed or acceleration at motion onset. We then trained a decoder based on the training dataset with the class labels shuffled within individual groups. This procedure provides an estimate of how much well decoding can be possible with the neural activity difference resulting from a given behavioural parameter (without using precise well labelling for the decoder’s training), serving as an additional chance level.

The chance level calculation across all the sessions was implemented as follows. We first performed the decoding of all trials in a session using a decoder with shuffled class labels (as described above) and took the mean of decoding probability of the goal well. This process was repeated 100 times, resulting in a shuffled goal decoding distribution in each session. Examples of goal decoding from individual sessions and their corresponding chance levels (defined as 95th percentile of the corresponding shuffled distribution) are included in Extended Data Fig. 5. The subsequent computation of chance level across all the 18 sessions can intuitively be considered as a procedure to obtain a distribution of the means of 18 independent random variables. We randomly chose one sample from each of the 18 shuffled goal decoding distributions (with 100 samples each) and took their average, obtaining a representative of the session-averaged shuffled decoding probability of the goal well. This procedure was repeated 1,000 times to obtain a distribution of the means of shuffled goal decoding probability across the sessions. The chance level was set at the 95th percentile of the distribution.

The individual chance levels are depicted in Fig. 2f. To calculate the significance level of the decoding analysis in Fig. 2d, we used an aggregate chance level by taking the maximum of the five chance levels at each time point. For the decoding analysis of the animal’s licking well (Fig. 1h, i), we used only two null hypotheses by excluding the ones for the animal’s running speed, acceleration and trajectory length, as they are relevant only when the decoding includes a navigation phase.

### Supervised dimensionality reduction with LDA

LDA was applied for a dimensionality reduction procedure in Figs. 2b, 3c, 3e, f and Extended Data Fig. 9. In contrast to an LDA-based decoder that calculates class boundaries (described in the previous section), the LDA-based dimensionality reduction technique searches for a subspace onto which the projected data exhibit the best separation between categories. The detailed procedures of data matrix manipulations are described step-by-step as follows.

LDA is a supervised linear dimension reduction technique that computes a subspace with the maximum linear separability of data according to class labels. Formally, for C classes, LDA computes at most C − 1 eigenvectors corresponding to the eigenvalues of

$${({{S}}_{{\rm{w}}})}^{-1}{{S}}_{{\rm{b}}}$$

where $${{S}}_{{\rm{w}}}$$ is the average within-class covariance matrix, and $${{S}}_{{\rm{b}}}$$ is the covariance matrix of class means relative to the mean of all classes. Projecting the data on the subspace constructed by these eigenvectors results in the data with reduced dimensions by maintaining the maximum linear separability between classes. A subspace was calculated at each time point without concatenating the data over time for the analyses in Fig. 3c, e, f and Extended Data Fig. 9.

In the analysis in Fig. 2b, we projected the neural activity on the first LDA dimension (corresponding to the largest eigenvalue) to show the target-well-specific activity during the navigation. To find the common LDA dimensions across navigation, we used the neural activity data around the times of both motion and lick onsets of navigation (the same approach as the goal well decoding described in the previous section).

This procedure was also used to construct goal-well-specific neural trajectories by reducing goal-irrelevant activity (Fig. 3c, e, f, Extended Data Fig. 9). First, we carried out a general de-noising step by projecting neural activity to PCA dimensions that explain 85% of data variance (identical to the step described in the goal decoding section). Next, we applied the LDA-based dimensionality reduction procedure at individual time points of navigation. However, due to high-dimensional input data with a small sample number, LDA might overfit the subspaces resulting in poor generalization. We thus took two approaches to prevent this problem: regularization and cross-validation. For the regularization, we calculated the eigenvectors of the following matrix with a regularization factor:

$${({{S}}_{{\rm{w}}}+\lambda {I})}^{-1}\times {{S}}_{{\rm{b}}}$$

where $${I}$$ is the identity matrix, and $$\lambda$$ is the regularization factor set to 1 (different values of $$\lambda$$ are tested in Extended Data Fig. 9). For the cross-validation procedure, we estimated LDA subspaces at individual time points of a particular trial from the training dataset excluding this trial (that is, leave-one-out cross-validation).

Because this procedure generated different subspaces (or axes) for individual trials, we projected the activity in the subspaces back to the original neural space common to all trials. For example, supposing that the data comprised d-dimensional neural data with C classes, the processed neural activity at a given time point of a trial was computed by using the following formula:

$${{\bf{x}}}_{{\rm{p}}{\rm{r}}{\rm{o}}{\rm{c}}}=({{\bf{x}}}_{{\rm{o}}{\rm{r}}{\rm{i}}{\rm{g}}}-{{\boldsymbol{\mu }}}_{{\rm{t}}{\rm{r}}{\rm{a}}{\rm{i}}{\rm{n}}})\times {M}{{M}}^{+}+{{\boldsymbol{\mu }}}_{{\rm{t}}{\rm{r}}{\rm{a}}{\rm{i}}{\rm{n}}}$$

where $${{\bf{x}}}_{{\rm{o}}{\rm{r}}{\rm{i}}{\rm{g}}}$$ is a 1 × d vector of the original neural population activity, $${{\boldsymbol{\mu }}}_{{\rm{t}}{\rm{r}}{\rm{a}}{\rm{i}}{\rm{n}}}$$ is a 1 × d vector of the mean neural activity of the training dataset, $${M}$$ is a d × (C  1) matrix representing a transformation to the subspace computed by the regularized LDA based on the training dataset, $${{M}}^{+}$$ is the pseudo inverse of $${M}$$ and $${{\bf{x}}}_{{\rm{p}}{\rm{r}}{\rm{o}}{\rm{c}}}$$ is a 1 × d vector of the processed neural activity. This entire procedure resulted in de-noising of neural signals according to LDA-based classification while maintaining the number of input dimensions (illustrated with examples in Extended Data Fig. 9).

### Linear modelling of neural dynamics

A regularized first-order linear dynamic model was used to simulate the neural activity dynamics during navigation (Fig. 3e, f). Modelling of a linear dynamic system can be considered a multiple linear regression problem in the following form:

$$\dot{{X}}={X}{A}$$

in which the matrix $${A}$$ transforms the activity vector to the corresponding velocity vector. The regularized matrix $${A}$$ can be obtained with the following calculation:

$${A}={({{X}}^{T}{X}+\mu {I})}^{-1}{{X}}^{T}\dot{{X}}$$

where $${X}$$ is a data matrix with the activity at different times or trials in the row and the neuronal identities in the column, $$\dot{{X}}$$ is the time derivative of $${X}$$, $${\boldsymbol{\mu }}$$ is a regularization factor set to 5 (different values of $${\boldsymbol{\mu }}$$ were tested in Extended Data Fig. 9) and $${I}$$ is the identity matrix. For example, in the dataset with p trials, T time bins and d neurons, the matrix $${X}$$ is created by concatenating all × T data points, resulting in a pT × d matrix. Time-derivative components $$\dot{{{\bf{x}}}_{t}}$$ in the matrix $$\dot{{X}}$$ were computed as follows:

$$\dot{{{\bf{x}}}_{t}}={({\bf{x}}}_{t+1}-{{\bf{x}}}_{t-1})/2$$

where xt + 1 and xt  1 are the activity vectors at the time step of + 1 and − 1, respectively. The neural data used for model construction was pre-processed with the LDA-based de-noising approach described in the previous section. To account for non-linear neural trajectories with linear models, we fitted a linear dynamic model at every 500 ms of the neural data. Individual trajectories were simulated using the following equation in an iterative form:

$${{\bf{x}}}_{t}^{{\rm{s}}{\rm{i}}{\rm{m}}}={{\bf{x}}}_{t-1}^{{\rm{s}}{\rm{i}}{\rm{m}}}+{{\bf{x}}}_{t-1}^{{\rm{s}}{\rm{i}}{\rm{m}}}\times {A}$$

starting with the neural activity at motion onset:

$${{\bf{x}}}_{1}^{{\rm{s}}{\rm{i}}{\rm{m}}}={{\bf{x}}}_{{\rm{m}}{\rm{o}}{\rm{t}}{\rm{i}}{\rm{o}}{\rm{n}}\,{\rm{o}}{\rm{n}}{\rm{s}}{\rm{e}}{\rm{t}}}^{{\rm{d}}{\rm{a}}{\rm{t}}{\rm{a}}}+{{\bf{x}}}_{{\rm{m}}{\rm{o}}{\rm{t}}{\rm{i}}{\rm{o}}{\rm{n}}\,{\rm{o}}{\rm{n}}{\rm{s}}{\rm{e}}{\rm{t}}}^{{\rm{d}}{\rm{a}}{\rm{t}}{\rm{a}}}\times {A}$$

where $${{\bf{x}}}_{t}^{{\rm{s}}{\rm{i}}{\rm{m}}}$$ is a simulated neural activity vector at time t (relative to motion onset), and $${{\bf{x}}}_{{\rm{m}}{\rm{o}}{\rm{t}}{\rm{i}}{\rm{o}}{\rm{n}}\,{\rm{o}}{\rm{n}}{\rm{s}}{\rm{e}}{\rm{t}}}^{{\rm{d}}{\rm{a}}{\rm{t}}{\rm{a}}}$$ is the neural activity population vector at motion onset. We took a leave-one-out cross-validation strategy, in which all the parameters for modelling, de-noising and dimensionality reduction were obtained from the training dataset that excluded a test trial simulated by a model.

Goal decoding of the original and the simulated neural trajectories (Fig. 3f) was performed with the LDA-based decoding procedure described in the previous section, except that the decoders here were trained based on the de-noised neural activity from −0.5 s to 0.5 s relative to lick onset at the goal well. This narrow duration of 1 s was chosen to capture a snapshot of goal representation at lick onset without generalising over time.

### Statistical procedures

All statistical tests were two sided and non-parametric unless stated otherwise.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.