Population dynamics of head-direction neurons during drift and reorientation

The head direction (HD) system functions as the brain’s internal compass1,2, classically formalized as a one-dimensional ring attractor network3,4. In contrast to a globally consistent magnetic compass, the HD system does not have a universal reference frame. Instead, it anchors to local cues, maintaining a stable offset when cues rotate5–8 and drifting in the absence of referents5,8–10. However, questions about the mechanisms that underlie anchoring and drift remain unresolved and are best addressed at the population level. For example, the extent to which the one-dimensional description of population activity holds under conditions of reorientation and drift is unclear. Here we performed population recordings of thalamic HD cells using calcium imaging during controlled rotations of a visual landmark. Across experiments, population activity varied along a second dimension, which we refer to as network gain, especially under circumstances of cue conflict and ambiguity. Activity along this dimension predicted realignment and drift dynamics, including the speed of network realignment. In the dark, network gain maintained a ‘memory trace’ of the previously displayed landmark. Further experiments demonstrated that the HD network returned to its baseline orientation after brief, but not longer, exposures to a rotated cue. This experience dependence suggests that memory of previous associations between HD neurons and allocentric cues is maintained and influences the internal HD representation. Building on these results, we show that continuous rotation of a visual landmark induced rotation of the HD representation that persisted in darkness, demonstrating experience-dependent recalibration of the HD system. Finally, we propose a computational model to formalize how the neural compass flexibly adapts to changing environmental cues to maintain a reliable representation of HD. These results challenge classical one-dimensional interpretations of the HD system and provide insights into the interactions between this system and the cues to which it anchors.

The head direction (HD) system functions as the brain's internal compass 1,2 , classically formalized as a one-dimensional ring attractor network 3,4 . In contrast to a globally consistent magnetic compass, the HD system does not have a universal reference frame. Instead, it anchors to local cues, maintaining a stable offset when cues rotate [5][6][7][8] and drifting in the absence of referents 5,[8][9][10] . However, questions about the mechanisms that underlie anchoring and drift remain unresolved and are best addressed at the population level. For example, the extent to which the one-dimensional description of population activity holds under conditions of reorientation and drift is unclear. Here we performed population recordings of thalamic HD cells using calcium imaging during controlled rotations of a visual landmark. Across experiments, population activity varied along a second dimension, which we refer to as network gain, especially under circumstances of cue conflict and ambiguity. Activity along this dimension predicted realignment and drift dynamics, including the speed of network realignment. In the dark, network gain maintained a 'memory trace' of the previously displayed landmark. Further experiments demonstrated that the HD network returned to its baseline orientation after brief, but not longer, exposures to a rotated cue. This experience dependence suggests that memory of previous associations between HD neurons and allocentric cues is maintained and influences the internal HD representation. Building on these results, we show that continuous rotation of a visual landmark induced rotation of the HD representation that persisted in darkness, demonstrating experience-dependent recalibration of the HD system. Finally, we propose a computational model to formalize how the neural compass flexibly adapts to changing environmental cues to maintain a reliable representation of HD. These results challenge classical one-dimensional interpretations of the HD system and provide insights into the interactions between this system and the cues to which it anchors.
The HD system, commonly referred to as the neural compass, underlies a navigator's sense of direction 1,2,[11][12][13][14] . In contrast to a traditional compass, the orientation of the HD system is anchored to local environmental cues 8,15,16 . Our understanding of the mechanisms that support the ability of the HD network to align with specific cues and maintain a consistent sense of direction remains limited. Recent research has shown that substantial variability in HD cell activity during sleep cannot be explained by a singular angular dimension 17,18 . This observation challenges the classical view of the internal HD representation as a unidimensional (that is, angular) construct and motivates further investigation of its complexity, from a population perspective, to understand whether extra dimensions are needed to fully capture how the system adapts to unstable conditions (that is, changing, missing and/ or conflicting sensory information) in wakefulness. While updating the internal HD representation requires integration of information from multiple sensory modalities, manipulations of visual cues alone are sufficient to reorient this representation 6,7,[19][20][21] . The visual input exerts a dominant influence on the HD network alignment, probably through a feedback correction that calibrates the integration of angular movements 22 . Computational models of the HD network suggest that plasticity mediates the integration of visual information within the network [23][24][25] , confirmed recently in fruit flies 26,27 but not yet in mammals. Here we characterize the thalamic HD network response to visual manipulations, yielding new insights into the mechanisms that underlie anchoring and calibration of this representation. Furthermore, we provide a network model to assess the plausibility of synaptic plasticity as a mechanism to explain the observed variability in the system's response to externally controlled changes in visual information.
We performed calcium imaging of the anterodorsal thalamic nucleus (ADN) in three mice using a head-mounted endoscope [28][29][30] . Our recordings (n = 102 20 min sessions, with baseline periods of 3, 5 or 10 min depending on the experiment) enabled us to monitor up to 255 ADN cells simultaneously as mice freely explored a small elevated circular platform inside a larger enclosed chamber (Fig. 1a-g and Extended Data Fig. 1a-c). This chamber was composed of a 360° circular LED screen covered by an opaque dome (Fig. 1a). During baseline recordings at the start of each session, we displayed a polarizing vertical white stripe on the otherwise black LED screen. All subsequent testing involved the manipulation of this visual cue.
Calcium imaging data were motion-corrected and spiking activity was inferred from extracted fluorescent transients 31,32 (Extended Data Fig. 1d). Baseline recordings revealed HD cells with preferred firing   Article directions (PFDs) that tiled the full 360° horizontal plane ( Fig. 1g and Extended Data Fig. 1c). Consistent with previous research, ADN neurons were tuned to specific HDs 1,33,34 albeit with higher proportions (Fig. 1c-g and Extended Data Figs. 1 and 2). HD tuning was stable in the presence of visual cues 34 (Fig. 1c and Extended Data Figs. 1d and 2), and exhibited anticipatory firing 35 (Extended Data Fig. 3a). In contrast to the HD system in the central complex of Drosophila 36 , we did not observe topographic organization ( Fig. 1d and Extended Data Fig. 1b).
To infer the internal HD representation (referred to here as internal HD), we trained a Bayesian decoder 37 to estimate the animal's HD on the basis of the baseline training data (Fig. 1i). This decoder accurately recovered the measured HD in stable experimental conditions (median absolute error (MAE) of test data = 5.96°; Fig. 1j).
To visualize the low-dimensional structure of the HD representation, we developed a method to project large ensemble recordings onto a two-dimensional (2D) polar state space (Methods). We trained a deep neural network on the measured head direction while allowing an untrained latent variable to capture variability in the neural data that cannot be explained by changes in the head direction alone. This inferred latent variable constitutes the radial component in the 2D polar state space (that is, secondary dimension). When applied to the baseline data, we obtained a ring-like structure (Fig. 1h), reminiscent of ring attractor models and previous analyses 3,17,38 . This further confirms that the internal HD representation is approximately unidimensional in stable conditions.

Network gain covaries with reset dynamics
To investigate how HD network dynamics enable reorientation, we recorded the HD network during a cue-shift paradigm. After a baseline recording, the cue was removed for 2 min (darkness) and then reappeared at a 90° shifted position for 2 min. We repeated this sequence four times per recording session (Fig. 2a).
This manipulation resulted in predictable changes in the HD network's patterns of activity. To characterize these dynamics, we defined an 'offset' as the mismatch between the measured and decoded HD (see the 'Analysis of drift' section of the Methods). Tracking this offset, we observed a rotational response after cue reappearance that matched the cue shift. We refer to this phenomenon as a reset (Fig. 2a). Notably, resetting events were not homogeneous, as they occurred across a wide range of angles and speeds (Fig. 2b).
Cue shifts induced significant changes in the overall network activity. We observed modulation in the amplitude of the bump of activity (see the 'Reconstruction of the bump of activity' section of the Methods), which coincided with changes in the radius of the latent space (Fig. 2c,d). Intuitively, allowing the internal HD representation to occupy a 2D polar state space makes the distance between any two given angles θ 1 and θ 2 a function of the radial component, which led us to hypothesize that changes in radius not only reflected changes in the overall population activity but would also correlate with changes in the speed of reset (Extended Data Fig. 4). To quantify this, we computed the total population activity normalized to the baseline activity, a measure that we refer to as the network gain (Extended Data Fig. 5; see the 'Calculation of network gain' section of the Methods). State-space radius was highly correlated with network gain (Fig. 2d,e), indicating that gain can be used as an interpretable measure of the radial component. To better understand the relationship between resetting events and network gain, we first analysed the 90°-centred reset range (that is, [70:110]° range). We found that the speed of HD-network reorientation, or 'reset speed', was anticorrelated with network gain (Fig. 2g-i). Separation of the resetting events into two groups (Fig. 2g) revealed that fast resets were associated with a substantial reduction in network gain shortly after cue reappearance, whereas slow resets exhibited a smaller reduction in gain ( Fig. 2h and Extended Data Fig. 6a). In all cases, resetting events took the form of a continuous rotation of the HD representation from an initial orientation to the reset direction, passing by all intermediate angles without the appearance of secondary bumps of population activity (Extended Data Fig. 7a). This reset was slower than what has previously been reported 6 , possibly due to differences in behaviour, habituation and/or the geometry of the testing set-up. Our modelling results show that an attractor network model that incorporated network gain replicated these dynamics with 71% accuracy in classifying fast versus slow resets (Fig. 2j,k and Extended Data Figs. 7b-d).
Behavioural differences as measured by the head angular velocity before and after cue onset could not explain the sharp decrease in gain amplitudes (Extended Data Fig. 8a). However, reduced head angular velocity immediately preceding cue events was predictive of fast resets, and vice versa (Extended Data Fig. 8b,c).
Resetting events also varied in the angular difference between their initial and stabilizing orientations. We grouped resets by the distance between pre-cue offset and the offset after stabilization (Extended Data Figs. 6b and 9a), and found that network gain was anticorrelated with reset range (Extended Data Figs. 6b and 9b,c). This relationship was independent of reset speed (Extended Data Fig. 9a), suggesting that network gain is independently modulated by the estimated error between the internal representation and the actual location of the visual cue.
We also detected a rapid spike in gain at the cue onset that was largest in short-range resets (Extended Data Fig. 9d). This may reflect visual inputs, but further investigation was limited by the temporal resolution of calcium imaging.

HD neurons maintain a trace of the cue
The HD network drifts in the absence of visual cues 5,[8][9][10] . We hypothesized that network gain would decrease once the visual cue is removed due to decreased sensory input. This would bring the HD system to a lower energy state and cause the internal representation to become prone to spontaneous shifts because of the decrease in signal-to-noise ratio of neural activity. During all darkness epochs (D1 to D4), we observed an increase in drift relative to baseline (Fig. 3a,b and Extended Data Fig. 10a), which coincided with an abrupt decrease in the network gain after cue removal (Fig. 3c). Notably, changes in the network gain were dependent on the internal HD during darkness. When the internal HD pointed towards the internal location of the visual cue (0°), the reduction in gain was minimal; deviations from this direction resulted in more pronounced gain decreases (Fig. 3d,e). This gain profile persisted across all darkness periods with the difference between peak and trough increasing from the first to last darkness epoch (Extended Data Fig. 10d,e). Animal behaviour did not significantly affect the gain landscape, except that gain amplitude increased with the absolute head angular velocity ( Fig. 3e and Extended Data Fig. 11), consistent with previous observations 36, 39 . This suggests that the HD network maintains a 'memory trace' of salient visual cues.
Drift patterns were not homogenous across the four darkness epochs (Extended Data Fig. 10a-c). During D1, drift fluctuated around the baseline orientation with no directional bias. By contrast, drift in D2, D3 and D4 exhibited directional biases dependent on the baseline orientation and previous cue location. During D2, drift diverged from its reset orientation towards its baseline orientation, counter to the rotation implied by the previous cue shift. During D3 and D4, drift rotated towards the baseline orientation but consistent with the direction implied by the previous cue shifts. These observations indicate that drift depends on previous visual experience. These predictable drift biases after exposure to the changing visual reference frame (D2 to D4) appeared to consistently bring the HD network closer to its original configuration (that is, the baseline state). Consecutive shifts of the visual cue in one direction further biased the drift in that direction (Extended Data Fig. 10b,c). These results suggest that both the stable allocentric and dynamic visual reference frames exert a persistent influence on the network orientation.

Drift patterns are experience-dependent
The presentation of a rotated visual cue for 2 min was sufficient to cause a representational shift and override the influence of non-visual cues (self-motion, olfactory and so on). Yet, we observed a tendency of the network to rotate back towards the initial configuration, that is, revert to baseline, during darkness. We hypothesized that this could implicate plastic processes through which a 'memory' of baseline state exerts influence on the internal HD representation. If true, drift dynamics might depend on the duration of exposure to the shifted-cue context.

Article
To test this, we limited the display of the rotated visual cue to 20 s (±90° from the baseline; Fig. 4a). These shortened cue events elicited resets followed by reversions towards baseline during darkness (Fig. 4a-c). However, in comparison to the 2 min experiment (specifically D2, which was similarly preceded by a ±90° rotated cue event), reversion was much stronger after the presentation of a 20 s visual cue (Fig. 4c). The attraction of the network to its baseline state was further demonstrated through vector field analysis ( Fig. 4d; see the 'Vector field analysis' section of the Methods). These results indicate that the internal representation of the baseline allocentric reference frame is not entirely lost after a reset and can still influence the HD network in darkness, depending on the duration of experience within the competing reset reference frame context. Addition of Hebbian learning to our model shows that, indeed, HD neurons could form new associations with the unchanged allocentric cues, depending on the duration of exposure to the reset context. Given enough time, the synaptic strength of these new associations increased while old associations were depleted, resulting in a new steady state. In this scenario, our simulations of the internal HD representation showed limited baseline attraction. By contrast, synaptic weights did not change significantly after shorter (20 s) cue exposures, and baseline associations remained dominant, resulting in strong reversions (Extended Data Fig. 12). Next, we examined potential competition between baseline and reset reference frames by comparing network gain patterns after long and short shifted-cue exposures. While the gain landscape during D2 exhibited a single peak at 0° in the internal HD (corresponding to the shifted cue orientation after reset), the gain landscape during darkness after the 20 s cue events exhibited additional peaks at ±90° (Fig. 4e,f and Extended Data Fig. 13). Notably, these peaks matched the alternating ±90° cue structure of the experimental design, suggesting the coexistence of associations between HD neurons and allocentric cues from multiple visual contexts. These differences provide additional evidence of time-dependent effects of visual experience.

Cue rotation causes persistent drift bias
In the 2 min cue-shift experiment, visual information provided a dominant polarizing cue to reset the HD system. In some cases, resets were slow (>30 s), indicating that non-visual cues competed with visual information to stabilize the network. The 20 s cue-shift experiment provided further evidence that the baseline reference frame maintains a persistent influence on the HD network. To better understand the dynamics of this competition, we tested whether visual information could drive resetting when in continuous conflict with non-visual information, including self-motion cues. We recorded HD cell populations during presentation of a slow rotating visual cue (1.5 or 3.0° per s) for 7 min (Fig. 5a,b). In all cases, and for both speeds, the HD network was continuously updated by the visual cue (Fig. 5a-c), highlighting the dominant effect that the visual input has over all other inputs in controlling the HD system. Notably, the HD network continued to rotate in the same direction and at a similar speed when the rotating cue was turned off (Fig. 5a,b,d and Extended Data Fig. 14a; see the 'Analysis of cue-rotation sessions' section of the Methods). This persistent bias was replicated in our network model by adding a recalibration circuit to asymmetrically change the strength of vestibular input through visual feedback (Extended Data Fig. 14b-e and Supplementary Information). We also observed an attraction to the baseline internal representation, similar to our prior experiments. The system started to stabilize once the internal HD representation approached the baseline state ( Fig. 5e and Extended Data Fig. 14a). This phenomenon could also be reproduced in our model in which, after 7 min of cue rotation, no strong new associations between HD neurons and allocentric cues could emerge to form a new steady state. Instead, baseline associations remained dominant, albeit with a significant weight decay in the synaptic matrix (Extended Data Fig. 14d). Together, these results indicate that experience with dynamic reference frames can bias the HD network and implicate asymmetric recalibration of vestibular input integration within the HD network as a potential source of this bias.

Discussion
We combined large population recordings of ADN neurons with visual cue manipulations to characterize the population dynamics of the mammalian HD system. Controlled manipulations of a visual cue induced global fluctuations in network activity captured by a measure that we termed network gain. Network gain represents a functional dimension in the internal HD representation, in addition to its classically appreciated angular dimension. Reorientations, in response to visual cue shifts, were associated with a transient decrease in network gain, the magnitude of which was correlated with the speed of the HD network's realignment with the rotated visual reference frame (that is, reset). By extending a standard model of the HD system 40 to incorporate network gain, we were able to predict the speed of the reset response. These results suggest that modulation of network gain provides the HD system with a mechanism to rapidly reorient. Network gain also reflected the past experience of the systema polarizing visual landmark induced persistent distortions in the network gain profile, forming a memory trace in darkness. These network gain patterns were dependent on the duration of the previous shifted-cue exposure, suggestive of plastic processes in the HD network. Evidence for plasticity was further strengthened by experience-dependent drift behaviours in darkness periods. Incorporating Hebbian plasticity into our network model replicated these observations. Finally, the HD system anchored to a continuously rotating visual cue and continued to rotate after the cue was removed. A model of asymmetric vestibular input recalibration reproduced these results, suggesting that the integration of vestibular information within the HD network is also experience dependent.
Network gain reduction during realignment of the HD network suggests that a feedback signal downstream of ADN provides global inhibition to the network. Similar ideas have been proposed in the central complex of fruit flies 41,42 . Modulation of global neural activity might allow the HD system to operate at different energy levels with varying degrees of stability, reflecting uncertainty in the HD representation. We hypothesize that the animal's engagement in exploratory behaviour together with increased familiarity with the experimental environment and its geometric specificities could sustain a high-gain/high-certainty regime of operation and cause resistance to HD network reorientations imposed by visual cue shifts.
Recent research in fruit flies demonstrated plasticity between visual input and the compass neurons 26,27 . The current study complements these studies and provides evidence for experience-dependent influence of visual landmarks in mammals. Indeed, the mammalian brain appears to maintain a memory of the associations between HD neurons and visual landmarks, in the form of preferential firing, long after landmarks disappear. We propose that memory traces of salient cues in ADN cells help to stabilize the HD system during navigation, even in the absence of reliable environmental anchors by maintaining high activity levels (that is, high signal-to-noise ratio) around internal cue locations. Whether these network gain bumps have an active role in guiding navigation behaviour remains an open question. Moreover, baseline-state attraction during darkness is further evidence of long-term effects in the HD system. As the strength of this attraction depends on the duration of exposure to the shifted-cue context, we speculate that the underlying mechanisms leading to such behaviour involve synaptic plasticity. Through network modelling, we demonstrated that, by adding Hebbian learning, the observed drift patterns could be replicated. Our model proposes that neurons akin to the ring cells of fruit flies could mediate experience-dependent drift dynamics. Whether such neurons exist in the mammalian brain is yet to be determined.
The fact that a continuously rotating visual scene caused persistent biases in the HD representation, as demonstrated in our cue-rotation experiment, suggests that the integration of vestibular inputs undergoes an experience-dependent calibration. Our findings complement a similar finding in place cells 22 and support a model of hierarchical transfer of information from HD neurons to downstream cells of the navigation system (that is, place cells, grid cells and so on) to maintain consistent and flexible cognitive maps 1,41,43,44 . Ultimately, our findings provide insights into the mechanisms that govern realignment and stabilization of the HD network, and how long-term effects of previous experience affect its dynamics. Importantly, these findings highlight the complexity of the internal HD representation and motivate studying this cognitive system in a multidimensional framework. Here we show evidence for the functional importance of the global fluctuations in network activity (that is, gain) as a critical, yet previously underappreciated, dimension. Future studies examining the origins of such fluctuations will be critical to unveil the complete picture of the intrinsic structure of this circuit.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-023-05813-2.

Surgeries
During all surgeries, mice were anesthetized by inhalation of a combination of oxygen and 5% isoflurane before being transferred to the stereotaxic frame (David Kopf Instruments), where anaesthesia was maintained by inhalation of oxygen and 0.5-2.5% isoflurane for the duration of the surgery. Body temperature was maintained with a heating pad and eyes were hydrated with gel (Optixcare). Carprofen (10 ml kg −1 ) and saline (0.5 ml) were administered subcutaneously, respectively, at the beginning and end of each surgery. Preparation for recordings involved three surgeries per mouse. First, at the age of 7-8 weeks, each mouse was injected with 600 nl of the non-diluted viral vector AAV9. syn.GCaMP6f.WPRE.eYFP, sourced from University of Pennsylvania Vector Core. All injections were administered through glass pipettes connected to the Nanoject II (Drummond Scientific) injector at a flow rate of 23 nl s −1 . One week after injection, a 0.5-mm-diameter gradient refractive index (GRIN) relay lens (Go!Foton) was implanted above the ADN (AP, -1.05; ML, 0.8; DV, −3). No aspiration was required. In addition to the GRIN lens, three stainless steel screws were threaded into the skull to stabilize the implant. Dental cement (C&B Metabond) was applied to secure the GRIN lens and anchor screws to the skull. A silicone adhesive (Kwik-Sil, World Precision Instruments) was applied to protect the top surface of the GRIN lens until the next surgery. Then, 2 weeks after lens implantation, an aluminium baseplate was affixed by dental cement (C&B Metabond) to the skull of the mouse, which would later secure the miniaturized fluorescent endoscope (miniscope) in place during recording. The miniscope/baseplate was mounted to a stereotaxic arm for lowering above the implanted GRIN lens until the field of view contained visible cell segments, and dental cement was applied to affix the baseplate to the skull. A polyoxymethylene cap was affixed to the baseplate when the mice were not being recorded to protect the baseplate and lens. After surgery, animals were continuously monitored until they recovered. For the initial 3 days after surgery, the mice were provided with a soft diet supplemented with Carprofen for pain management (MediGel CPF). Screening and habituation to recording in the experimental environment began 2-3 days after the baseplate surgery. The first 3-4 weeks of recordings were used to confirm the quality and reliability of the calcium data while the animal was exploring the environment with different screen displays.

Data acquisition
In vivo calcium videos were recorded with a miniscope (v3; https:// miniscope.org) containing a monochrome CMOS imaging sensor (MT9V032C12STM, ON Semiconductor) connected to a custom data acquisition (DAQ) box (https://miniscope.org) with a lightweight, flexible coaxial cable. The cable was attached to a noiseless pulley system with a counterbalance (placed outside the recording environment) to prevent interference with the recorded animal's movements and to alleviate the weight of the miniscope. The DAQ was connected to a PC with a USB 3.0 SuperSpeed cable and controlled using Miniscope custom acquisition software (https://miniscope.org). The outgoing excitation LED was set to 3-6%, depending on the mouse, to maximize the signal quality with the minimum possible excitation light to mitigate the risk of photobleaching. The gain was adjusted to match the dynamic range of the recorded video to the fluctuations of the calcium signal for each recording to avoid saturation. Behavioural video data were recorded using a webcam mounted above the environment. The DAQ simultaneously acquired behavioural and cellular imaging streams at 30 Hz as uncompressed AVI files and all recorded frames were timestamped for post hoc alignment. Two controllable LEDs (green and red) were added and used for tracking such that, whenever the miniscope was attached to the baseplate, the green LED pointed to the right side of the mouse's head and the red LED pointed to the left side. All other light sources from the miniscope were covered. All recordings took place inside a 360° LED screen (height: 1 m, diameter: 90 cm; Shenzhen Apexls Optoelectronic), at the centre of which we placed a wall-less circular platform (diameter, 20 cm) raised 50 cm above the ground. Mouse bedding was evenly spread over the platform before each recording session. In all recordings, mice were free to move on top of the raised platform. A half spherical dome was used to cover the environment and prevent external light from entering, while it also held the behavioural camera. The experimental environment was designed to maximize circular symmetry, in the absence of any screen display. During habituation, mice were recorded while exposed to a single vertical stripe or no visual display (darkness). These recordings were also used to confirm the quality of tracking the head direction and the cue location, in different conditions. In all experiments in this study, the visual cue refers to a single white vertical stripe (width, 15 cm; height, 1 m).

Data preprocessing
Calcium imaging data were preprocessed before analyses through a pipeline of open-source MATLAB (MathWorks; v.R2015a) functions to correct for motion artifacts 45 , segment cells and extract transients 32,46 , and infer the likelihood of spiking events by deconvolution of the transient trace through a first-order autoregressive model 31 . We wrote a MAT-LAB (MathWorks, v.2015a) program to perform offline tracking of the LEDs and determine, at each frame, the animal's head direction. Another custom-written program was used to estimate the location of the visual cue. Both scripts were incorporated into the preprocessing pipeline.

Data analysis
In this study, neural activity refers to the deconvolved calcium traces as described previously 31 unless specified. The resulting time series (per neuron, per session) correspond to the inferred likelihood of spiking events. A moving average filter of width 3 frames (~100 ms) is then applied on each time series. We refer to the obtained signal as firing activity.

Identification of HD cells
For every identified cell segment (ROI), we construct an HD tuning curve by measuring the occupancy-normalized firing activity within each angle bin (1° per bin) of the horizontal plane (x axis). The tuning curve is circularly smoothed with a moving average filter of width 50°. This enables us to have a better estimate of the angle bin that corresponds to the maximum firing activity of a given neuron's tuning curve, which we will refer to as the PFD. We next construct a stimulus signal for that specific PFD by convolving the measured HD signal (from the behavioural camera) with a narrow Gaussian kernel (mean = PFD, s.d. = 17°) such that for every neuron i: Where, θ HD is the measured HD time series, σ is the s.d. of the Gaussian kernel and, angdiff (a, b) is a MATLAB function that gives the subtraction of a from b, wrapped on the [−π,π] interval. We correlate the stimulus signal with a normalized version of the firing activity to obtain the Pearson correlation coefficient r of each neuron. To determine the threshold value of r above which a cell can be identified as an HD neuron, we used data from ten baseline recordings (3 min) per animal, randomly selected from the reset experiment. We start with a relatively high value r thresh and select all neurons such that r > r thresh . For each neuron, we produce 1,000 shuffles of the firing activity using the MATLAB circshift function (to preserve the temporal correlation of the firing activity signal) at random shifts. We next correlate each shuffled version with the stimulus signal of the corresponding neuron to obtain a distribution of correlation coefficients (3 separate distributions, 1 per mouse). We define r m 95th as the value that corresponds to the 95th percentiles of the distribution, for mouse m. If r r > m thresh 95th , we keep iterating the same procedure while decreasing r thresh by 0.01 until convergence (that is, r r m thresh 95th ≃ ), which constitutes the correlation coefficient threshold to identify HD neurons for mouse m (see Extended Data  Fig. 1d,e for an illustration of the results).

HD decoding from neural data
We trained a recently developed Bayesian decoder 37 to infer the HD direction from the deconvolved calcium responses of the imaged neural population. Noise independence across neurons was assumed. Conceptually, this decoder is similar to the Bayesian decoding method for spike trains as commonly used in the literature 47 , except that we used zero-inflated-gamma distribution to model the stochasticity of the deconvolved calcium responses, instead of Poisson distribution. Our previous results showed that the zero-inflated-gamma model could better capture the noise of the calcium signal and provide better decoding results compared with the Poisson noise model and a few other alternatives. Details of this procedure can be found in section 4 of ref. 37 . Here we smoothed the log-likelihood matrix (rows, angle bins; columns, frames) by iteratively summing the likelihoods over 5 frames (~166.7 ms) centred around the corresponding timestep of each iteration, for each angle bin. Note that, owing to the predominance of HD-tuned neurons among detected cell segments and to avoid selection biases, the neural activity from all ADN cells was used as an input to the decoding algorithm.

Analysis of drift
We define the offset as the angular difference between the measured head direction (θ measured ) and the decoded head direction (θ decoded ): In all analyses involving drift estimation, both measured and decoded HDs were smoothed with a moving average filter of width 20 frames (~667 ms). For the analysis of drift during darkness (except for heat maps), further smoothing was applied to extract the low-frequency component of the signal whereby a moving average filter of width 300 frames (~10 s) was used. In all cases, a simple linear regression was performed on the unwrapped offset signal over a sliding window of 20 frames (~667 ms) to estimate the drift speed at the centre of the regression window (that is, slope of the fitted line). specified. With the exception of Extended Data Fig. 5b-d, the gain was always normalized to the average baseline activity. In Extended Data Fig. 5b-d, the tuning curves represent the average firing rates as a function of the internal HD for the entire recording session.

Gain heat-map analysis
Gain heat maps are 2D matrices in which each pixel p(x, y) is a 2D bin of width 1.5° per s corresponding to the measured angular head velocity and height 1° corresponding to the decoded HD. Pixel p(x, y) represents the mean network gain-across mice and across sessions-within a 2D average window of width [x − 3:x + 3]° per s and height [ y − 15:y + 15]°. A 2D Gaussian filter of s.d. = 15 (15° × 22.5° per s) is then applied. The network gain, the decoded HD and the measured HD were all smoothed with a moving-average filter of width 20 frames (~667 ms) while the measured head angular velocity was approximated by a simple linear regression with a regression window of similar width. To evaluate the significance of the difference between gain heat maps (Fig. 4f), we performed a Wilcoxon rank-sum test to compare, at each pixel, the gain distributions within the 2D window of width [x − 3:x + 3]° per s and height [ y − 15:y + 15]° between darkness epochs of the 20 s experiment and D2 of the 2 min experiment. As we are only interested in the significance of the positive values (indicating the appearance of new bumps), negative values as well as P > 0.001 were marked as NaN.

Drift-speed heat-map analysis
Drift-speed heat maps were generated according to the same approach as for the gain heat maps. However, drift speed was approximated by a simple linear regression with a regression window of width 20 frames (~667 ms). The P-value matrix for drift speed difference between the 20 s experiment and D2 of the 2 min experiment (Extended Data Fig. 13e) was calculated as described above. However, only P values > 0.001 were marked as NaN.

Vector field analysis
The purpose of this analysis is to illustrate baseline attractiveness. We define the state space (y axis, drift-speed (° per s); x axis, drift-angle (°)). We construct a vector field matrix by dividing the x axis into 18 bins of width 20° each within the range [−180:180]°, and the y axis into 20 bins of width 0.03° per s each, within the range [−3:3]° per s. At each bin (x,y), we calculate the mean drift speed and mean drift acceleration across mice and across sessions. The two latter quantities represent the velocity components (u,v) that determine the length and direction of the velocity vector. We assume the vector field has a central symmetry with reference to the baseline point (0,0) owing to the symmetry in the experimental design. We therefore generate an image of the original vector field that is its reflection across the origin. The two versions are then averaged to produce the final 2D vector field. Streamlines are generated using the streamline function in MATLAB. For Figs. 4d and 5e, streamlines were simulated over 1,000 timesteps.

Analysis of cue-rotation sessions
At the beginning of each continuous cue-rotation epoch, the visual cue was displayed at the same location as in the baseline. After cue removal and depending on its previous rotation speed, the cue would have either reached ±180° or ±90° (cue orientation in clockwise-cue-rotation sessions was reflected across the x axis so that the cue ends at ±180° (fast cue-rotation) or −90° (slow cue-rotation)). The offset is therefore expected to start within a close range of these two directions during the second darkness epoch. However, in some cases, drifts during the first darkness epoch were large enough so that the initial anchoring to the rotating cue occurred considerably far from baseline. This caused the drift signal during the second darkness to start further away from the expected location. In Fig. 5d for slow cue rotation, to study the effects across sessions with similar stability during baseline (total n = 44 out of 60).

Dimensionality reduction
It is generally believed that the main function of the HD system is to provide an estimate of the HD at any given time. As most studies of this network, including ours, are conducted while recording the neural activity in animals placed on horizontal planes, it is fair to assume that most of the variability in the activity of HD neural population can be captured by a single variable representing the angle faced by the animal, at an instant t, with reference to a given allocentric reference frame. Indeed, previous studies have shown that, in stable conditions, different dimensionality reduction methods 17,38 would produce a circular manifold that can be fairly approximated in a unidimensional polar state-space with a fixed radius. Nevertheless, a previous study 17 observed that the structure becomes more complex during slow-wave sleep. Our guiding hypothesis is that the intrinsic geometric structure of the neural activity in the HD network lies in a multidimensional state space and that latent variables other than the angular component are needed to explain the variability in spiking data, during non-stable conditions such as resets and drift situations. Here we propose the simplest augmentation to the latent structure by adding a radial component that we expect to indicate instantaneous changes in global firing activity of the HD network. Although we believe that the true intrinsic dimensionality of the HD neural data is higher than two, the current paper mainly focuses on the necessity of at least a second dimension of the HD system during instability.
To test our hypothesis, we developed a deep feedforward neural network that maps the high-dimensional input (neural) data onto the 2D polar space (angular dimension θ and radial dimension R). The network is trained on circular data from the measured head direction. The radial component R is a latent variable that can take any non-negative value. Our previous analyses (not included here) have shown that, although methods such as principal component analysis and Isomap can uncover looped latent structures, these unsupervised learning algorithms tend to produce distorted circles, in the presence of noise, when applied on baseline data (that is, stable condition), which makes the definition of a radius less straightforward and motivates our use of a supervised learning method.
We used a feedforward neural network with three parallel branches. Two of these branches have three fully connected hidden layers (referred to as first and second or B 1 and B 2 , respectively), while the third branch has two fully connected hidden layers (referred to as middle or B m ) (Extended Data Fig. 4d). The input layer receives a N × 1 vector of neural activity from N ADN neurons at time t (both calcium traces as well as firing activity from deconvolved spikes can be fed to the model). The output layer is composed of two units that are the results of multiplying the output g t of the middle branch with the output z 1,t of the first branch, on one hand, and the output z 2,t of the second branch, on the other hand, as illustrated in the diagram of Extended Data Fig. 4d. We trained our model on baseline data. The objective is to find the set of weights W that minimize the distance between the network out-