Sleep is critical to hippocampus-dependent memory consolidation1,2,3. Analyzing hippocampal ensemble spike data during both slow-wave sleep (SWS) and rapid-eye-movement (REM) sleep has been an important yet challenging research topic4,5,6,7,8,9,10. During awake active exploration, hippocampal pyramidal cells exhibit localized spatial tuning11. During sleep, in the absence of external sensory input or cues, the network is switched into a different state that engages in internally-driven computation. An important hallmark of sleep, the hippocampal sharp wave (SPW)-ripples, lasting between 50 and 400 milliseconds, is typically accompanied by an increased hippocampal network burst and population synchrony of pyramidal cells1. A central hypothesis is that the hippocampus and neocortex interact with each other during SPW-ripples12 and that hippocampal neurons fire such that the information transferred to the hippocampus during previous awake run behavior is reactivated at a fast timescale during SPW-ripple bursts, encoding information of spatial topology of familiar or novel environments and goal-directed behavioral paths10,13,14,15,16,17,18,19. During run behavior, hippocampal place cells fire in sequences that span a few seconds as animals run through location-dependent receptive fields. During sleep, the same place cells fire in an orderly manner at a faster timescale within SPW-ripple events. While some sequences have been shown to reflect temporally-compressed spatial sequences corresponding to previous experiences by the rat8,9,10,18,19, the spatial content of a large fraction of SPW-ripple events remains unknown. Therefore, uncovering the neural representation of hippocampal ensemble spike activity or spatiotemporal firing patterns during sleep becomes critical for improving our understanding of the mechanism of memory consolidation and, in general, information processing during sleep.

To date, several statistical methods have been developed to analyze sleep-associated hippocampal ensemble spike activity, including pairwise correlation4,5, template matching15, sequence ranking8,9,20 and Bayesian population decoding21,22,23,24. A few observations of sleep data analysis are noteworthy. First, the SPW-bursts during sleep are sparse (low occurrence) and individual events are statistically independent. Second, the magnitude of neuronal population synchrony, measured as the spiking fraction of all recorded neurons during each network burst, follows a lognormal distribution: strongly synchronized events are interspersed irregularly among many medium and small-sized events25. Third, different brain states or experiences may induce changes in firing rate and firing timescale15,26,27. Fourth, there is no ground truth or behavioral measure. The pairwise correlation method ignores the spiking information at fine timescales and population synchrony; the template matching and sequence ranking methods are more sensitive to exact spike timing order and the number of active neurons. In contrast, Bayesian population decoding methods are more suited to tackle these issues in the presence of large neural ensembles16,17,18,23. However, to our knowledge, there is no precedent for a systematic investigation of these issues using any of these methods.

In this work, we investigate these important statistical issues in greater detail by applying two neural population decoding methods to rat hippocampal ensemble spike data recorded in different states. One decoding method is based on topographic or receptive field representations21,22, while the other is based on topological representation without a priori measure of place receptive fields28,29,30. We first create “synthetic” sleep data by binning and resampling spike trains obtained during active locomotion to simulate important factors that characterize SPW-ripple events and then compare the resulting decoded spatial representations to the animal’s actual run trajectory. This allows us to test two important questions of hippocampal population codes related to sleep and memory replay: representation power (“how reliably is the spatial environment represented?”) and detection power (“how can one detect significant spatial or behavioral state sequences?”). We use rat hippocampal ensemble recordings in two- and one-dimensional spaces to investigate these questions separately and we further compare the performance of topographic vs. topological representation-based decoding methods on SPW-ripple-associated spike data.



We analyzed five datasets (Table 1) derived from experimental hippocampal ensemble spike data, recorded from multiple Long-Evans rats under different environments, behaviors and brain states. The animals’ behavioral trajectories from Datasets 1 to 4a are shown in Supplementary Fig. 1. To analyze rat hippocampal ensemble spike data, we considered two model-based Bayesian decoding methods based on different statistical assumptions (Methods, Supplementary Fig. 2). One decoding method is based on topographic or receptive field representations (termed DecodewRF—population decoding method using neuronal receptive fields, see Supplementary Fig. 3). The other is based on topological representation that aims to discover latent structures of sequential or spatiotemporal pattern of activity of cells without the assumption of behavioral measures (termed DecodewoRF—population decoding method without using neuronal receptive fields). The first method is supervised in that it requires training data for constructing place receptive fields in the encoding phase. The second method is purely unsupervised, which is developed based on an m-state hidden Markov model (HMM), with an inherent m × m state-transition matrix P.

Table 1 Summary statistics of ensemble recordings in the rat hippocampus.

Sleep-associated hippocampal ensemble spike data are characterized by several important features: (1) shorter epochs (separated by periods of non- or low-spike activity); (2) small active cell ratio within each epoch; (3) different timescales from behavior. One fundamental assumption is that many sleep-associated hippocampal ensemble spikes preserve the order of temporal firing sequences experienced in behavior. In the following analyses, we first created “synthetic” sleep-like hippocampal ensemble spike data (derived from awake run behavior) and systematically investigated the issues of the length of data epochs, the number of participating neurons, temporal bin size and spike rate. The use of synthetic data allowed us to quantitatively assess the representation power (or decoding accuracy) in hippocampal ensemble representations. We then extended the analyses to experimental sleep data in the complete absence of behavioral measures and assessed the question of detection power. All reported statistics are given as mean ± SEM.

We used two established criteria for quantitative assessment: one is the decoding error with respect to the animal’s position and the other is the weighted correlation17,18 and the associated Z-score or equivalent Monte Carlo P-value of detected significant replay events (Methods). The first criterion, which assesses the representation power (i.e., how reliably the population spike activity represents the environment, ref. 29), was tested on two-dimensional environments (Datasets 1 and 2, see an illustration in Supplementary Fig. 4). The second criterion assesses the detectability issue (Datasets 3, 4a and 4b, ref. 31).
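As an illustration of the second criterion, the weighted correlation of a decoded posterior can be sketched as follows. This is a minimal sketch, assuming the commonly used definition in which the posterior probabilities act as weights on (time bin, position) pairs; the function name `weighted_correlation` and the toy inputs are hypothetical, and the exact formulation used in this study is the one given in Methods.

```python
import numpy as np

def weighted_correlation(posterior):
    """Weighted Pearson correlation between position and time.

    posterior: (positions x bins) array of non-negative decoding
    probabilities; each entry is used as the weight of its
    (position index, time-bin index) pair.
    """
    n_pos, n_bins = posterior.shape
    x = np.arange(n_pos)[:, None]    # position index, column vector
    t = np.arange(n_bins)[None, :]   # time-bin index, row vector
    w = posterior / posterior.sum()  # normalize weights to sum to 1
    mean_x = (w * x).sum()
    mean_t = (w * t).sum()
    cov_xt = (w * (x - mean_x) * (t - mean_t)).sum()
    var_x = (w * (x - mean_x) ** 2).sum()
    var_t = (w * (t - mean_t) ** 2).sum()
    return cov_xt / np.sqrt(var_x * var_t)

# Toy posteriors: a forward sweep, a reverse sweep, and a flat posterior.
r_forward = weighted_correlation(np.eye(6))
r_reverse = weighted_correlation(np.fliplr(np.eye(6)))
r_flat = weighted_correlation(np.ones((5, 7)))
```

A posterior concentrated on a forward sweep gives a value near +1 and a reverse sweep near −1, which is why the absolute value |R| is a natural summary when both replay directions are of interest.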

Impact of random splitting

Unlike awake behavior, hippocampal neuronal populations fire in a sporadic manner during sleep, either within or outside the period of SPW-ripples. During awake run behavior, rat hippocampal ensemble spike data were binned with a temporal bin size of Δ = 250 ms into T discrete bins (i.e., TΔ corresponds to total recording time). We applied a speed filter of 15 cm s−1 to exclude immobile periods. As a first step to create sleep-like data structure (Supplementary Fig. 5A), we evenly split the run-associated ensemble spike data into epochs. Each epoch comprised T0 bins (bins per epoch, bpe) and provided an independent measurement for further statistical analysis. Within each epoch, the temporal order of spiking sequences within cell assembly was preserved or reversed (with equal probability 0.5). The special case T0 = T bpe corresponds to the run-associated spike data; when T0 = 1 bpe, all spike bins are independent. Generally, the greater the T0 value, the more temporal information is available within each epoch (which is used to infer the state-transition matrix P in DecodewoRF). In analogy to sleep, T0 = 10 bpe roughly reflects the typical number of temporal bins of 200-ms hippocampal ripple-associated spike data with 20 ms bin size.
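The epoch-splitting step can be sketched as follows. This is a minimal sketch, assuming the binned data are held in a cells × bins count matrix; the function name `split_into_epochs` and the toy data are hypothetical, and dropping leftover bins that do not fill a whole epoch is an assumption, since the text does not say how remainders are handled.

```python
import numpy as np

def split_into_epochs(counts, t0, rng):
    """Split a (cells x bins) spike-count matrix into epochs of t0 bins.

    Within each epoch the bin order is either kept or time-reversed,
    each with probability 0.5. Trailing bins that do not fill a whole
    epoch are dropped (assumption).
    """
    n_bins = counts.shape[1]
    n_epochs = n_bins // t0
    epochs = []
    for k in range(n_epochs):
        epoch = counts[:, k * t0:(k + 1) * t0]
        if rng.random() < 0.5:
            epoch = epoch[:, ::-1]  # reverse temporal order within the epoch
        epochs.append(epoch.copy())
    return epochs

# Toy data: 5 cells, 103 bins, split into epochs of 10 bins each.
rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(5, 103))
epochs = split_into_epochs(counts, t0=10, rng=rng)
```

With t0 equal to the total number of bins the function returns a single epoch (the run data, possibly reversed), and with t0 = 1 every bin becomes its own epoch, matching the two limiting cases described above.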

Using all available neuronal ensemble spike activities from Datasets 1 and 2, we systematically varied T0 and computed the median decoding error (mean ± SEM). At each T0 configuration, analyses were repeated for n = 50 independent Monte Carlo runs, with each run encountering a different realization of simulated data. Assuming no temporal prior, the decoding performance of DecodewRF remained unchanged for varying T0 (horizontal dashed line, Fig. 1A,B). This is because the receptive field is computed based upon the average spike activity over the entire or part of the behavioral episode. Once the receptive field is identified and the likelihood model is fixed, the temporal information becomes irrelevant for estimating the position at each temporal bin. In contrast, the population representation capacity and decoding accuracy of DecodewoRF changed as a function of T0. Our analysis suggested that the mean decoding error (green curves, Fig. 1A,B) was relatively stable with varying T0 < T, but the result variability within the same T0 configuration was relatively high (except for T0 = T bpe). At least two factors contributed to this variability. First, because of random data splitting, breaking the temporal relationship in a spike train also destroys the spatial-temporal relationship (i.e., spike patterns with respect to animal’s run behavior during those periods). For instance, a given position is associated with different spike patterns that depend on the actual trajectory leading to it, such as animal’s heading, speed and previous location. Second, the intrinsic Monte Carlo optimization nature of DecodewoRF induces additional variance (e.g., slow convergence of Markov chains)30.
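The insensitivity of receptive-field-based decoding to T0 can be illustrated with a short sketch. It assumes a standard Poisson likelihood with a flat prior over positions and independent neurons, which may differ in detail from the DecodewRF model described in Methods; the function name `decode_map`, the simulated rate maps and all parameter values are hypothetical.

```python
import numpy as np

def decode_map(counts, rate_maps, delta):
    """Bin-by-bin MAP position estimate under a Poisson model, flat prior.

    counts:    (cells x bins) spike counts
    rate_maps: (cells x positions) firing rates in Hz
    delta:     bin size in seconds
    Returns the index of the most likely position for every bin.
    """
    expected = rate_maps * delta + 1e-12  # expected counts; avoid log(0)
    # Poisson log-likelihood up to a term that does not depend on position:
    # sum_c [ n_c * log(expected_c(x)) - expected_c(x) ]
    log_like = counts.T @ np.log(expected) - expected.sum(axis=0)
    return np.argmax(log_like, axis=1)

# Simulated example: 30 cells, 25 position bins, 200 time bins of 250 ms.
rng = np.random.default_rng(0)
rate_maps = rng.uniform(0.1, 20.0, size=(30, 25))
true_pos = rng.integers(0, 25, size=200)
counts = rng.poisson(rate_maps[:, true_pos] * 0.25)

decoded = decode_map(counts, rate_maps, delta=0.25)

# Shuffling the order of the bins only shuffles the decoded positions,
# because each bin is decoded on its own once the rate maps are fixed.
perm = rng.permutation(200)
decoded_perm = decode_map(counts[:, perm], rate_maps, delta=0.25)
```

Because no temporal prior links neighbouring bins, any splitting or reordering of bins leaves the per-bin estimates, and hence the median error, unchanged.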

Figure 1

Illustration and decoding performance of population decoding methods.

(A,B) Box plots of median decoding error from DecodewoRF with varying values of T0 (bins per epoch), for Datasets 1 and 2, respectively. The green curves are the averaged median decoding error. The median decoding error of DecodewRF was independent of T0 (horizontal dashed line; 7.02 for Dataset 1, 7.73 for Dataset 2). Representative examples of inferred state-transition matrix (C) from DecodewoRF and the derived topology graph (D) from Dataset 2 (dark color represents high connectivity strength). The percentage of nonzero entries in (C) is 14.8%. (E) Histogram of nonzero connectivity strengths (Pij + Pji, i ≠ j) for panel (C) (mean: 0.23; SD: 0.32).

The inferred number of states m derived from DecodewoRF was relatively stable (m ∈ [33, 37] for Dataset 1; m ∈ [46, 53] for Dataset 2). As a qualitative assessment, we transformed and depicted the matrix P (Fig. 1C) via a topology graph (Fig. 1D), which describes the connectivity between the states (“spatial locations”) and the topological representation of the environment28,29. The topology graph is in arbitrary units (a.u.): each node represents a state or virtual location and the strength between two nodes indicates the pairwise connectivity (Pij + Pji, with dark color representing high strength). We also assessed the distribution of connectivity strengths and associated statistics (Fig. 1E). A detailed examination of the inferred 49 × 49 matrix P showed that the majority of nodes had more than one pair of significant nonzero connectivity. For instance, if we used a conservative high connectivity strength threshold of 0.2 (60% percentile of the empirical distribution), then 44/49 nodes had at least two connected nodes, whereas nearly half (24/49) of nodes had between 3 and 5 connected nodes. Combining the quantitative assessment and qualitative visualization, we reached the interpretation that the topology graph in Fig. 1D resembles a two-dimensional grid; its shape was invariant to the permutation of states in P. Although the exact values of P might be quantitatively different in random Monte Carlo simulations, the derived two-dimensional topology graphs were qualitatively similar with respect to varying T0 configurations (data not shown) and varying subsets of neurons29.
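The node-connectivity count described above can be sketched as follows, assuming P is available as a square NumPy array. The function name `node_degrees` and the 3-state example matrix are hypothetical and serve only to show the computation of Pij + Pji followed by thresholding.

```python
import numpy as np

def node_degrees(P, threshold):
    """Count, for every state, how many other states it is connected to.

    Connectivity strength between states i and j (i != j) is P[i, j] + P[j, i];
    two states count as connected when that strength exceeds the threshold.
    """
    strength = P + P.T               # symmetric pairwise strength (new array)
    np.fill_diagonal(strength, 0.0)  # ignore self-transitions
    adjacency = strength > threshold
    return adjacency.sum(axis=1)

# Hypothetical 3-state transition matrix (rows sum to 1).
P = np.array([[0.50, 0.50, 0.00],
              [0.10, 0.60, 0.30],
              [0.00, 0.25, 0.75]])
degrees = node_degrees(P, threshold=0.2)
```

In this toy case the middle state is connected to both neighbours while the outer states have one neighbour each, the pattern expected for a linear chain; a grid-like environment would instead show many nodes with three or more connections.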

Impact of the number of cells

Compared to awake experiences, firing rates of hippocampal neurons during post-run sleep episodes were reduced but highly correlated15,25. However, the participation of the active hippocampal cells during sleep can be highly variable. More importantly, only a small subset of pyramidal neurons are active during individual SPW-ripple events15. To simulate such conditions, we used a fixed value of T0 = 10 bpe and randomly sampled a subset of cells from the neuronal population (ρ = 30–100%, with a minimum of 10 cells being active); only those selected neurons were used in subsequent decoding analyses.
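The cell-subsampling step can be sketched as below. This is a minimal sketch: the function name `sample_active_cells` is hypothetical, and rounding ρ times the population size to the nearest integer is an assumption, since the text only specifies the ratio and the 10-cell minimum.

```python
import numpy as np

def sample_active_cells(n_cells, rho, rng, min_cells=10):
    """Pick a random subset of cell indices with active ratio rho.

    At least min_cells cells are kept (or all cells if fewer are recorded).
    Rounding to the nearest integer is an assumption.
    """
    n_keep = max(min_cells, int(round(rho * n_cells)))
    n_keep = min(n_keep, n_cells)
    return np.sort(rng.choice(n_cells, size=n_keep, replace=False))

rng = np.random.default_rng(1)
subset = sample_active_cells(49, 0.3, rng)  # indices of the retained cells
```

The returned indices can be used to select rows of the spike-count matrix before decoding, so that the same subpopulation is used in every epoch.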

We found that the decoding error decreased monotonically with an increasing fraction of active neurons (Fig. 2A,B; see also the evolution of error distribution in Supplementary Fig. 6). When the number of cells fell below a certain percentage (~50%), DecodewoRF outperformed DecodewRF, yet the exact statistics varied between the two tested datasets. The slope of the error curve for DecodewoRF was flatter, consistent with our previous finding that the topology-based coding may be more robust for spatial representation29. This is possibly because DecodewoRF does not require a precise receptive field representation. In contrast, the DecodewRF method depends more on neurons that have a well-described place receptive field representation; when the receptive field characterization is less accurate due to the finite sampling issue, it may produce a large error.

Figure 2

Comparison of median decoding error between DecodewRF and DecodewoRF.

(A,B) Decoding error decreased with increasing numbers of cells in neuronal population. (C) Decoding error changed with respect to varying fractions of active neurons (under thinning) and (D) changed with respect to varying temporal bin size. Error bar denotes SEM (n = 50 Monte Carlo runs).

To test the specific relationship between population representation and the physiological properties of cells, we evenly split the neurons of Dataset 1 into two groups (upper vs. lower 50% percentile) according to their normalized spatial-information rates (Methods, Fig. 3A). Under the same configuration (T0 = 10, ρ = 50%), we compared the decoding accuracy of two population methods based on Monte Carlo simulations. The result (Fig. 3B) indicated that the information-rich neuron subpopulation had a greater influence on representation or decoding accuracy (P < 0.001, Wilcoxon signed rank test).
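For readers unfamiliar with the measure, a spatial information rate in bits/spike can be sketched as follows. This assumes the widely used occupancy-weighted definition (sum over positions of p(x) · (λ(x)/λ̄) · log2(λ(x)/λ̄)); the normalization applied in this study is specified in Methods and is not reproduced here, and the function name and toy rate maps are hypothetical.

```python
import numpy as np

def spatial_information(rate_map, occupancy):
    """Spatial information in bits/spike for one cell.

    rate_map:  firing rate (Hz) in each position bin
    occupancy: time spent (any unit) in each position bin
    """
    p = occupancy / occupancy.sum()    # occupancy probability
    mean_rate = np.sum(p * rate_map)   # overall mean firing rate
    if mean_rate <= 0:
        return 0.0
    ratio = rate_map / mean_rate
    valid = (p > 0) & (ratio > 0)      # bins with zero rate contribute 0
    return float(np.sum(p[valid] * ratio[valid] * np.log2(ratio[valid])))

occupancy = np.ones(4)                       # equal time in 4 position bins
info_flat = spatial_information(np.full(4, 5.0), occupancy)
info_field = spatial_information(np.array([8.0, 0.0, 0.0, 0.0]), occupancy)

# Median split of a (hypothetical) population into high/low information groups.
infos = np.array([info_flat, info_field, 0.5, 1.2])
high_group = infos > np.median(infos)
```

A spatially flat cell carries 0 bits/spike, whereas a cell firing in only one of four equally visited bins carries 2 bits/spike, so a median split on this quantity separates sharply tuned cells from broadly tuned ones.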

Figure 3

Specificity of hippocampal neurons in cell population on the decoding error.

(A) Cumulative distribution of normalized spatial information rate (bits/spike) of 49 hippocampal neurons (Dataset 1). (B) Comparison of median decoding error by using spatial-information high vs. low subpopulations (T0 = 10 bpe, ρ = 0.5; error bar denotes SEM, n = 50 Monte Carlo runs).

In experimental sleep recordings, different subsets of neurons often fire at individual, isolated sleep episodes. To simulate this situation, we introduced an additional level of randomness by assuming that distinct neuronal subpopulations (but with identical ratio ρ) are randomly active at individual epochs; this was in direct contrast to the previous assumption that the same subpopulations were engaged in all episodes. As a demonstration, we fixed T0 = 100 and applied DecodewoRF to Dataset 1. As expected, the decoding accuracy further degraded: for ρ = 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, the median errors were 9.07 ± 0.18, 10.02 ± 0.14, 10.25 ± 0.13, 11.51 ± 0.12, 11.99 ± 0.12, 12.53 ± 0.16 cm (n = 50 Monte Carlo runs), respectively. The error was not only greater than the error in the case of ρ = 1 (8.51 ± 0.18 cm, Fig. 1B), but also greater than the error with fixed subpopulations (T0 = 100 vs. T0 = 100* bpe, Table 2).

Table 2 Comparison of median decoding error (mean ± SEM, n = 50 Monte Carlo runs) between DecodewRF and DecodewoRF for Dataset 1.

Impact of bin size, spike rate and conjunctive factors

During different sleep stages, hippocampal neurons fire at different timescales15,23. To examine the influence of temporal bin size, we fixed ρ = 1 and T0 = 10 bpe and varied bin size Δ (20, 50, 100, 150, 200, 250 ms) to repeat the decoding analysis. Note that a decreasing Δ would increase the number of discrete bins T. For DecodewRF, the decoding accuracy reduced with a decreasing Δ. This might be due to violation of Poisson assumption while using a small bin size or due to the presence of theta sequences (i.e., the decoded position may be systematically ahead of actual animal’s position). In contrast, the decoding performance of DecodewoRF (blue curve, Fig. 2D) was relatively stable for various Δ, possibly because its Bayesian inference procedure is less sensitive to the Poisson firing assumption30.
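The relationship between bin size Δ and the number of bins T can be made concrete with a small binning sketch. The function name `bin_spikes` and the toy spike times are hypothetical, and discarding any incomplete final bin is an assumption.

```python
import numpy as np

def bin_spikes(spike_times, t_start, t_stop, delta):
    """Bin spike times (seconds) of several cells into a (cells x bins) matrix.

    spike_times: list with one array of spike times per cell
    delta:       bin size in seconds; an incomplete last bin is discarded
    """
    n_bins = int(np.floor((t_stop - t_start) / delta + 1e-9))
    edges = t_start + delta * np.arange(n_bins + 1)
    return np.stack([np.histogram(st, bins=edges)[0] for st in spike_times])

# Two toy cells recorded for 0.5 s.
spike_times = [np.array([0.01, 0.26, 0.27]), np.array([0.49])]
coarse = bin_spikes(spike_times, 0.0, 0.5, delta=0.25)  # 2 bins of 250 ms
fine = bin_spikes(spike_times, 0.0, 0.5, delta=0.05)    # 10 bins of 50 ms
```

The same spikes are spread over five times as many bins when Δ shrinks from 250 ms to 50 ms, so individual bins contain fewer spikes and each per-bin estimate relies on less evidence.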

Next, we thinned the spike train data by downsampling such that there was no more than one spike per time bin, which was aimed to simulate the sparse spiking in a finer timescale during sleep. As a result, the instantaneous firing rate reduced to 25–50% of the original rate. We found that the spike thinning procedure further degraded the decoding performance and the decoding accuracy also dropped with decreasing number of neurons (Fig. 2C vs. Fig. 2A).
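One simple way to realize this thinning on binned data is to cap every count at one spike, as sketched below. This is an assumption about the implementation, since the text only states the constraint of at most one spike per bin; the function name `thin_to_binary` and the simulated counts are hypothetical.

```python
import numpy as np

def thin_to_binary(counts):
    """Cap each bin at one spike, leaving empty bins empty."""
    return np.minimum(counts, 1)

rng = np.random.default_rng(0)
counts = rng.poisson(1.5, size=(4, 50))   # toy data: 4 cells, 50 bins
thinned = thin_to_binary(counts)

# Fraction of the original spikes that survive thinning.
rate_ratio = thinned.sum() / counts.sum()
```

The surviving fraction depends on how many bins originally held multiple spikes, so cells with high in-field rates lose proportionally more spikes than sparsely firing cells.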

Lastly, we jointly varied two parameters (such as T0 and ρ, or T0 and Δ) and repeated the decoding analysis. As shown in Table 2, we obtained consistent findings as in Fig. 2 (see also Supplementary Fig. 7): (1) For fixed Δ and ρ, there was a decreasing trend in decoding error with increasing T0, but the performance was relatively stable; (2) Regardless of T0, decoding error decreased with increasing ρ; (3) For fixed T0, there was a decreasing trend in decoding error with increasing Δ.

Impact of non-place cells

Next, we investigated if and how the presence of non-place cells would affect the decoding accuracy. A non-place cell implies that the putative pyramidal cell is not significantly modulated by spatial location, or its spatial tuning curve is flat. A high ratio of non-place cells implies a low signal-to-noise ratio (SNR) for a fixed number of cells. To simulate such a condition, we randomly selected a small subset of place cells (Dataset 1, many of which have overlapping place fields, see Supplementary Fig. 3) and evenly distributed spikes in time proportional to animal’s space occupancy (such that their average firing rates remained unchanged). Under the same configuration (T0 = 100 bpe), we found the decoding error of both methods increased with growing number of non-place cells (Supplementary Fig. 8). At first, DecodewoRF was slightly worse than DecodewRF, but the gap gradually reduced with decreasing SNR (Supplementary Fig. 8A, red vs. blue solid lines); and DecodewoRF outperformed DecodewRF significantly (P = 6 × 10−5, Wilcoxon signed rank test) in the worst scenario. This result confirmed the robustness of DecodewoRF under a low SNR.
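Converting a place cell into a non-place cell can be sketched on binned data as follows. This is a minimal sketch, assuming that spreading a cell's spikes uniformly over time bins is an acceptable stand-in for the procedure described above (time spent at a location is then proportional to the spikes assigned to it); the function name `make_non_place` and the simulated cell are hypothetical.

```python
import numpy as np

def make_non_place(counts_row, rng):
    """Redistribute one cell's spikes uniformly over its time bins.

    The total spike count (and thus the mean firing rate) is preserved,
    but any relationship between firing and position is removed.
    """
    n_spikes = int(counts_row.sum())
    n_bins = counts_row.size
    uniform = np.full(n_bins, 1.0 / n_bins)
    return rng.multinomial(n_spikes, uniform)

rng = np.random.default_rng(0)
# Toy place cell: silent except for a burst in bins 40-59.
place_cell = np.zeros(200, dtype=int)
place_cell[40:60] = rng.poisson(4.0, size=20)
non_place = make_non_place(place_cell, rng)
```

Replacing a growing number of rows of the count matrix in this way lowers the SNR of the population while keeping the number of cells and their mean rates fixed.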

Significance testing via randomly shuffled data

We tested our population decoding methods by comparing their estimated statistics derived from experimental data with those derived from randomly shuffled data (Supplementary Fig. 5B). Specifically, we used the hippocampal ensemble spike activity collected during animal’s run behavior (speed >15 cm s−1) in a circular track environment (Dataset 3). Upon completion of unsupervised learning (DecodewoRF), we recovered the state trajectory, which correlated with the animal’s run trajectory (Pearson’s correlation ∈ [0.73, 0.79] derived from 10 Monte Carlo runs, P = 1.5 × 10−15, Supplementary Fig. 9A). In addition, we obtained the state transition matrix and state field matrix, which were both qualitatively similar to the behavior-derived ground truth (Supplementary Fig. 9B,C). The average maximum a posteriori (MAP) probability score derived from DecodewoRF was 0.8814 and the weighted correlation was 0.8848. These statistics were also similar to those from DecodewRF, except that DecodewRF required receptive fields or behavioral measure a priori.

We further constructed 1000 shuffled datasets. Each randomly shuffled dataset was subject to both temporal bin and cell identity shuffles (Methods). The weighted correlation R and average MAP probability scores derived from the shuffled data were significantly lower than those derived from the raw data (Monte Carlo P < 10−7, Supplementary Fig. 10). These results demonstrated that, in the absence of behavioral measures (therefore no decoding error can be computed), these metrics can be used as quantitative measures to assess the quality of reconstructed events for detection purposes. In the remaining analyses, we used R and its associated Z-score (or equivalent Monte Carlo P-value) for assessment.
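The shuffle-based significance test can be sketched as follows. This is a minimal sketch under stated assumptions: the shuffle permutes time bins and then cell identities of a cells × bins count matrix, the Z-score is taken relative to the shuffle distribution, and the Monte Carlo P-value uses the common add-one correction; the exact procedures are those described in Methods, and the function names are hypothetical.

```python
import numpy as np

def shuffle_bins_and_cells(counts, rng):
    """Return a copy of counts with time bins and cell identities permuted."""
    shuffled = counts[:, rng.permutation(counts.shape[1])]    # temporal bin shuffle
    return shuffled[rng.permutation(counts.shape[0]), :]      # cell identity shuffle

def z_and_p(observed, null_values):
    """Z-score and one-sided Monte Carlo P-value of a statistic.

    null_values holds the same statistic computed on shuffled data.
    The add-one correction in the P-value is an assumption.
    """
    null_values = np.asarray(null_values, dtype=float)
    z = (observed - null_values.mean()) / null_values.std(ddof=1)
    p = (1 + np.sum(null_values >= observed)) / (1 + null_values.size)
    return z, p

rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(6, 12))
shuffled = shuffle_bins_and_cells(counts, rng)

# Toy null distribution and an observed value well above it.
z, p = z_and_p(2.0, np.arange(100) / 100.0)
```

In practice the statistic passed to `z_and_p` would be a quantity such as R or the average MAP probability, computed once on the raw data and once per shuffled dataset.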

To compare the detection reliability and specificity between DecodewRF and DecodewoRF, we selected a random segment of run trajectory (T0 = 20 bpe, Fig. 4A) and systematically manipulated the ensemble spike activity during that time interval as follows: (1) We randomly removed 20–80% of cells from the population (i.e., ρ = 0.2–0.8). (2) Using all active cells (ρ = 1), we randomly removed spikes in selected temporal bins from each cell, with the number of bins ranging from 2 to 10 (i.e., 10–50% of T0), which would sparsify and remove certain temporal structures in the ensemble spikes. We simulated each condition with 100 Monte Carlo runs and each run produced an independent test set. We applied DecodewRF and DecodewoRF to those test sets and computed their R and Z-scores. The result comparison is shown in Fig. 4 (see also Supplementary Fig. 11 for scatterplot comparison). As the number of active cells dropped, the detection power of both methods decreased accordingly (Fig. 4B). In this specific example, the |R| value was below 0.5 when ρ = 0.2 (i.e., 10 cells). In terms of the Z-score, the majority of simulated events were non-significant when ρ < 0.8. Removing spikes also degraded the detection power (Fig. 4C; see also Supplementary Fig. 12). Together, these results suggest that the detection power of DecodewoRF was more favorable in those tested conditions.
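Manipulation (2) can be sketched as follows, assuming that each cell has its own independently chosen set of bins emptied; whether the bins were drawn per cell or shared across cells is not stated, so this choice, like the function name `remove_spikes_in_bins`, is an assumption.

```python
import numpy as np

def remove_spikes_in_bins(counts, n_remove, rng):
    """Zero out n_remove randomly chosen time bins separately for each cell."""
    out = counts.copy()
    n_cells, n_bins = out.shape
    for c in range(n_cells):
        bins = rng.choice(n_bins, size=n_remove, replace=False)
        out[c, bins] = 0
    return out

rng = np.random.default_rng(0)
# Toy segment: 8 cells, 20 bins, every bin initially holds at least one spike.
segment = rng.poisson(1.0, size=(8, 20)) + 1
sparsified = remove_spikes_in_bins(segment, n_remove=4, rng=rng)
```

Each Monte Carlo run would call the function with a fresh random state to produce an independent test set for both decoders.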

Figure 4

Comparison of detection reliability between DecodewRF and DecodewoRF.

(A) Segment of a spike count matrix with 20 temporal bins. (B,C) Weighted correlation (Left) and Z-score (Right) for varying active cell ratio ρ (B) and for removing spikes across different number of bins (C). Error bar denotes SEM (n = 50 Monte Carlo runs).

Analysis of ripple-associated spike data in quiet wakefulness

We also tested our methods on ripple-associated hippocampal ensemble spike data during quiet wakefulness (QW), the awake brain state involved in memory replay similar to SWS16,17,18,19,23,32,33. In a long recording (Datasets 4a and 4b), hippocampal ensemble spikes were collected in the 4-hr pre-run and 4-hr post-run periods (inside a rest box in a familiar environment), separated by a 40-min run period on a circular track in a novel setting (see Supplementary Fig. 13 for brain state classification). From Dataset 4b, we identified off-the-track candidate events based on hippocampal local field potential (LFP) and multi-unit activity (Methods) and further excluded the epochs with a low fraction of active cells (<10%). See Table 3 for the summary statistics of candidate events in different states.

Table 3 Summary statistics of candidate events (Dataset 4b).

The ratio of active cells across all selected epochs was ρ = 0.181 ± 0.002 (maximum 0.68, median 0.16). We binned each epoch with Δ = 20 ms, resulting in T0 = 11.5 ± 0.2 bpe (maximum 39, median 10). We then reconstructed the spatial (or state) trajectory based on the place field (or state field) λc(S) (where the state field was inferred by DecodewoRF from the run-associated ensemble spikes alone). For each epoch, we computed the weighted correlation R and its associated Z-score and compared them with those obtained from randomly shuffled data. Figure 5 shows some examples of detected significant replays during post-QW epochs. Qualitative and quantitative assessment of those replay events indicated diverse (forward vs. reverse) spatiotemporal structures.

Figure 5

QW-associated ensemble spike data analysis.

(A) Example of spike rasters and the associated decoded spatial trajectories in quiet wakefulness. The number at the top of each panel indicates the absolute weighted correlation |R|. (B) Examples of detected significant replays. X-axis represents time bin (bin size Δ = 20 ms).

Analysis of SWS-associated spike data

At last, we applied our population decoding methods to experimental SWS-associated hippocampal ensemble spike activity (Dataset 4b). The candidate events with >10% active cells were selected for analysis (Table 3) and each event was treated as an independent epoch.
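The event-selection rule can be sketched as follows, assuming each candidate event is stored as a cells × bins count matrix and a cell counts as active if it fires at least one spike in the event. The function names and toy events are hypothetical, and the strict inequality follows the ">10%" wording above.

```python
import numpy as np

def active_cell_ratio(event_counts):
    """Fraction of cells that fire at least one spike during the event."""
    return float(np.mean(event_counts.sum(axis=1) > 0))

def select_events(events, min_ratio=0.10):
    """Keep candidate events whose active-cell ratio exceeds min_ratio."""
    return [e for e in events if active_cell_ratio(e) > min_ratio]

# Two toy events with 20 cells and 10 bins each.
sparse_event = np.zeros((20, 10), dtype=int)
sparse_event[0, 3] = 1                # 1 of 20 cells active (5%)
dense_event = np.zeros((20, 10), dtype=int)
dense_event[:5, 2] = 1                # 5 of 20 cells active (25%)

selected = select_events([sparse_event, dense_event])
```

Each retained event is then treated as an independent epoch, with its number of bins playing the role of T0 and its active-cell ratio the role of ρ.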

Specifically, there was no difference in ρ between pre- and post-SWS (P = 0.31, rank-sum test; pre-SWS: ρ = 0.175 ± 0.003, maximum 0.45, median 0.16; post-SWS: ρ = 0.178 ± 0.003, maximum 0.53, median 0.16). With Δ = 20 ms, the number of bins per epoch was slightly longer in pre-SWS than in post-SWS epochs for Dataset 4b (P = 0.006, rank-sum test; pre-SWS: T0 = 12.7 ± 0.3 bpe; post-SWS: T0 = 11.9 ± 0.2 bpe). Hippocampal neurons’ mean firing rate remained stable between pre-SWS and wake as well as between post-SWS and wake (Supplementary Fig. 14), although the mean firing rate in wake was significantly higher (Wilcoxon signed rank test, P < 1.3 × 10−5). To examine significant pre- and post-SWS reactivation events, we used the inferred λc(S) during RUN to estimate the state trajectory and posterior probability scores of candidate events during respective pre- and post-SWS periods. Some detected reactivation examples are shown in Fig. 6A,B, respectively. In comparison, the quality of detected post-SWS replay events was qualitatively better in terms of trajectory continuity than that of detected pre-SWS events. We identified statistically significant events based on their computed R and Z-score statistics (Table 3). The absolute number and the ratio of significant events increased from pre-SWS to post-SWS. In addition, the Z-score among the significant events was greater in post-SWS (P < 0.01, rank-sum test). These results suggested that the neuronal ensemble patterns shared more similar structures between post-SWS and RUN than between pre-SWS and RUN—a finding consistent with the pairwise correlation method (ref. 4, see Supplementary Fig. 15) and another independent investigation31.

Figure 6

SWS-associated ensemble spike data analysis.

(A,B) Examples of detected significant pre-SWS (A) and post-SWS (B) reactivation events. The number at the top of each panel indicates |R|. X-axis represents time bin (bin size Δ = 20 ms). (C,D) Testing predictability of RUN and SWS(1) data for SWS(2): scatterplot comparison of weighted correlation (C) and Z-score (D) between RUN → SWS(2) and SWS(1) → SWS(2). Lower left corner marked by dashed line indicates the non-significance zone.

We further examined the nonstationarity of sleep epochs by comparing the results derived from the first and second halves of post-SWS candidate events (i.e., SWS(1) and SWS(2) have the same number of epochs with no less than 10% active cells, as defined in Table 3). For DecodewoRF, we found that the T0, ρ and R statistics were similar between SWS(1) and SWS(2), but the numbers that assess the significance of detected events (Table 3) all decreased in SWS(2). This could be because memory reactivation was more frequent in SWS(1) than in SWS(2), or because the representation power decreased in SWS(2). To test the predictive power of SWS(1) for SWS(2), we applied DecodewoRF to SWS(1) and inferred the SWS-state field λSWS (which was distinct from the state field estimated from spikes alone in run behavior). We then used λSWS to assess the R statistic for SWS(2) and compared that with the R statistic obtained from the run-derived state field. A scatterplot comparison (Fig. 6C,D) showed a decreasing trend in |R| (P = 10−15) and Z-score (P = 1.1 × 10−4, both Wilcoxon signed rank test) from using the run-derived state field to using λSWS, suggesting a reduction of predictive power in SWS(1) → SWS(2).


Interrogating the temporal structure and content of sleep-associated hippocampal ensemble spikes can reveal important mechanisms of hippocampal sequence generation34,35,36 or diverse contributing roles of hippocampal neurons in plasticity31. However, analysis of such spike data has posed a great challenge. In this study, we applied two population decoding methods (DecodewRF and DecodewoRF) to rat hippocampal ensemble spikes recorded in different brain states, aiming to infer the animal’s actual or virtual spatial location based on their spatiotemporal firing patterns. In terms of representation and detection power, population decoding methods are more powerful than the conventional correlation or sequence methods for discovering inherent structures of the ensemble spike data. Moreover, since the latent state corresponds to an abstract or virtual behavioral correlate in DecodewoRF, detecting statistical significance of temporal sequences is not restricted by the line fitting procedure23, which may become an issue for DecodewRF in the presence of cursive trajectories (e.g., in a two-dimensional environment) or in the presence of discontinuity in spatial trajectory (see an example in Supplementary Fig. 16). In addition, our Bayesian inference procedure automatically identifies the model order in DecodewoRF to allow optimal choice of spatial resolution given observed ensemble spikes. From the analyses of both synthetic and experimental data, we found that the representation and detection power of both population decoding methods were strongly dependent on the number of active place cells. Place cells did not contribute evenly to representation (Fig. 3 and Supplementary Fig. 6), and fast-firing neurons did not always contain the most spatial information (bits/spike). In fact, recent findings suggested that slow-firing neurons may contribute more to neuronal sequences from pre- to post-sleep31. Considering the low fraction of active hippocampal cells in sleep and the lognormal distribution of population synchrony25, a large number of recorded place cells are necessary to secure the statistical power for sleep data analysis.

Population decoding methods have proven useful in studying information transmission and sensory coding of neural systems37,38. Here, our model-based decoding approach offers a statistical framework to assess the content of sleep-associated hippocampal ensemble spikes, which may reveal important mechanistic insights into the role of hippocampal neurons in memory consolidation. Similar to other reports18,31, we found that the reactivated spatial trajectories or sequences in hippocampal ensemble representations were better correlated and more sharply defined in post-SWS than in pre-SWS. Nevertheless, several statistical questions still remain unanswered. One puzzle is how to extract significant non-spatial information encoded in sleep. Another pressing issue is to design statistical methods that can adapt to specific temporal (e.g., inhomogeneous, nonstationary and heteroscedastic) structure of ensemble spike data. Thus far, we have used a uniform temporal bin size throughout SWS, yet finding the optimal timescale is critical for decoding analyses. Our current study has focused on hippocampal ripple-associated ensemble spike activity and ignored other spike activities outside of ripples. Analyzing continuous sleep-state spike activity would be the next goal. Notably, hippocampal and cortical neurons operate at a different timescale in REM sleep from SWS. The question of interpreting sparse and sporadic REM-associated hippocampal spike activity remains unresolved. A recent report has revealed similar geometric structure in neural correlations of hippocampal neurons between active navigation and REM sleep39. It would be interesting to test the population decoding methods on such independent recordings. In addition, these methods can be tested to evaluate brain state transition.

In principle, our unsupervised population decoding framework can be applied to hippocampal-cortical or thalamocortical ensemble spikes in sleep10,40,41,42. Joint investigation of spatiotemporal sequences in these circuits during sleep replay events is crucial for inferring the communication and information transfer between these circuits during memory consolidation. Given a large neuronal ensemble, the DecodewoRF method is appealing since it requires no explicit measure of behavior or receptive fields; its latent states may represent non-spatial features of experiences or distinct behavioral patterns that cannot be measured directly. Ultimately, it is critical to discover nonlinear interactions and extract spatiotemporal organization among neuronal ensembles. Integrating such principles with data-driven neuronal models will be key to revealing the intrinsic structure of neuronal ensemble spikes.


Animal behavior and neurophysiological recordings

Long-Evans rats foraged freely in familiar spatial environments for a period of 30–45 minutes (Datasets 1–3). In Datasets 4a and 4b, rats were first placed in a sleep box in a familiar environment for 4 hours, then moved to a circular track (novel environment) to run for about 45 minutes, and then returned to the sleep box for another 4 hours (ref. 31). All procedures were approved by the MIT and NYU Institutional Animal Care and Use Committees and carried out in accordance with the approved guidelines.

Custom microelectrode drives (Datasets 1–3) or silicon probe arrays (Datasets 4a and 4b) were implanted unilaterally or bilaterally in the animal’s dorsal hippocampal CA1 area. Spikes were acquired with a sampling rate of 31.25 kHz and filter settings of 300 Hz–6 kHz. Two infrared diodes alternating at 60 Hz were attached to the drive of each animal for position tracking. We used a custom manual clustering program for spike sorting to obtain well-isolated single units. Further details are given in previous publications23,31. Putative interneurons were identified based on the spike waveform width and average firing rate. In addition, all putative pyramidal neurons selected for analysis had a peak firing rate >1 Hz.

Bayesian decoding

The Bayesian decoding algorithm is formulated within a state-space model framework21,22,28,29,30. Let St represent the animal’s spatial position label at discrete time t and let yt represent the observed neuronal population spike count in the interval ((t − 1)Δ, tΔ], where Δ is the temporal bin size. The state variable St is assumed to follow first-order Markovian dynamics characterized by p(St|St − 1). The goal of Bayesian decoding is to infer the posterior probability p(St|y1:t) given all the spike history up to time t. Here we assumed that, conditional on the state St, the population firing of C hippocampal place cells follows a Poisson likelihood model

where yc,τ denotes the spike count from the c-th neuron at the τ-th temporal bin. In light of Bayes’ rule, the posterior distribution of the state St is given by

where p(St|St−1) denotes the temporal prior and p(y1:t) is a normalizing constant.
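For reference, one way to write the two expressions described above is sketched here. It assumes that the C cells fire independently given the state, that λc(S) is the firing rate of the c-th cell at state S, and that the posterior is obtained by the standard forward recursion. The exact notation is an assumption and may differ from the published equations.

```latex
% Assumed form of the Poisson likelihood, equation (1)
p(\mathbf{y}_{1:t} \mid S_{1:t})
  = \prod_{\tau=1}^{t} \prod_{c=1}^{C}
    \frac{\left(\lambda_c(S_\tau)\,\Delta\right)^{y_{c,\tau}}}{y_{c,\tau}!}
    \exp\!\left(-\lambda_c(S_\tau)\,\Delta\right)

% Assumed recursive form of the posterior
p(S_t \mid \mathbf{y}_{1:t})
  = \frac{p(\mathbf{y}_t \mid S_t)
          \int p(S_t \mid S_{t-1})\, p(S_{t-1}, \mathbf{y}_{1:t-1})\, dS_{t-1}}
         {p(\mathbf{y}_{1:t})}
```

For a discrete state space the integral becomes a sum over the previous state.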

For decoding analysis, we used two population decoding methods. In the first method (DecodewRF, Supplementary Fig. 2A), the animal’s spatial position was measured during run behavior and then used to estimate neuronal receptive fields λc(S) (note: S is continuous-valued and can be finite or infinite, with proper dimensionality depending on the spatial environment). Hippocampal place fields were estimated using a spatial bin size of 10 cm for one-dimensional tracks and a bin size of 5 × 5 cm2 or 15 × 15 cm2 for two-dimensional space, and were further smoothed using a Gaussian template (5 × 1 for one-dimensional or 3 × 3 for two-dimensional environments) with a half SD. This method consists of both encoding and decoding phases, where the encoding phase is supervised.
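The encoding step for a one-dimensional track can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the sample-wise input layout, the interpretation of "a half SD" as an SD of 0.5 bins, and the zero-padded edge handling are all assumptions.

```python
import numpy as np

def gaussian_template(width=5, sd=0.5):
    """Normalized 1-D Gaussian template with `width` taps (SD in units of bins; assumed)."""
    x = np.arange(width) - (width - 1) / 2.0
    k = np.exp(-0.5 * (x / sd) ** 2)
    return k / k.sum()

def place_fields_1d(pos_cm, spikes, dt, track_len_cm, bin_cm=10.0, width=5, sd=0.5):
    """Occupancy-normalized, smoothed rate maps for a linear track.

    pos_cm : (T,) position at each sample
    spikes : (T, C) spike counts of C cells at each sample
    dt     : sample duration in seconds
    Returns an array of shape (C, n_bins) with firing rates in Hz.
    """
    n_bins = int(np.ceil(track_len_cm / bin_cm))
    idx = np.clip((pos_cm // bin_cm).astype(int), 0, n_bins - 1)
    occupancy = np.bincount(idx, minlength=n_bins) * dt      # seconds spent in each bin
    kernel = gaussian_template(width, sd)
    rates = np.zeros((spikes.shape[1], n_bins))
    for c in range(spikes.shape[1]):
        counts = np.bincount(idx, weights=spikes[:, c], minlength=n_bins)
        raw = np.divide(counts, occupancy, out=np.zeros(n_bins), where=occupancy > 0)
        # mode="same" zero-pads, so rates near the track ends are slightly attenuated
        rates[c] = np.convolve(raw, kernel, mode="same")
    return rates
```

Unvisited bins are assigned a rate of zero here; a different convention (for example, masking them) may be preferable depending on coverage.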

In the second method (DecodewoRF, Supplementary Fig. 2B), the animal’s behavioral measures are assumed to be inaccessible, so no place fields can be estimated from the behavioral data. The second method consists only of a decoding phase and is purely unsupervised. In this case, St represents a discrete-state label for the spatial position, and the number of states can be either finite or infinite depending on the statistical assumption, spatial resolution and the size of the data. In this special case, the state space model is a hidden Markov model (HMM); trajectories across spatial locations (“states”) were associated with consistent hippocampal ensemble spiking patterns, which were characterized by a stationary state transition matrix defining p(St|St−1) (e.g., Fig. 1C). The observed spike count data were modeled by the Poisson probability distribution p(yt|St) in equation (1). Unlike DecodewRF, the states of DecodewoRF were subject to permutation ambiguity due to the lack of behavioral measures. The goal of inference is to estimate the maximum a posteriori (MAP) state sequence S1:T together with the unknown state transition matrix and rate parameters λc(S) with respect to the state space S = {Si} (where Si ∈ {1, 2, …, m} are categorical variables). See refs 28, 29, 30 for details of the model description and inference procedure. Briefly, first, we applied a Bayesian nonparametric version of the HMM, the hierarchical Dirichlet process (HDP)-HMM, combined with advanced Markov chain Monte Carlo (MCMC) inference methods30. The number of latent states, m, was automatically inferred by the MCMC inference procedure (Supplementary Fig. 17A). Second, we constructed a “state space map” between the discrete state and the spatial position (see Supplementary Fig. 17B for illustrations). For a one-dimensional environment, the ideal state space map is a one-to-one mapping. Third, to visualize the inferred state transition matrix (Fig. 1C), we applied a force-based algorithm to derive a scale-invariant topology graph that defines the connectivity between different states (nodes) (Fig. 1D), which provided an intuitive interpretation and qualitative assessment of the results.
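To illustrate the MAP state-sequence step only, the sketch below runs the Viterbi algorithm on a finite Poisson HMM whose transition matrix and rates are already given. This is a deliberate simplification: it is not the HDP-HMM, it performs no MCMC inference, and the number of states is fixed rather than inferred. All function and argument names are hypothetical.

```python
import numpy as np

def poisson_loglik(counts, rates, dt):
    """log p(y_t | S_t = i) up to a state-independent constant (the y! term is dropped).

    counts : (T, C) spike counts; rates : (m, C) firing rates in Hz.
    Returns an array of shape (T, m).
    """
    mu = np.maximum(rates * dt, 1e-12)            # expected counts per bin
    return counts @ np.log(mu).T - mu.sum(axis=1)

def viterbi(counts, rates, trans, init, dt):
    """MAP state sequence for a finite Poisson HMM with known parameters."""
    loglik = poisson_loglik(counts, rates, dt)
    T, m = loglik.shape
    log_trans = np.log(np.maximum(trans, 1e-300))
    delta = np.log(np.maximum(init, 1e-300)) + loglik[0]
    back = np.zeros((T, m), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans       # scores[i, j]: best path ending in i, then i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + loglik[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 1, 0, -1):                 # backtrack
        path[t - 1] = back[t, path[t]]
    return path
```

Because the state labels are arbitrary, the returned sequence is only meaningful up to a permutation of states until it is related to position through a state space map.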

In the testing phase, the two population decoding methods operated in the same way, except that they used different λc(S) (one constructed from behavior and the other estimated from spikes alone). We applied these two methods to reconstruct the spatial position or state S at each time bin and computed the average MAP probability score across multiple bins.
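The testing phase can be sketched as a recursive Bayesian filter followed by a per-event score. This is a minimal sketch under stated assumptions: a discrete state space, a uniform initial prior, and the reading of "average MAP probability score" as the mean of the per-bin maximum posterior probability. The function names are hypothetical.

```python
import numpy as np

def bayes_filter(counts, fields, trans, dt):
    """Recursive Bayesian decoding with a Poisson likelihood.

    counts : (T, C) spike counts per temporal bin
    fields : (n_states, C) rates lambda_c(S) in Hz (from behavior or from the HMM)
    trans  : (n_states, n_states) row-stochastic matrix p(S_t | S_{t-1})
    Returns the posterior p(S_t | y_{1:t}) as an array of shape (T, n_states).
    """
    mu = np.maximum(fields * dt, 1e-12)
    loglik = counts @ np.log(mu).T - mu.sum(axis=1)   # y! term omitted (state-independent)
    T, n = loglik.shape
    post = np.zeros((T, n))
    prior = np.full(n, 1.0 / n)                        # assumed uniform initial prior
    for t in range(T):
        lik = np.exp(loglik[t] - loglik[t].max())      # rescale for numerical stability
        p = lik * prior
        post[t] = p / p.sum()
        prior = post[t] @ trans                        # one-step prediction
    return post

def map_score(post):
    """MAP state per bin and the mean of the per-bin MAP probabilities."""
    return post.argmax(axis=1), post.max(axis=1).mean()
```

Passing the behaviorally estimated place fields as `fields` corresponds to DecodewRF, and passing rates learned from spikes alone corresponds to DecodewoRF.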

Information rate of hippocampal neurons

Information-theoretic measures have been used to characterize the information of hippocampal neurons43. We define the spatial information rate of the c-th hippocampal neuron as follows

where λc(S) denotes the mean firing rate at spatial location S and λc = ∫λc(S)p(S)dS denotes the overall average firing rate (spikes/s). Ic is measured in bits/s. To account for the effect of the overall firing rate, we also compute the normalized information rate, Ic/λc, measured in bits/spike.
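A discretized version of this measure can be sketched as follows. The sketch assumes the commonly used form Ic = Σ p(S) λc(S) log2(λc(S)/λc), summed over spatial bins, which is consistent with the definitions above but is an assumption here; the function name is hypothetical.

```python
import numpy as np

def spatial_information(rate_map, occupancy):
    """Spatial information of one cell from a binned rate map.

    rate_map  : (n_bins,) firing rate lambda_c(S) in Hz
    occupancy : (n_bins,) time spent (or probability) in each bin
    Returns (bits_per_second, bits_per_spike).
    """
    p = np.asarray(occupancy, dtype=float)
    p = p / p.sum()                                  # occupancy probability p(S)
    lam = np.asarray(rate_map, dtype=float)
    mean_rate = np.sum(p * lam)                      # lambda_c
    if mean_rate <= 0:
        return 0.0, 0.0
    terms = np.zeros_like(lam)
    nz = lam > 0                                     # bins with zero rate contribute 0
    terms[nz] = p[nz] * lam[nz] * np.log2(lam[nz] / mean_rate)
    bits_per_s = float(terms.sum())
    return bits_per_s, bits_per_s / mean_rate
```

For example, a cell firing at 8 Hz in one of four equally occupied bins and silent elsewhere has a mean rate of 2 Hz, an information rate of 4 bits/s and a normalized rate of 2 bits/spike.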

Statistical assessment

For DecodewRF, we computed the median decoding error between the animal’s estimated position and its actual position. For DecodewoRF, the animal’s actual position was used solely for assessing the results. Based on the state space map, we estimated the animal’s spatial trajectories and computed the median decoding error29,30.

Statistically significant reactivation events were determined by three established criteria17,18: (1) The absolute “weighted correlation” R (which measures the strength of correlation between the changes in probability values across time and spatial position) is greater than 0.5. (2) The event duration is greater than five temporal bins (i.e., 100 ms for QW or SWS epochs). In addition, an event whose MAP probability score is equal to or less than the threshold (5/total number of position bins, below which decoding is at chance level) is also regarded as non-significant. To assess significance, we generated shuffled candidate events from each pre-identified candidate event and computed Rshuffle from the randomly shuffled population spike data. Two types of shuffling operations were considered: temporal shuffling and cell shuffling. Algebraically, the spike count matrix was subject to both row (temporal) and column (cell) shuffle operations. A total of 1000 shuffled samples were constructed. From the raw and shuffled statistics, we computed the Z-score for R as follows31: Z = (R − mean of Rshuffle)/(SD of Rshuffle). (3) The Z-score of R is greater than 1.65 (equivalent to a one-sided P-value of 0.05, assuming that the null distribution is normal). A high positive Z-score indicates that the raw data statistic is much greater than those obtained by chance (null hypothesis) and is therefore highly significant in a statistical sense. If the null distribution (of shuffle statistics) was not normally distributed (as assessed by the Shapiro-Wilk test or Anderson-Darling test), we derived Monte Carlo P-values from the sample distribution.
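The shuffle test can be sketched as follows. Several details are assumptions rather than the authors' implementation: the weighted correlation is taken to be the posterior-weighted Pearson correlation between time bin and position bin, each shuffled sample applies both a row and a column permutation, the Z-score uses the signed R exactly as in the formula above, and `decode` is a hypothetical callable that maps a spike count matrix to a posterior matrix.

```python
import numpy as np

def weighted_correlation(post):
    """Posterior-weighted correlation between time and position.

    post : (T, N) posterior over N position bins at each of T temporal bins.
    """
    T, N = post.shape
    w = post / post.sum()
    t = np.arange(T)[:, None]
    x = np.arange(N)[None, :]
    mt, mx = (w * t).sum(), (w * x).sum()             # weighted means
    cov_tx = (w * (t - mt) * (x - mx)).sum()
    var_t = (w * (t - mt) ** 2).sum()
    var_x = (w * (x - mx) ** 2).sum()
    denom = np.sqrt(var_t * var_x)
    return cov_tx / denom if denom > 0 else 0.0

def shuffle_zscore(counts, decode, n_shuffles=1000, seed=0):
    """Z-score and one-sided Monte Carlo P-value of R against shuffled data.

    counts : (T, C) spike count matrix (rows: temporal bins, columns: cells)
    decode : callable, counts -> (T, N) posterior (hypothetical interface)
    """
    rng = np.random.default_rng(seed)
    r = weighted_correlation(decode(counts))
    r_shuf = np.empty(n_shuffles)
    for k in range(n_shuffles):
        shuffled = counts[rng.permutation(counts.shape[0])]         # temporal (row) shuffle
        shuffled = shuffled[:, rng.permutation(counts.shape[1])]    # cell (column) shuffle
        r_shuf[k] = weighted_correlation(decode(shuffled))
    z = (r - r_shuf.mean()) / r_shuf.std(ddof=1)
    p_mc = (1 + np.sum(r_shuf >= r)) / (n_shuffles + 1)             # one-sided
    return r, z, p_mc
```

An event would then pass criteria (1) and (3) when `abs(r) > 0.5` and `z > 1.65`, with `p_mc` serving as the fallback when the shuffle distribution is not normal.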

Identification of hippocampal ripple candidate events during sleep and quiet wakefulness

During sleep, we focused on SWS epochs, which were primarily identified by low EMG amplitude and a high delta/theta power ratio in EEG activity (REM sleep is associated with low EMG, a low delta/theta power ratio and high theta power). For screening the candidate events, we used hippocampal LFP ripple band (150–300 Hz) power combined with hippocampal multi-unit activity (threshold > mean + 3SD). We also imposed a minimum cell activation criterion (>6 cells or >10% of the cell population, whichever is greater). Similar LFP and multi-unit activity criteria were also applied to QW periods, when the animal was in an immobile wake state (speed <2 cm s−1).
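The screening step can be sketched as follows. How the ripple power and multi-unit activity criteria were combined is not specified above, so this sketch assumes that both signals must exceed their own mean + 3SD threshold at the same time and that all inputs share one time base. The function names are hypothetical.

```python
import numpy as np

def above_threshold(x, n_sd=3.0):
    """Boolean mask of samples exceeding mean + n_sd * SD of the signal."""
    return x > x.mean() + n_sd * x.std()

def candidate_events(ripple_power, mua, counts, n_sd=3.0, min_cells=6, min_frac=0.10):
    """Candidate events as (start, stop) bin indices, stop exclusive.

    ripple_power : (T,) ripple-band (150-300 Hz) power per bin
    mua          : (T,) multi-unit activity per bin
    counts       : (T, C) spike counts of C sorted cells per bin
    """
    mask = above_threshold(ripple_power, n_sd) & above_threshold(mua, n_sd)
    edges = np.diff(np.concatenate(([0], mask.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    need = max(min_cells, min_frac * counts.shape[1])   # whichever is greater
    events = []
    for a, b in zip(starts, stops):
        n_active = np.count_nonzero(counts[a:b].sum(axis=0))
        if n_active > need:                              # minimum cell activation criterion
            events.append((int(a), int(b)))
    return events
```

The thresholds here are computed over whatever span of data is passed in, so in practice the inputs would be restricted to SWS or QW epochs beforehand.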

Additional Information

How to cite this article: Chen, Z. et al. Uncovering representations of sleep-associated hippocampal ensemble spike activity. Sci. Rep. 6, 32193; doi: 10.1038/srep32193 (2016).