Introduction

How does the brain allow us to discriminate between similar events in our past? This question is a central challenge in the neurobiology of memory, and its answer remains elusive. To prevent confusion between memories that share similar features, the brain needs to store distinct activity patterns to represent distinct memories. In the influential Hebb-Marr framework of episodic memory1,2, representations are stored in area CA3 of hippocampus, an auto-associative network where plastic recurrent excitatory connections facilitate recall of stored patterns in response to partial cues3,4. However, strong recurrent excitation severely limits the number of patterns that can be stored without overlap4,5. Such overlap would lead, when a partial cue common to several patterns is presented, to the reactivation of many patterns and thus to confusion or confabulation. To avoid such interference, the Hebb-Marr framework proposes that redundancy between input patterns is reduced before they are stored. This process of transforming similar input patterns into less similar output patterns is termed pattern separation5,6.

Theoretical models suggest that the dentate gyrus (DG) performs pattern separation of cortical inputs before sending its differentiated outputs to CA31,2,4. Indeed, DG is ideally located to do this, receiving signals via the major projection from entorhinal cortex (EC), the perforant path (PP), and sending signals to CA3 via the axons of granule cells (GCs)7. In addition, behavioral studies have shown that DG lesions impair mnemonic discrimination8,9,10,11,12 and several experimental reports have shown that similar environments or events are represented differently in the DG13,14,15,16,17. However, this separation of DG representations could be inherited from upstream structures (e.g. EC) and simply reported by DG. Therefore, a rigorous demonstration that pattern separation is performed by DG requires simultaneous knowledge of its inputs and outputs6. Some electrophysiological studies suggest that EC spatial representations are on average more correlated than in DG13,14,18,19, but the recorded EC neurons were unlikely to contact the recorded DG neurons, and were not recorded at the same time: a direct test of whether DG itself performs pattern separation on EC inputs is thus still lacking.

Another difficulty in studying pattern separation lies in defining the nature of “activity patterns”. Previous studies have focused on spatial patterns of “active neurons”, with little reference to the dynamics of neural activity. For example, computational models predict that DG separates overlapping populations of active EC neurons into less overlapping populations of active GCs5,20,21,22,23. Immediate-early gene expression studies have confirmed that distinct events drive plasticity in different populations of GCs15,24 and that overlap in these representations causes mnemonic interference25. In contrast, in vivo single-unit recordings in the DG found that similar contexts are represented by the same population of active neurons, but differences are encoded by different spatially tuned firing patterns13,14,17.

These conflicting results show that pattern separation could correspond to different computations depending on the type of patterns investigated, and that multiple forms of pattern separation could in theory be implemented by the DG6. For example, because in vivo recordings suggest that the same neurons can be used to code different environments13,14,17, it is possible that pattern separation is performed at the level of single GCs, each disambiguating the activity patterns that it receives. Such disambiguation could be done by changing firing rates, or alternatively, by changing spike timing. Previous experimental investigations of pattern separation in DG examined population vectors of place fields averaged over minutes13,14,26, but place cells also carry information at shorter timescales27,28,29,30. So far, pattern separation has not been well characterized on the scale of milliseconds, and never where patterns are explicitly afferent and efferent trains of action potentials.

Thus, whether the DG network per se reduces the overlap between similar inputs, and how it performs this computation, remain a mystery, especially at short timescales. Here, we set out to test the hypothesis that the DG performs pattern separation of cortical spiketrains at the level of single GCs. We designed a novel pattern separation assay in acute brain slices to take advantage of the experimental control afforded by slice electrophysiology. Complex input spiketrains of varying similarities were fed into the DG via its afferents, and the output of a single GC was simultaneously recorded, allowing the first direct measure of pattern separation (by comparing input similarity versus output similarity), on timescales relevant to neuronal encoding and synaptic plasticity. Finally, we explored whether other cell types in the DG and CA3 exhibited this form of pattern separation and investigated the role of neuronal noise and synaptic dynamics in supporting this computation.

Results

Temporal pattern separation in individual dentate granule cells

A direct test of pattern separation in single GCs requires knowledge of the similarity between input patterns arriving via the PP and comparison with the similarity between GC output patterns. Here, we define input and output patterns as rasters of spiketrains. The similarity between two spiketrains was assessed by computing their pairwise Pearson’s correlation coefficient (R) using a binning window τw of 10 ms (unless otherwise specified). We generated sets of Poisson input spiketrains (simulating trains of incoming cortical action potentials), with each set having an average correlation Rinput (Fig. 1a and Methods – Pattern separation experiments). We then recorded the spiking responses of GCs to these sets of input trains delivered to lateral PP fibers (Fig. 1b,c) (102 recording sets from 28 GCs in adolescent mice), allowing us to compute the average output correlation (Routput) (Fig. 2a,b).
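As an illustration of the similarity measure described above, the following minimal sketch bins spiketrains with a window τw and averages the pairwise Pearson coefficients of a set. The function names, the `duration_ms` parameter and the choice to discard spikes outside the sweep are assumptions made for this example, not details taken from the Methods.

```python
import math
from itertools import combinations

def bin_train(spike_times_ms, duration_ms, tau_w_ms=10.0):
    """Count spikes in consecutive windows of width tau_w_ms."""
    n_bins = math.ceil(duration_ms / tau_w_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        if 0 <= t < duration_ms:  # assumption: spikes outside the sweep are ignored
            counts[int(t // tau_w_ms)] += 1
    return counts

def pearson(x, y):
    """Pearson's R between two equal-length vectors (nan if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def mean_pairwise_r(trains, duration_ms, tau_w_ms=10.0):
    """Average R over all distinct pairs of trains (diagonal excluded)."""
    binned = [bin_train(t, duration_ms, tau_w_ms) for t in trains]
    rs = [pearson(a, b) for a, b in combinations(binned, 2)]
    return sum(rs) / len(rs)
```

Applied to the five trains of an input set, `mean_pairwise_r` would give a value playing the role of Rinput; changing `tau_w_ms` changes the timescale at which similarity is assessed.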

Figure 1

Pattern separation assay in acute brain slices at the single cell level. (a) Examples of input sets. Top: each input set comprises five different trains of electrical pulses. Bottom: correlation coefficient matrix for each input set, each square representing the correlation coefficient between two input trains measured with a binning window (τw) of 10 ms. Rinput is the average of coefficients, diagonal excluded. (b) Histology of the DG in a horizontal slice (Cresyl violet/Nissl staining; scale bar: 250 µm), overlaid with a schematic of the experimental setup: a theta pipette in the ML (input) is used to focally stimulate the PP while a responding GC (output) is recorded via whole-cell patch-clamp. (GCL: granule cell layer, H: hilus, ML: molecular layer, FS: fast-spiking interneuron. Solid lines represent dendrites and dashed lines axons). (c) Current-clamp recordings of the membrane potential of two different GCs in response to different input sets (Top: Rinput = 1; Bottom: Rinput = 0.76). Each input set (five input traces) is repeated ten times (only three repetitions are shown, with spikes truncated at 0 mV). In the bottom graph, input trains and their respective children output spiketrains have matching colors.

Figure 2

Input spiketrains are decorrelated at the level of individual granule cells. (a) Example of a recording set (input set + output set): the raster plot shows one set of input spiketrains and the children output spiketrains recorded from one GC, reordered to display output subsets (i.e., the ten children coming from one parent input spiketrain) together and with the same color. (b) Corresponding 55 × 55 correlation coefficient matrix using a binning window (τw) of 10 ms. Each small square represents the correlation coefficient between two spiketrains. Routput is defined as the mean of correlations between individual output spiketrains driven by different input spiketrains, as outlined by the bold blue border, which excludes comparisons between outputs generated from the same parent input. (c) Data points, corresponding to 102 recording sets (28 GCs), are all below the identity line (dashed line). This means that Routput was lower than Rinput for all recordings, thus demonstrating pattern separation. (d) Left: Effective decorrelation averaged over all recording sets as a function of Rinput. Although there is a significant decorrelation for all tested input sets (one-sample T-tests: the blue shade indicates the 95% confidence interval that average decorrelation is significantly above 0), they are effectively decorrelated to different magnitudes (one-way ANOVA, p < 0.0001). Right: Matrix of p-values from post-hoc Tukey-Kramer tests comparing effective decorrelation levels across all pairs of parent input sets. The asterisk and star correspond to the comparisons displayed in the left panel. This analysis shows that the decorrelation is significantly different (higher) for highly similar input spiketrains than for already dissimilar inputs. (e) When the effective decorrelation is normalized to the correlation of the input set, there is no significant difference between input sets (ANOVA, p = 0.19). In all graphs, τw = 10 ms. Means and SEM in black.

For every recording set, Routput was lower than the Rinput of the associated input set, indicating a decorrelation of the output spiketrains compared to their inputs (Fig. 2c). This was also the case in GCs recorded in slices from adult mice (35 recording sets from 14 neurons) (see final figure). The effective decorrelation, defined as the difference between Rinput and Routput, was statistically significant for every input set (Fig. 2d left), but was larger when input spiketrains were highly correlated (Fig. 2d). This is consistent with the role of DG in discriminating between similar memories more than already dissimilar ones9. Note, however, that the decorrelation normalized to Rinput is invariant: whatever the input set, the output trains were always decorrelated to ~70% of Rinput (Fig. 2e). Such invariance suggests that the same decorrelating mechanism is used on all input sets.
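The quantities used in this paragraph can be written compactly. The sketch below assumes that a correlation matrix between output spiketrains has already been computed and that each output train is labelled with the index of its parent input train; the function names are hypothetical, and the normalized decorrelation is taken here to be (Rinput − Routput) / Rinput, which is our reading of "normalized to Rinput".

```python
def r_output(corr, parents):
    """Mean correlation over pairs of output trains whose parent inputs differ."""
    n = len(parents)
    vals = [corr[i][j]
            for i in range(n)
            for j in range(i + 1, n)
            if parents[i] != parents[j]]  # skip children of the same parent
    return sum(vals) / len(vals)

def decorrelation(r_in, r_out):
    """Return (effective, normalized) decorrelation for one recording set."""
    effective = r_in - r_out
    # assumption: normalization divides by Rinput; undefined when Rinput is 0
    normalized = effective / r_in if r_in != 0 else float("nan")
    return effective, normalized
```

With these definitions, a recording set lying below the identity line has a positive effective decorrelation, and a normalized value near 0.7 would correspond to the invariance reported in Fig. 2e.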

These results constitute the first direct experimental demonstration that single GCs, the output neurons of DG, exhibit temporal pattern separation. A comparison of the spiking patterns recorded in multiple GCs also shows that different GCs tend to process the same input spiketrain in different ways (Fig. S1). Such diversity could support pattern separation at the population level.

Temporal pattern separation in other cell types of the DG network

Any channel processing inputs and returning outputs, and thus any brain network, performs either pattern separation or pattern convergence to some degree6. Thus, GCs are unlikely to be the only neurons to exhibit temporal pattern separation of spiketrains. However, we would expect pattern separation to be at its greatest in GCs, at least among DG cells, because they are the output neurons of the DG and thus should provide the most separated patterns to CA3 before they are stored. To test this hypothesis, we performed the same pattern separation assay while recording from fast-spiking interneurons (FS, 20 recording sets) (Fig. 3a) or hilar mossy cells (HMC, 18 recording sets) (Fig. 3b).

Figure 3

Different levels of temporal pattern separation in different DG cell types. (a1) Picture of a recorded FS filled with biocytin (black). In the case of simultaneous recordings, the recorded GCs were close to the FS, as depicted by the schematic in green. In different experiments, we recorded from putative hilar mossy cells (HMC, orange). Full lines represent dendrites, dashed lines axons. (a2-3) Example of a simultaneous whole-cell recording of a GC and a neighboring FS. (a2) Simultaneous membrane potential recordings (baseline at −60 mV) of a FS and a GC to the same set of current steps (−25, 100, 500 and 1000 pA). (a3) Simultaneous current-clamp recordings of the same FS and GC as in a2 in response to the five input traces of an input set with Rinput = 0.65. Simultaneous input and output trains have the same color. (a4) Routput versus Rinput for FSs and GCs. Data points correspond to recording sets: 20 for FS (red, 4 cells, 4 per input set), and 61 for GC (green, with a darker shade open circle when simultaneously recorded with a FS, 13 cells, 11–13 per input set). All GC recordings done at the same input correlations as FS recordings were used for an unpaired comparison showing that FS exhibit less spiketrain decorrelation than GCs: ANCOVA: p < 0.0001, 95% confidence bounds around the linear fits shown as shaded areas; two-way ANOVA: input sets: p = 0.0016, cell-types: p < 0.0001, interaction: p = 0.72. Post-hoc T-tests with Bonferroni corrections for 5 comparison groups: all p < 0.05 (for decreasing Rinput, p = 0.0307, 0.0181, 0.0007, 0.0001, 0.0122). (a5) Effective decorrelation (Rinput – Routput) for FS and GC. Shaded areas represent the 95% confidence interval of a bootstrap test comparing the mean decorrelation of both celltypes: GCs exhibit significantly more pattern separation than FS for all input sets. (a4–5) Note that when comparing only the simultaneous GC and FS recordings, we found a similarly significant difference between celltypes. 
(b1) Membrane potential of a hilar neuron in response to current steps (−100, 0, 100, 400 pA; baseline at −70 mV), showing a spontaneous barrage of EPSPs, regular spiking, and a lack of large after-hyperpolarization, all typical features of HMCs (Larimer & Strowbridge, 2008). (b2) Current-clamp recordings of same HMC in response to a set of five input trains (Rinput = 0.48, τw = 10 ms). HMCs fire occasional bursts of spikes (marked by asterisks) in response to a single input, which was not seen in GCs. (b3) Routput versus Rinput for HMCs and GCs. Data points correspond to recording sets: 18 for HMC (orange, 11 cells, 5–7 per input set), and 22 for GC (green, 11 cells, 4–10 per input set). An unpaired comparison suggests that HMCs and GCs show only slight differences in pattern separation measured at τw = 10 ms: ANCOVA: p = 0.15, 95% confidence bounds around the linear fits shown as shaded areas; two-way ANOVA: input sets: p = 0.0004, cell-types: p = 0.074, interaction: p = 0.57. Post-hoc T-tests with Bonferroni corrections for three comparison groups (for decreasing Rinput): p = 1, 0.05, 0.21.

At the 10 ms timescale, the distributions of average (Rinput, Routput) were significantly different between FSs and GCs, with the Routput of simultaneously recorded GCs always lower than that of the corresponding FS (Fig. 3a4): FSs exhibit lower levels of decorrelation than GCs (Fig. 3a5). For HMCs, although their current-clamp responses and spiking behavior looked different from those of GCs (Fig. 3b2), the distribution of average (Rinput, Routput) appeared only slightly different from that of GCs with this analysis (Fig. 3b3). Differences were more obvious with a finer-grained analysis (detailed below in Results – Temporal pattern separation at different timescales across celltypes).

Temporal pattern separation in CA3 pyramidal cells

Our FS recordings have shown that not all neurons necessarily exhibit as high levels of temporal pattern separation as GCs. We thus asked whether this form of pattern separation is specific to the DG output, or whether neurons in other hippocampal regions can perform temporal pattern separation. To answer this question, we adapted our pattern separation assay to CA3 pyramidal cells (PCs) (Methods – Pattern separation experiments) and recorded their output spiketrains in response to direct stimulation of the GCL, i.e. in response to their input from the DG (Fig. 4a). Due to strong feedforward inhibition, CA3 PCs do not generally spike in response to external stimulation of afferents in slices, unless inhibition is totally blocked31,32,33. By using 30 Hz Poisson input sets (Fig. 4b) and adding 100 nM of gabazine to the bath, which only slightly decreases the amplitude of GABA-A-mediated IPSCs (see Methods), we managed to record for the first time the spiking output of CA3 PCs in response to complex input spiketrains while preserving some inhibition in the network (Fig. 4c). PCs fired during periods of high input frequency, probably due to depression of local inhibitory transmission31,34 combined with facilitation at the mossy fiber-CA3 PC synapses35, and as a result action potentials often appeared more clustered than in control GC recordings (Fig. 4c). Despite this spiking pattern, we found that, at the 10 ms timescale, the average (Rinput, Routput) distribution of CA3 PCs was slightly but significantly lower than that of GCs recorded in similar pharmacological conditions (Fig. 4d). This indicates that CA3 PCs differ from GCs with respect to temporal pattern separation, but that, surprisingly, their output spiketrains are even more decorrelated. Temporal pattern separation is thus not specific to the DG.

Figure 4

CA3 pyramidal cells exhibit more temporal pattern separation than GCs. (a) Schematic of the experimental setup: a theta pipette in the granule cell layer (GCL) is used to stimulate GCs and their mossy fibers (dashed line) to evoke action potentials in CA3 pyramidal cells recorded via whole-cell patch-clamp. To limit feedforward inhibition and allow pyramidal cells to spike, experiments were performed under partial block of inhibition (100 nM of gabazine). Control experiments were performed in GCs under the same pharmacological conditions but with OML stimulations. (b) Two input sets of 30 Hz Poisson trains were used. Top: rasters of the five spiketrains of each set. Bottom: correlation coefficient matrix for each input set (τw = 10 ms). (c) Example of current-clamp recordings from a pyramidal cell and a GC from the same animal. Left: Membrane potential responses to current steps (−100 pA, 100 pA, 350 pA). Scale: 100 ms (horizontal) and 50 mV (vertical). Right: example of 5 output responses (sweeps 11–15) to an input set (Rinput = 0.76). (d) Routput versus Rinput for CA3 pyramidal cells and GCs. Data points correspond to recording sets: 15 for CA3 (black, 14 cells, 6–9 recordings per input set), and 22 for GC (green, 13 cells, 11 per input set). The CA3 distribution is lower and significantly different from the GC distribution: ANCOVA: p < 0.01, 95% confidence bounds around the linear fits shown as shaded areas; two-way ANOVA: input sets: p < 0.0001, cell-types: p = 0.0036, interaction: p = 0.24. Post-hoc T-tests with Bonferroni corrections for two comparison groups: p = 0.032 for Rinput = 0.76, p = 0.1 for Rinput = 0.21.

Temporal pattern separation at different timescales across celltypes

Because the timescales meaningful for the brain remain uncertain, it is important to assess the separation of spiketrains at different timescales. We therefore binned spiketrains using a range of τw from 5 ms to 250 ms and performed a finer grained analysis using pairwise Rinput and associated pairwise Routput instead of the average across the ten pairs of input trains (Fig. 5a). We discovered that pattern separation levels can dramatically change as a function of τw. In GCs, the larger the timescale the less they exhibit decorrelation of their input spiketrains, especially at high Rinput (Fig. 5b). Nonetheless, GCs still exhibit relatively high levels of pattern separation of highly similar input spiketrains, even at long timescales (Fig. 5b).
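The pairwise analysis can be sketched as follows, assuming that correlation matrices for the input trains and for the output trains have already been computed at a given τw, and that each output train carries the index of its parent input. The function and argument names are hypothetical.

```python
from itertools import combinations

def pairwise_points(corr_in, corr_out, parents):
    """Map each pair of input trains (p, q) to (pairwise Rinput, pairwise Routput).

    Pairwise Routput is the mean correlation between every child of input p
    and every child of input q.
    """
    points = {}
    for p, q in combinations(range(len(corr_in)), 2):
        vals = [corr_out[i][j]
                for i, pi in enumerate(parents) if pi == p
                for j, pj in enumerate(parents) if pj == q]
        points[(p, q)] = (corr_in[p][q], sum(vals) / len(vals))
    return points
```

For a set of five input trains this yields ten points per recording set. Repeating the computation with matrices obtained at each τw between 5 ms and 250 ms gives one family of points per timescale.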

Figure 5

Differences in temporal pattern separation between hippocampal celltypes depend on the timescale. (a) Pairwise analysis. Instead of averaging across all five input trains of a set (here Rinput = 0.76 at τw = 10 ms, same matrix as in Fig. 2b), we average children output coefficients corresponding to a single pair of input trains (identified by color-coded squares). This finer analysis is necessary at τw higher than 20 ms because the pairwise input correlation coefficients are no longer well constrained around their mean (see right panel). Middle and Right: mean across cells, following the same color-code as displayed in the matrix (Left). (b) Pairwise Routput vs pairwise Rinput for GCs, measured with different binning windows τw. Only the means across cells are displayed, but the regression line was fitted on the full distribution of data points. The larger τw, the less GCs exhibit pattern separation. (c) Top: Pairwise analysis on FS recordings (same as in b). Bottom: mean ± SEM and regression lines for τw = 100 ms: FS and GC distributions are still different, especially at high input similarity (ANCOVA: p < 0.0001 for τw = 10 up to 250 ms). Lower right inset: effect size as a function of the timescale τw (up to 250 ms): Mean ± SEM (red) and maximum (black dots) of the absolute difference between FSs and GCs Routput mean values for all pairwise Rinput. Note a decreasing effect of larger τw on the difference between FSs and GCs in terms of pattern separation. (d) Top: Pairwise analysis on HMC recordings (same color code as in b). Bottom: At τw = 250 ms, HMC and GC distributions are different (ANCOVA: p < 0.0001 for τw = 10 up to 250 ms), especially at lower input similarity: notice the points above the identity line, showing pattern convergence for HMCs in contrast to GCs. Under this pairwise analysis (in contrast to Fig. 3b), HMCs and GCs are significantly different at τw = 10 ms (comparison graph not shown, see inset for effect size).
However, the inset (see c for legends) shows an increasing effect of τw on the difference between HMCs and GCs in terms of pattern separation. (e1) Pairwise analysis on CA3 pyramidal cells and GCs, both under partial inhibition block and under 30 Hz inputs (same color code as in b). (e2) At τw = 100 ms and 250 ms, CA3 and GC distributions are still different (ANCOVA: p < 0.0001). The inset (see c for legends) shows an increasing effect of τw (plateauing at large timescales) on the difference between CA3 PCs and GCs.

This analysis confirmed that, at short timescales, FSs exhibit less pattern separation than GCs (Fig. 5c) and revealed significant differences between HMCs and GCs, especially for pairs of input spiketrains with low Rinput (Fig. 5d), which were not as obvious in our previous coarse analysis (Fig. 3b3). At longer timescales, the variability across neurons has a tendency to increase for all celltypes, but the average levels of decorrelation of both FSs and HMCs stayed highly significantly different from those of GCs (Fig. 5c,d). Interestingly, the difference between FSs and GCs was larger at short timescales, whereas the difference between HMCs and GCs increased with larger timescales. Indeed, at τw above 100 ms, HMCs can often exhibit pattern convergence instead of pattern separation, especially for pairs of already dissimilar input spiketrains (low or negative pairwise Rinput), whereas FSs just show weak or no pattern separation.

CA3 PCs still exhibit high levels of temporal pattern separation at long timescales, and their difference from GCs increases (Fig. 5e). Interestingly, in contrast to all other tested celltypes and conditions (Fig. 5b–d), PCs even show a dramatic increase in their average levels of decorrelation at the 250 ms timescale (Fig. 5e).

Overall, these results show that among the tested DG celltypes, GCs exhibit the highest levels of temporal pattern separation of cortical spiketrains across all timescales. Our findings also suggest that the high level of separation by the DG is amplified in CA3.

Mechanism of temporal pattern separation: neural noise

To determine what mechanisms might support temporal pattern separation in GCs, it is necessary to understand its dynamics first. Limiting our analysis to the first presentation of an input set revealed that outputs were already significantly decorrelated (Fig. 6a,b). This shows that the separation mechanism is fast, consistent with the fact that the brain generally does not have the opportunity to average repeated signals and that separation must happen immediately during encoding. In addition, analysis of the last presentation revealed only modestly more separation than for the first one, and only for high input correlations (Fig. 6c), suggesting that learning to recognize the input pattern is not critical.

Figure 6

Input spiketrains are efficiently separated upon their first presentations. (a) Two of five inputs are shown with corresponding output spiketrains. The first output sweep is marked with a pink bar (right) and last sweep is marked with a blue bar. (b) Routput, computed from the first sweep of five output trains only (pink), as a function of Rinput, with linear fit. All data points are below the identity line indicating that outputs are effectively decorrelated compared to their inputs even when input patterns have only been presented once each. The average decorrelation (Rinput − Routput) is significant for all input sets (one-sample T-tests, p < 0.01) except for Rinput = 0.11 (p = 0.1). (c) Left: Output correlations (Mean ± SEM) between spiketrains of the first sweep (pink) and the last sweep (blue). There is no significant difference (ANCOVA, p = 0.33). Right: When taking into account that the two distributions are paired, we detect that a few output correlations are significantly lower for the last sweep than for the first one (one-sample T-test on the difference between Routput of the first and last sweep of each recording set, asterisks signify p < 0.05). This is evidence, though weak, that repetition of input spiketrains might improve pattern separation for highly similar inputs.

Because the mechanism for temporal pattern separation is fast and does not require learning, we asked first whether intrinsic properties of GCs could play a role. Linear regression analysis revealed that the membrane capacitance, resistance, time constant as well as the resting membrane potential are not predictors of decorrelation in GCs (see low R² in Table S1). Another hypothesis is that randomness in neuronal responses drives the decorrelation. Indeed, when the same input spiketrain is repeated (i.e. Rinput = 1) the output spiketrains are not well correlated (as shown by the mean spiketrain reliability Rw) (Fig. 7a), consistent with well-known trial-to-trial variability in single neuron responses36,37,38. Theoretical investigation of pattern separation often relies on some sort of random process such as probabilistic neuronal activation5 or stochastic firing39, which suggests that “neural noise” is a likely contributor to any form of pattern separation. However, because “neural noise” can cover multiple different definitions and phenomena36, determining its role in a complex computation is not trivial.

Figure 7

Pattern separation in single GCs is not explained by simple neural noise. (a) The variability of output spiketrains in response to the same input train sets the upper bound for Routput. Left: Correlations between pairs of output spiketrains associated with different input trains (Routput) and pairs of different output spiketrains associated with the same input train (enclosed by green, Rw: spiketrain reliability, the reproducibility of the output given the same input). Right: Frequency distribution of Rw for all recordings (green; dark green line is the mean: <Rw> = 0.3), overlaid on the distribution of 102 (Routput, Rinput) points and its regression line (black). Note that means <Rw> and <Routput> for Rinput = 1 are close because they both assess the reproducibility of the output when the input is the same. (b) Characterization of neural noise. Top: example of input and output spiketrains illustrating variable delay of the response spike after an input spike (d1, d2) or failure to spike after an input spike (red cross). Bottom: Example from one GC recording. The spike-wise noise in output spiketrains is characterized by the average spike delay, the standard deviation of this delay (jitter) and the probability of spiking after an input spike (spiking probability, SP). (c–e) Effect of spike shuffling on Routput and Rw. For each output spiketrain, each original spike was reassigned to the time of a randomly chosen input spike. This shuffling was performed 100 times for each of the 102 original GC recording sets, producing 10,200 shuffled recording sets. (c) The (Routput, Rinput) GC distribution (green) is overlaid with the 95% sample interval of the distribution of the 10,200 shuffled recording sets (grey area: 2.5 to 97.5 percentiles). 
A linear regression was performed for each of the 100 shuffling distributions (102 simulated data points per shuffle): the 95% confidence interval (CI) around the mean regression line is represented by the two dashed lines (short dash) very close to each other, and the GC regression line (green) falls out of this range. As illustrated in the inset, 100% of the shuffling regression lines (grey squares) have a lower slope and higher intercept than the regression line for the original GC dataset (i.e. the green dot is outside of the grey cloud). This Monte-Carlo exact test shows that spiketrains are significantly less separated in GCs than would be expected from random spiking. (d) Frequency distributions of Rw for 102 GC recording sets (green, same as in a, solid line = mean) and 10,200 shuffled recording sets (grey). Dashed lines represent the 95% CI of the shuffling mean (see inset). Inset: distribution of the mean Rw for the 100 shufflings. GCs mean Rw (0.3) is outside of this distribution, showing that GCs output is significantly more reliable than expected from random spiking. (e) Paired statistical tests (based on difference between each GC recording and its 100 associated shuffled recording sets) show that spike shuffling leads to smaller Routput and Rw than original GC recordings. Top: Frequency distribution of the difference of Rw (10,200 data points). Monte-Carlo exact tests: 99.25% of data points (i.e. p = 0.0075) and 100% of the shuffling means are above 0. Bottom: Difference of Routput as a function of Rinput. The grey area represents the sample interval where 95% of the 10,200 data points fall (2.5–97.5 percentiles): for high Rinput values, all points are above 0 (see Table S4 for details). Solid grey line: means; dashed lines: 95% CI. Monte-Carlo exact tests based on proportion of shuffling means above 0: asterisks denote significance (see Table S4 for p-values).
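The spike shuffling described in this caption can be sketched as follows. This is only an illustration under stated assumptions: each output spike is reassigned to a spike time of its own parent input train, times are drawn without replacement when the output has no more spikes than the input (with replacement otherwise), and the Monte-Carlo p-value is taken as the fraction of differences (original minus shuffled) that are not above 0. All function names are hypothetical.

```python
import random

def shuffle_output(output_ms, input_ms, rng):
    """Reassign each output spike to a randomly chosen input spike time."""
    n = len(output_ms)
    if n <= len(input_ms):
        picked = rng.sample(list(input_ms), n)      # assumption: no repeated times
    else:
        picked = rng.choices(list(input_ms), k=n)   # fallback: repeats allowed
    return sorted(picked)

def shuffled_sets(outputs, parents, inputs, n_shuffles=100, seed=0):
    """Build n_shuffles surrogate recording sets from one original set."""
    rng = random.Random(seed)
    return [[shuffle_output(out, inputs[p], rng) for out, p in zip(outputs, parents)]
            for _ in range(n_shuffles)]

def monte_carlo_p(differences):
    """Fraction of (original - shuffled) differences that are not above 0."""
    return sum(1 for d in differences if d <= 0) / len(differences)
```

Each surrogate set keeps the spike count of every output train but discards its timing relative to the input, so Routput and Rw recomputed on the surrogates describe what random spiking alone would produce.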

The noisiness in neural communication is often understood as the unreliability of spiking after a single input spike, and the jitter of the delay between an input spike and an output spike40,41. We thus assessed such spike-wise noise directly from recording sets from our pattern separation experiments (Figs 7b and S2, and Materials and methods – Analysis of output spiketrains). Although the spike probability (SP), delay and jitter of GCs were slightly higher in our experiments than in a previous report (but recording and analysis methods were different), the variability between GCs was consistent41. We then asked whether these spike-wise noise parameters could predict the degree of decorrelation by GCs. First, linear regression analysis shows no good relationship with any parameter, the SP being an average predictor at best (Fig. S3a and Table S2). Moreover, the average firing rate of a GC output set (a measure dependent on SP) is not well correlated with the degree of decorrelation either (Table S3 and Fig. S8a). Temporal pattern separation in GCs thus does not seem to be achieved merely because their output spiketrains are a sparser and jittered version of their inputs.
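To make the three spike-wise noise parameters concrete, the sketch below estimates them from one input train and one output train. The response window is an assumption of this example: an output spike is attributed to an input spike if it occurs before the next input spike and within a hypothetical `max_delay_ms`. The actual criteria are described in the Methods and may differ.

```python
from bisect import bisect_left
from statistics import mean, stdev

def spikewise_noise(input_ms, output_ms, max_delay_ms=15.0):
    """Return (spiking probability, mean delay, jitter) for one train pair."""
    inputs, outputs = sorted(input_ms), sorted(output_ms)
    delays = []
    for k, t_in in enumerate(inputs):
        # assumption: the window closes at the next input spike or after max_delay_ms
        t_next = inputs[k + 1] if k + 1 < len(inputs) else float("inf")
        window_end = min(t_in + max_delay_ms, t_next)
        i = bisect_left(outputs, t_in)  # first output spike at or after t_in
        if i < len(outputs) and outputs[i] < window_end:
            delays.append(outputs[i] - t_in)
    sp = len(delays) / len(inputs)
    avg_delay = mean(delays) if delays else float("nan")
    # jitter: sample standard deviation of the delays (needs at least two)
    jitter = stdev(delays) if len(delays) > 1 else float("nan")
    return sp, avg_delay, jitter
```

For instance, with input spikes at 0, 50, 100 and 150 ms and output spikes at 4, 56 and 155 ms, three of the four input spikes are followed by a response, with delays of 4, 6 and 5 ms.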

To more carefully test the hypothesis that random spiking failures and random delays support fast temporal pattern separation, we simulated recording sets based on computational models only governed by spike-wise noise statistics comparable to the original data (Figs S2 and S3b, Table S2 and Methods – Computational models). The distribution of (Rinput, Routput) was significantly higher in the original data (Figs 7 and S5), showing that simple random processes yield greater levels of separation than real GCs, especially for highly similar inputs (Figs 7e and S5a).

In addition to the spike-wise noise, we considered neural noise at the level of spiketrains by computing the average correlation coefficient Rw between “children” output spiketrains from the same “parent” input train (Fig. 7a). Rw characterizes the more complex notion of spiketrain reliability, that is, the ability of a neuron to reproduce the same output spiketrain in response to repetitions of the same input spiketrain. Rw is not dependent on intrinsic cellular properties (Table S1) and only moderately determined by spike-wise noise parameters (Table S2), suggesting that the rather low Rw of GCs is the expression of more complex noisy biophysical processes. Consistently, Rw was significantly lower for shuffled and simulated data than in real GCs (Figs 7d,e and S5b). This indicates that the output spiketrains of GCs are more reliable than if their output was entirely determined by simple random processes. Overall, GCs have higher Routput and Rw distributions than datasets based on random spiking (Fig. 7), which clearly shows that simple noise cannot fully underlie the operations performed by GCs on input spiketrains.
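As an illustration of how a reliability measure such as Rw could be computed, the sketch below bins each child spiketrain at a resolution τw and averages the Pearson correlation coefficient over all pairs of children. Treating R as a Pearson coefficient between binned trains is an assumption here, and the helper names are hypothetical.

```python
import numpy as np
from itertools import combinations

def bin_train(spike_times_ms, duration_ms, tau_w_ms):
    """Spike counts in consecutive bins of width tau_w_ms covering the sweep."""
    n_bins = int(np.ceil(duration_ms / tau_w_ms))
    edges = np.arange(n_bins + 1) * tau_w_ms
    counts, _ = np.histogram(spike_times_ms, bins=edges)
    return counts

def mean_pairwise_r(trains, duration_ms, tau_w_ms):
    """Average Pearson correlation over all pairs of trains (self-pairs excluded).

    Note: a train with no spikes has zero variance and yields NaN; how such
    sweeps should be handled is left open in this sketch.
    """
    binned = [bin_train(t, duration_ms, tau_w_ms) for t in trains]
    rs = [np.corrcoef(a, b)[0, 1] for a, b in combinations(binned, 2)]
    return float(np.mean(rs))
```

Applied to the ten children of one parent input train, `mean_pairwise_r` gives a within-parent reliability value. Applied to output trains that come from different input trains, the same function gives a similarity value of the Routput kind.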

The fact that random spiking yields better temporal pattern separation but less reliable information transmission than GCs also suggests that there might be an unavoidable trade-off between achieving pattern separation and reliable information transmission about input spiketrains. To further investigate this, we looked at the relationship between the spiketrain reliability Rw and the decorrelation levels in GC recordings and found a strong anticorrelation (Fig. 8a and Table S3). This linear relationship is clear evidence that biological processes leading to sweep-to-sweep variability are a powerful mechanism for temporal pattern separation in DG.

Figure 8

Unreliability in spiketrain transmission is a major but not unique source of temporal pattern separation. (a) Spiketrain reliability (Rw) is an excellent predictor of normalized decorrelation (defined in Fig. 2e) for all celltypes and conditions. Notice that, despite the strong anticorrelation, the intercept of the linear model at Rw = 1 predicts that even perfect reliability could still allow 10% of decorrelation. See Table S3 for linear regressions on single celltypes. (b) To assess the amount of decorrelation not due to spiketrain unreliability, the ten children output spiketrains of each of the five trains of an input set can be averaged to give the five output peristimulus histograms (PSTH). The 10 ms binned PSTHs of the output rasters in Fig. 2a are shown. (c) Correlation coefficients between all pairs of the five output PSTHs. The mean correlation (PSTH Routput) is the average of coefficients inside the red border, and excludes self-comparisons. (d) Left: PSTH Routput as a function of Rinput (102 recording sets, in red), fitted with a parabola (black). All points are below the identity line, indicating decorrelation of outputs compared to inputs. Right: Average effective decorrelation (Rinput − PSTH Routput) as a function of Rinput (bars are SEM) reveals a significant decorrelation for all input sets except for the most dissimilar (one-sample T-tests; shaded area is the 95% confidence interval for significant decorrelation).

However, spiketrain unreliability is not the only reason GC output spiketrains are decorrelated. Indeed, three lines of evidence support the idea that even if GCs were perfectly reliable (i.e. a given input train always leads to the same output train), the set of output spiketrains would still be less similar than the set of inputs. First, the linear model describing the relationship between Rw and decorrelation levels suggests that even at Rw = 1, ~10% of decorrelation would still be achieved (Fig. 8a). Second, high levels of pattern separation are performed upon the first presentation of an input set (Fig. 6): this means that even if the full output set was composed of exact repetitions of the set of output trains recorded during the first sweep (e.g. Input trains 1 and 2 always lead to the same Output trains 1 and 2, respectively), pattern separation would still occur, at the levels measured for the first sweep (difference between Outputs 1 and 2 > difference between Inputs 1 and 2). In other words, perfect reliability does not equate to the absence of pattern separation. Finally, when averaging out the variability between spiketrains associated with the same input (i.e. by computing the PSTH), a significant level of decorrelation is still detected (Fig. 8b–d). We can thus conclude that temporal pattern separation is supported by some mechanisms in addition to those that produce spiketrain unreliability.
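The PSTH-based measure mentioned in the last argument can be sketched as follows. The sketch assumes that the binned output rasters are available as an array of spike counts with shape (input trains, sweeps, bins), and that PSTH Routput is the mean Pearson correlation between the sweep-averaged responses, excluding self-comparisons. The function name and the array layout are hypothetical.

```python
import numpy as np

def psth_r_output(binned_counts):
    """Mean correlation between output PSTHs.

    binned_counts: array of shape (n_inputs, n_sweeps, n_bins), e.g. (5, 10, n_bins).
    """
    counts = np.asarray(binned_counts, dtype=float)
    psth = counts.mean(axis=1)                    # average out sweep-to-sweep variability
    r = np.corrcoef(psth)                         # (n_inputs, n_inputs) correlation matrix
    upper = np.triu_indices(r.shape[0], k=1)      # each pair once, no self-comparisons
    return float(r[upper].mean())

# Effective decorrelation for one recording set would then be:
# r_input - psth_r_output(binned_counts)
```

Because the children of each input train are averaged before correlating, any decorrelation that remains in this measure cannot be attributed to sweep-to-sweep variability alone.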

Taken together, these results suggest that complex but noisy biophysical mechanisms allow GCs to balance temporal pattern separation and reliable signaling about their inputs.

Mechanism of temporal pattern separation: short-term synaptic dynamics

The fact that intrinsic properties of GCs do not predict their decorrelation levels (Table S1) suggests that temporal pattern separation comes from the DG network in which each recorded GC is embedded. We thus hypothesize that the specific short-term dynamics of synapses in the DG network implement temporal pattern separation in GCs. This is a likely mechanism because: (1) Synaptic transmission is a stochastic process and generally considered the main source of noisiness in the spiking output of neurons36,42, which would fit our conclusion that temporal pattern separation is dependent on neural noise (see paragraph above), (2) Short-term plasticity makes the probability of synaptic release dependent on the timing of the preceding input spikes, which could explain why temporal pattern separation and spiketrain reliability are not purely governed by noise. To test the idea that short-term synaptic dynamics is a potential mechanism controlling the pattern separation/spiketrain reliability ratio in GCs, we developed a computational model of a spiking GC with dynamic and probabilistic PP-GC synapses (see Methods – Computational models). The parameters of the model controlling the facilitation, depression and variability of presynaptic release, and thus the amplitude of EPSCs, were constrained to match the whole-cell voltage-clamp recording of a GC in response to a pattern separation protocol (10 Hz input trains, Rinput = 0.76) (Fig. 9a–c and see Methods – Computational models). Simulations of pattern separation experiments with this model resulted in high levels of output spiketrains decorrelation, akin to real GCs, but also in relatively high levels of spiketrain reliability (Fig. 9d,e). This is a proof of principle that the probabilistic and dynamic nature of presynaptic release at the PP-GC synapse is a potential mechanism balancing temporal pattern separation and information transmission in the DG.

Figure 9

Probabilistic presynaptic release is a potential mechanism balancing temporal pattern separation and spiketrain reliability. (a) Non simultaneous current-clamp (Vrest ≈ −70 mV) and voltage-clamp recordings (Vhold = −70 mV) in response to a set of input patterns were successively performed in GCs from adult mice. The example shows the first sweep of responses from a single cell to the five trains of an input set (Rinput = 0.76, τw = 10 ms). (b–e) The recording set of the example cell in a was used to fit a computational model of a spiking neuron with dynamic and probabilistic synapses. (b) Top: four first input spikes of input train #1. Middle: Dynamics of the variables of a probabilistic Tsodyks-Markram model of a synapse in response to the four input spikes. X represents the probability of vesicle availability at the presynaptic site, U represents the probability of release of available vesicles, and Pr is the probability of release for all vesicles. The number of released vesicles was simulated at every time point from a binomial process. Bottom: Parameters of the model were adjusted so that the resulting current (averaged over the 10 repetitions, in dark orange) would reasonably match the peak of the corresponding average EPSCs in the original voltage-clamp recordings (baselined and inverted display with partially blanked stimulation artifact, in black). (c) Parameters of the model were also adjusted to match the variability in the amplitude of the first EPSC in the original recording. Top: average EPSC. Middle: individual EPSCs evoked from the 10 repetitions of the first input spike in input train #1. The range of peak amplitudes is similar between model and data. Bottom: The coefficient of variation of the current in the model is close to the data. (d–e) Temporal pattern separation and spiketrain reliability (τw = 10 ms) for GCs from adult mice and a single model regular spiking Izhikevich neuron with probabilistic synapses as described above. 
Data points correspond to recording/simulation sets: 35 for GCs (green circles, 14 cells, 6–16 per input set), and 5 simulations at all input sets (dark orange triangles), with the model fitted to a single recording set (Rinput = 0.76; Routput and Rw shown in dark green circles for all input sets).
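The synaptic variables named in this caption (X, U and Pr) can be illustrated with a minimal event-driven sketch of a probabilistic Tsodyks-Markram-style synapse. This is not the fitted model: the update rules follow the common textbook form, depletion of X uses its expected value rather than the sampled release, and all parameter names and values are hypothetical.

```python
import numpy as np

def simulate_release(spike_times_ms, n_sites, u0, tau_rec_ms, tau_fac_ms, rng):
    """Return (released vesicle counts, release probabilities), one entry per input spike."""
    x, u = 1.0, 0.0      # X: vesicle availability; U: release probability of available vesicles
    last_t = None
    counts, probs = [], []
    for t in spike_times_ms:
        if last_t is not None:
            dt = t - last_t
            x = 1.0 - (1.0 - x) * np.exp(-dt / tau_rec_ms)   # X recovers toward 1 (depression wears off)
            u = u * np.exp(-dt / tau_fac_ms)                 # facilitation decays toward 0
        u = u + u0 * (1.0 - u)        # facilitation step at spike arrival
        pr = u * x                    # Pr: release probability across all sites
        counts.append(int(rng.binomial(n_sites, pr)))        # stochastic release
        probs.append(pr)
        x = x - pr                    # expected depletion (simplifying assumption)
        last_t = t
    return np.array(counts), np.array(probs)

rng = np.random.default_rng(1)
# Four input spikes at 10 Hz with illustrative, depression-dominated parameters
counts, probs = simulate_release([0, 100, 200, 300], 20, 0.5, 500.0, 1.0, rng)
```

In this sketch, Pr depends on the timing of preceding spikes through X and U, while the binomial draw makes each repetition of the same input train produce a different sequence of release counts. Scaling the counts by a quantal amplitude would give simulated EPSC peaks.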

Sources of pattern separation differences between celltypes

To further understand the mechanisms behind temporal pattern separation in single hippocampal cells, we investigated the sources of the difference in decorrelation levels between the tested celltypes. We considered differences in intrinsic cellular properties (Table S1), firing rate and probability of bursting (Figs S7 and S8), spike-wise noise (Figs S3 and S4), spiketrain reliability (Figs 8a and S6) and synaptic transmission dynamics (Fig. S9).

Both FS and HMC recordings displayed bursts of spikes (defined as more than one output spike between two input spikes), which was very rarely seen in our GC and CA3 recordings (Figs 3, 4 and S7c–e, S8c,e). As a result, both FSs and HMCs had significantly higher firing rates than GCs (Fig. S7a,b), although the effect size was smaller for HMCs, because they burst less often than FSs and generally have fewer spikes per burst (Fig. S7d,e). Are the firing rate or the probability of bursting then predictive of differences in pattern separation? The relationships are unclear, but it seems that both could partially explain the lower decorrelations observed in some HMCs and FSs (Fig. S8a,b). To more directly test the role of bursting, we processed all FS and HMC recordings by removing all but the first spike in each detected burst (Fig. S8c,e and see Methods – Analysis of the output spiketrains). These resulting “non-bursty” datasets (nbFS and nbHMC) still exhibited significantly less pattern separation than GCs (and even pattern convergence of dissimilar input trains, in the case of HMCs) (Fig. S8d,f). Therefore, bursting and high firing rates, although a source of differences in temporal pattern separation, are not a sufficient explanation.
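The burst-removal step can be sketched as follows, using the definition given above: a burst is more than one output spike between two consecutive input spikes, and only the first spike of each burst is kept. The function name is hypothetical, and output spikes that occur before the first input spike are treated as a single group, which is an assumption of this sketch.

```python
import numpy as np

def remove_bursts(input_times_ms, output_times_ms):
    """Keep only the first output spike within each inter-input-spike interval."""
    inputs = np.sort(np.asarray(input_times_ms, dtype=float))
    outputs = np.sort(np.asarray(output_times_ms, dtype=float))
    # Index of the most recent input spike at or before each output spike (-1 if none)
    interval = np.searchsorted(inputs, outputs, side="right") - 1
    keep = np.ones(outputs.size, dtype=bool)
    keep[1:] = interval[1:] != interval[:-1]    # drop spikes sharing an interval with the previous one
    return outputs[keep]

# Toy example: a three-spike burst after the first input and a doublet after the third
non_bursty = remove_bursts([0, 100, 200], [5, 8, 12, 105, 210, 215])
```

In the toy example, the spikes at 8, 12 and 215 ms are removed, and the isolated spike at 105 ms is left untouched.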

Although FSs show less pattern separation than GCs, it is interesting that they do exhibit some amount of separation, as opposed to pattern convergence6 which one could have expected from their reputation of having a much more reliable and precise spiking behavior than principal neurons40,43. The high fidelity in relaying input spikes40 might still explain the difference in pattern separation ability between FSs and GCs, although, to our knowledge, they have never been formally compared. We thus first confirmed the idea that FSs show much less spike-wise noise than GCs (Fig. S4). Linear regressions then revealed that spiking probability (SP) is a good predictor of both spiketrain unreliability (1–Rw) and decorrelation performance in FSs (Fig. S3b and Table S2). Surprisingly, the membrane resistance of FSs was also a good predictor (Table S1). Thus, in contrast to GCs, FS pattern separation behavior is strongly and linearly determined by some intrinsic and spike-wise properties, even though it is in principle hazardous to anticipate complex neuronal operations from such low-level characteristics, as our previous analysis on GCs illustrated. Indeed, spike-wise noise parameters of HMCs were very close to those of GCs (Fig. S4) and they nonetheless showed striking pattern separation differences.

Across all celltypes, it is clear that higher SP correlates with lower decorrelation levels (Fig. S3d). This suggests that sparseness can be a mechanism partially supporting temporal pattern separation. Only partially, because: (1) the correlation is not perfect (R2 = 57%), (2) although the correlation is better in FSs, FSs decorrelation levels are consistently lower than expected by the linear model fitting all celltypes (Fig. S3d), (3) There is no difference in terms of sparseness between CA3 PCs and their GC controls (Fig. S7) even though there are clear differences in terms of pattern separation (Figs 4 and 5).

In the end, the best predictor of decorrelation levels for all celltypes and conditions is Rw, the spiketrain reliability (Figs 8a and S6). This result emphasizes the unexpected idea that the biophysics of neurotransmission in hippocampal networks imposes a trade-off between temporal pattern separation and reliable information transmission. Figure 9 suggests that the balance between separation and reliability can depend on synaptic dynamics. Because different celltypes have synapses with unique properties, we hypothesize that those differences produce varying degrees of pattern separation behaviors not explained by spiking sparseness (SP, FR, pBurst). For example, in contrast to PP-GC synapses that mostly exhibit depression, the monosynaptic connection from GCs to CA3 PCs is made through giant mossy fiber boutons with low initial probability of release and short-term facilitation under high input frequency31,44 (Fig. S9a). By modelling a spiking neuron with synapses inspired by GC-CA3 connections and comparing it to the model from Fig. 9, under conditions yielding similar FR, we confirm that differences in short-term synaptic dynamics alone can lead to obvious pattern separation differences (Fig. S9b,c).

Overall, these results show that the differences in temporal pattern separation between different hippocampal celltypes result from a combination of various sources, each celltype with a unique combination.

Discussion

We report that similar cortical input spiketrains are transformed in the DG network, leading to less similar output spiketrains in GCs. Our findings provide the first experimental demonstration that a form of pattern separation is performed within the DG itself and exhibited at the level of single neurons at different timescales. This computation arises from noisy but specific biophysical processes (e.g. synaptic dynamics) in the DG network, where interneurons do not exhibit as much temporal pattern separation as the final DG output. In turn, the CA3 network seems to amplify this separation even more at the level of single PCs, suggesting that, at least in the hippocampus, it is not a computation specific to the DG.

A novel way to test pattern separation

In contrast to in vivo experiments that have difficulty identifying the cell-type of recorded units with certainty26,45,46,47 and simultaneously recording the direct inputs of these units13,14,15,16,17, in vitro brain slices that preserve the lamellar connections of the hippocampus offer a more accessible platform. For example, a similar experimental setup to ours was used to show that spatially segregated axonal inputs are represented by distinct spatiotemporal patterns in populations of DG neurons48,49. However, our study is the first to perform an experimental analysis of pattern separation within DG by directly manipulating the similarity of the inputs and comparing it to the similarity of simultaneously recorded outputs. Such a systematic approach had so far only been done in computational studies50. Although a rigorous comparison is impossible because the activity patterns considered were defined differently, the general pattern separation behavior of those models is confirmed by our experimental results: the DG itself performs a form of pattern separation, especially for input patterns that are highly similar (Fig. 2c,d).

Pattern separation in the time dimension

Until now, most studies of pattern separation in the DG assumed that neural activity patterns were ensembles of ON/OFF neurons4,5,15,16,20, sometimes considering a rate code averaged over minutes in addition to this population code13,14,23,39. Because neurons carry information at timescales shorter than minutes27,28,29,38,51 and because the sparse firing of active GCs during a brief event17,52,53 precludes an efficient rate code38, we studied pattern separation at sub-second timescales.

Relevant scales are given by the time constant over which neurons can integrate synaptic inputs28: 10–50 ms for GCs and ~100 ms for the “reader” CA3 pyramidal cells. Windows of ~10 ms and ~100 ms, corresponding respectively to gamma and theta rhythms, have been shown to organize CA neuronal assemblies27,28,54,55. Due to specific network properties allowing persistent activity, the DG might also integrate information over several hundreds of milliseconds48,49,56. The point is that multiple timescales can be relevant simultaneously, and because it is still uncertain which ones are the most important to episodic memory and hippocampal coding, we investigated a range from 5 ms to 250 ms (Fig. 5).

Most of our results are reported at a 10 ms resolution, which corresponds approximately to the spike jitter in GCs (Fig. S2) as well as their membrane time constant and the gamma rhythm. This choice of temporal resolution is similar to a recent computational study of pattern separation within a DG model, which used a 20 ms resolution on short spiketrains (30 ms inputs, 200 ms outputs)21. Yet, our study was the first to investigate pattern separation at a range of timescales. We found that temporal pattern separation in the DG output was best at short timescales. This relationship was generally conserved across celltypes, with HMCs even achieving the opposite of pattern separation, pattern convergence, at timescales above 100 ms. Note, however, that temporal pattern separation is not necessarily a monotonically decreasing function of the time resolution, as CA3 PCs exhibited a surprising sharp increase of temporal pattern separation for low input similarity at 250 ms (Fig. 5).

Our study of pattern separation is the first to focus on temporal patterns, as opposed to spatial ones; but neural activity patterns are spatiotemporal. More work is clearly needed to test whether the DG is a pattern separator at the spatiotemporal, population level. Nonetheless, the discovery of temporal pattern separation in single hippocampal neurons has some implications for population dynamics. The decorrelation of spiketrains at small timescales, in addition to the fact that different GCs respond differently to the same inputs (Fig. S1), suggests that spikes are constantly rearranged in different time windows, thus enforcing very small neuronal assemblies28. In other words, it ensures that a minimal number of output neurons are active at the same time, and such sparsity in the active neuronal population is known from computational studies to be critical for efficient population pattern separation5,22,57.

Mechanisms of temporal pattern separation

The mechanisms supporting pattern separation within DG had so far never been experimentally investigated. The decorrelation of sequentially presented input patterns can in theory be explained by: (1) adaptive mechanisms, involving learning and recognition of input patterns, comparison with previously stored ones and the pruning out of common features, (2) non-adaptive (intrinsic) mechanisms, (3) or both58. First, concerning adaptive mechanisms, it has been suggested that Hebbian learning could enhance population pattern separation in the DG59, but computational models testing different forms of long-term synaptic learning found that it would actually impair this type of pattern separation5,23. As for temporal pattern separation, our data show that it hardly benefits from the repetition of input patterns (Fig. 6). Second, we offer indirect evidence that non-adaptive decorrelation processes support temporal pattern separation because output patterns are always decorrelated to the same proportion (Fig. 2e), a feat that a simple random process can achieve (Fig. S5c), suggesting that input patterns do not need to be recognized. Third, adaptive and non-adaptive mechanisms are not mutually exclusive: previous learning over days, during the neuron maturation process, could tune single GCs only to specific input patterns, allowing rapid pattern separation60. Indeed, a computational study suggested that adaptive networks can mature to perform a fast, non-adaptive orthogonalization of the population activity by the decorrelation between individual information channels61.

Adaptive or not, what is the biological source of the temporal decorrelation we observed? We first determined that intrinsic membrane properties do not predict decorrelation levels (Table S1), and that the ability of a celltype to fire bursts only moderately affects pattern separation (Fig. S8). Simple randomness was not sufficient to reproduce our results, even though the spiking probability, a form of spike-wise neural noise, plays a partial role (Figs 8 and S3). In the end, temporal pattern separation seems most likely supported by specific short-term synaptic dynamics, thanks to the synergy between the probabilistic nature of neurotransmission and its dependency on the spike-timing history imposed by depression or facilitation (Figs 9 and S9). Various levels of pattern separation/convergence might thus be achieved in different celltypes due to a unique combination of synaptic properties.

Indeed, GCs are embedded in a network of synapses, notably receiving feedforward excitation from the PP, feedforward and feedback inhibition from FSs43,62 and both excitation and disynaptic inhibition from HMCs63. Thus, although both FSs and HMCs showed some temporal pattern separation (mostly at short timescales), our results suggest that the final DG output is inherited from synaptic interactions between all DG celltypes, resulting in maximal separation at the level of GCs.

More work is needed to clarify the exact role of FSs and HMCs in DG computations. Computational and experimental studies have suggested that HMCs are involved in some forms of pattern separation17,20,47. Our finding of pattern convergence in HMCs for dissimilar inputs may appear in contradiction with the recent reports that HMC spatial representations remap in dissimilar environments17,26,46,47, but (1) activity patterns were defined differently and averaged over longer timescales (minutes) and (2) as argued above, remapping is not a direct measure of pattern separation without knowledge of the input patterns. Overall, we show that HMCs can exhibit either separation or convergence, depending on the time resolution and the amount of similarity between input patterns (Fig. 5). The impact of such a behavior on pattern separation at the level of GCs remains to be studied. Concerning FSs, they exhibit a poor ability to separate spiketrains (Figs 3 and 5), but the somatic inhibition they provide could be the source of low spiking probability in GCs and thus improve pattern separation by making GC responses sparse5,64. On the other hand, their ability to relay information reliably43 (Figs S4 and S6) and to precisely control spike timing in target neurons43 might actually provide a mechanism that counteracts noisiness in GCs, helping them balance effective separation with fidelity of information transmission to CA3.

The role of sweep-to-sweep variability

Because the brain needs to be able to recognize when situations are exactly the same, our finding that pattern separation occurs even when the same input pattern is repeated (Fig. 7a) might seem counter-intuitive at first. However, in theory, the separation and the recognition functions do not have to be supported by the same network. The Hebb-Marr framework actually hypothesizes that the CA3 recurrent, auto-associative network is able to recall the original pattern from a noisy input from DG. Even though most computational models that tested the effect of repetition were consistent with the intuitive view5,21, this was likely because they used deterministic neurons. A model considering variability across GCs and a probabilistic spiking behavior resulted, as in our experiments, in separation of repetitions of the same pattern39.

In the cortex, the well-known variability of single neuron activity between trials is often supposed to be “averaged out” at the population level so that the output of the population is reliable36. It is thus conceivable that considering an ensemble of GCs would increase the signal-to-noise ratio. In fact, when we average out the sweep-to-sweep variability, GCs exhibit pattern separation for highly similar patterns but almost no separation for identical ones (Fig. 8d).

However, this variability, or “noise”, is not necessarily meaningless36. Our results suggest it might be a mechanism amplifying pattern separation (Fig. 8). The variability might even be just apparent, if we consider that when the same input is repeated it is at different points in time: each repetition could be considered as a different event that needs to be encoded slightly differently. The DG would thus meaningfully add some noise to transform input spiketrains so that cortical information about an event is stored in the hippocampus with a unique random time-stamp, consistent with the index theory of episodic memory65.

Pattern separation and pattern completion in CA3

In the Hebb-Marr framework, CA3 is thought to be a recurrent, autoassociative network that can perform pattern completion, and thus can support the recall of a full memory from a partial cue of the original event4. Although CA3 recurrent connections are sparser than previously thought66, computational models generally confirmed that CA3 could perform pattern completion of population patterns4,23,66. Direct experimental evidence is scarce and unclear, both in vitro67 and in vivo32,68,69. In addition, the fact that neuronal representations of similar environments are more correlated in CA3 than in the DG was thought to constitute indirect evidence13,14, but recent reports suggesting that the DG representations previously measured were not coming solely from DG output cells have clouded these initial conclusions26,46.

Assuming that pattern completion is a process realized by CA3, this implies that, when presented with different but similar partial cues of the same initial memory, the final output of CA3 should converge towards the same representation. Our finding that CA3 PCs exhibit high levels of temporal pattern separation might then come as a surprise (Figs 4 and 5). Several lines of reasoning could explain this result. First, we focused on temporal patterns in single cells, and it is possible that a network can perform population pattern completion in addition to temporal pattern separation. Actually, different environments can be represented in CA3 by different populations of PCs, or by the same PCs with remapped firing rates26,46, which has led some to conclude that CA3 could perform pattern separation13. A recent report even showed that single GCs remap much less than CA3 PCs26: assuming GCs and PCs receive inputs with similar overlap, this is consistent with our finding that PCs are better than GCs at temporal pattern separation. Second, we tested CA3 under partial block of inhibition in order to allow PCs to fire. Given that the number of active CA3 PCs is generally higher than DG GCs in vivo5,24,26,46, it suggests that our experiment may not have modelled physiological conditions well. Third, CA3 is known to be physiologically and functionally heterogeneous along its proximodistal axis32, with studies suggesting that the PCs closest to DG perform population pattern separation68,70,71. We recorded from PCs at the CA3b/c border, and it is possible that more distal PCs would exhibit less temporal pattern separation. Last but not least, it is important to note that pattern completion and pattern separation are not opposite, mutually exclusive computations. Pattern separation and its actual opposite, pattern convergence, describe the similarity of multiple patterns from an input network (e.g. EC) compared to the similarity of the corresponding patterns of another network (e.g. DG)6. On the other hand, pattern completion is the process, happening over time, of retrieval of a previously learned full pattern in a network (e.g. CA3) from a partial seed pattern in the same network5. Our experiments did not test for pattern completion, as the different trains of an input set were not degraded versions of a previously learned pattern. Moreover, pattern separation and completion are complementary in the sense that pattern completion would benefit from initial input patterns being as separated as possible4,5. It thus makes sense that CA3 would start by amplifying the separation of input patterns, as is the case in our data, either before encoding or before proceeding to completion of the seed pattern. In fact, our results suggest that CA3 might complement the separation performed in the DG, as CA3 is able to perform high decorrelation levels at long timescales (>100 ms) when the DG does not (Fig. 5). CA3 could thus ensure that seed input patterns are well separated at all timescales.

Materials and Methods

Animals and dissection

All experiments were performed in accordance with the National Institutes of Health guidelines outlined in the National Research Council Guide for the Care and Use of Laboratory Animals (2011) and regularly monitored and approved by the University of Wisconsin Institutional Animal Care and Use Committee.

Horizontal slices of the ventral and intermediate hippocampus (400 μm) were prepared from the brains of young (p15–25) or adult (p121 ± 15 days) C57BL/6 male mice (Harlan/Envigo). Adult animals were only used for Fig. 9. Mice were anesthetized with isoflurane, decapitated, and the brain was removed quickly and placed in ice-cold cutting sucrose-based solution (two different compositions were used. For HMC, CA3 pyramidal cells, their GC controls and GCs from adult mice, we used version #272). The cutting solutions contained (version #1/version #2, in mM): 83/80 NaCl, 26/24 NaHCO3, 2.5/2.5 KCl, 1/1.25 NaH2PO4, 0.5/0.5 CaCl2, 3.3/4 MgCl2, 22/25 D-Glucose, 72/75 Sucrose, 0/1 Na-L-Ascorbate, 0/3 Na-Pyruvate, bubbled with 95% O2 and 5% CO2. During the dissection, brain hemispheres were prepared following the “magic cut” procedure73 with α-angle around 10 to 15° and β-angle around 5 to 10° for GC and FS recordings, and with α close to 0° and β between 0 and −5° for HMC and CA3 pyramidal cells recordings66. Slices were cut using a vibratome (Leica VT1000S) then placed in an incubation chamber in standard artificial cerebrospinal fluid (aCSF) containing (in mM) 125 NaCl, 25 NaHCO3, 2.5 KCl, 1.25 NaH2PO4, 2 CaCl2, 1 MgCl2, and 25 D-Glucose (or in a 50/50 mix of standard aCSF and cutting solution. 100% cutting solution for HMC, CA3 recordings and their GC controls, and GCs from adult mice) at 35 °C, for 15–30 minutes after dissection. Slices were stored in the incubation chamber at room temperature for at least 30 minutes before being used for recordings.

Electrophysiology

All recordings were done in standard aCSF adjusted to 325 mOsm, at physiological temperature (33–35 °C). Whole cell patch-clamp recordings were made using an upright microscope (Axioskop FS2, Zeiss, Oberkochen, Germany) with infra-red differential interference contrast optics. Patch pipettes pulled from thin-walled borosilicate glass (World Precision Instruments, Sarasota, FL) had a resistance of 2–5 MΩ when filled with intracellular solution containing (in mM) 140 K-gluconate, 10 EGTA, 10 HEPES, 20 phosphocreatine, 2 Mg2ATP, 0.3 NaGTP. For the dataset from adult mice, we used a slightly different recipe: 135 K-gluconate, 5 KCl, 0.1 EGTA, 10 HEPES, 20 Na-Phosphocreatine, 2 Mg2-ATP, 0.3 Na-GTP, 0.25 CaCl2. Intracellular solutions were adjusted to pH 7.3 and 310 mOsm with KOH and H2O. Both recipes yielded similar electrophysiological behaviors.

Recordings were done using one or two Axopatch 200B amplifiers (Axon Instruments, Foster City, CA), filtered at 5 kHz using a 4-pole Bessel filter and digitized at 10 kHz using a Digidata 1320A analog-digital interface (Axon Instruments). Data were acquired to a Macintosh G4 (Apple Computer, Cupertino, CA) using Axograph X v1.0.7 (AxographX.com). Stimulation pipettes were pulled from double-barrel borosilicate theta-glass (~10 μm tip diameter, Harvard Apparatus, Edenbridge, U.K.), filled with aCSF or a 1 M NaCl solution, and connected to a constant current stimulus isolator used to generate 0.1–10 mA pulses, 100 microseconds in duration.

Series resistance and cellular intrinsic properties were assessed online in Axograph from the fit of the electrical responses to repetitions of 5–10 mV, 25 ms steps, holding the potential at −65 mV. Neurons used for analysis were stable across a whole recording session as judged by monitoring of series resistance and resting potential.

Dentate granule cells (GC) were visually identified as small cells in the granule cell layer (GCL). GCs from young mice had the following intrinsic properties (mean ± sem): resting potential (Vrest) −69.3 ± 1.3 mV; input resistance (Ri) 171 ± 16 MΩ and capacitance (Cm) 23 ± 2 pF. GCs from adult mice had the following intrinsic properties: Vrest −78.8 ± 1.8 mV; Ri = 137 ± 14 MΩ and Cm = 18 ± 1 pF.

Fast-spiking interneurons (FS) were identified as neurons with large somata at the hilus-GCL border and a high firing rate response during large depolarizing current steps, and a large after-hyperpolarization (AHP)62,74 (Fig. 3). They had the following intrinsic properties: Vrest = −66.7 ± 3.5 mV; Ri = 59 ± 10 MΩ and Cm = 19 ± 3 pF.

Hilar Mossy Cells (HMC) were identified as large neurons in the deep hilus (>60 µm away from the GCL) with a regular firing response to current steps and a small AHP, as well as with a high frequency of spontaneous EPSPs, all characteristics that distinguish them from other neurons in the hilus63,75 (Fig. 3). Their intrinsic properties were: Vrest = −69.7 ± 2.3 mV; Ri = 198 ± 12 MΩ and Cm = 33 ± 3 pF.

For CA3 pyramidal cells, because recent studies suggest a proximodistal gradient of physiological and computational properties in the CA3 network32,68,70,71, we avoided the extremity of the CA3c region and targeted large neurons around the CA3b/c border in the pyramidal layer. Their intrinsic properties under partial inhibitory block (aCSF + 100 nM gabazine) were: Vrest = −72.7 ± 2.2 mV; Ri = 186 ± 12 MΩ and Cm = 36 ± 2 pF. The intrinsic properties of GCs recorded under the same pharmacological conditions were: Vrest = −77.3 ± 1.9 mV; Ri = 244 ± 13 MΩ and Cm = 25 ± 2 pF.

Pattern separation experiments

We designed multiple sets of input patterns, each with a prespecified average Pearson’s correlation coefficient (Rinput) computed with a binning window (τw) of 10 ms, using two different algorithms (one developed in-house based on iterative modifications, and a more efficient one from76 based on mathematical principles and a preset covariance matrix). An input set consisted of five different input spiketrains, each a 2 s train of impulses simulating a cortical spiketrain, with interspike intervals following a Poisson distribution. Each pair of input trains had a correlation coefficient close to the Rinput of its set (at τw = 10 ms, the average relative standard error is 4% across all input sets). For all experiments except in Fig. 4 (CA3 and GC control), the mean frequency of input trains was set close to 10 Hz (11.9 ± 0.7 Hz). This input firing rate was chosen to be consistent with the frequency of EPSCs recorded in GCs of behaving mice52, and is known to promote a high probability of spiking in GCs in slices62,77.

The responses of one or two DG neurons were recorded in whole-cell mode while stimuli were delivered to the outer molecular layer (OML). Stimulus current intensity and location were set so that the recorded neuron spiked occasionally in response to electrical impulses (see range of spike probability in Figs S2 and 4) and the stimulation electrode was at least 100 µm away from the expected location of the dendrites of the recorded neuron. Once stimulation parameters were set, a pattern separation protocol was run: the five trains of a given input set were delivered one after the other, separated by 5 s of relaxation, and this was repeated ten times. The ten repetitions of the sequence of five patterns were implemented to take into account any potential variability in the output, and the non-random sequential scheme was used to avoid repeating the same input spiketrain close in time. Each protocol yielded a recording set of fifty output spiketrains, each associated with one of the five input trains of an input set (Fig. 1c). A given cell was recorded in response to up to five input sets with different Rinput (i.e. a recorded cell produced between one and five data points on Fig. 2c).

The membrane potential baseline was maintained around −70 mV during both current-clamp and voltage-clamp recordings, consistent with the Vrest of mature GCs recorded in behaving mice52. For comparison, FS and HMC current-clamp recordings were also held at −70 mV. The output spiking frequency of GCs was variable (6.3 ± 0.3 Hz, see Figs S7 and 8) but consistent with sparse activity generally observed in GCs in vivo during behavior26,52,53,78 and in slices under conditions of drive comparable to ours41,79,80. The output firing rates of FS and HMC were higher than their GC controls (Figs S7 and 8), as expected from recent research26,43,47.

To perform pattern separation experiments in CA3 pyramidal cells (PCs), we had to change several parameters in order to make CA3 PCs spike. Indeed, PC firing is controlled by strong feedforward inhibition31,32, and all tested stimulation sites led to net IPSPs or, rarely, to weak EPSPs. To make PCs fire in response to external electrical stimulation: (1) the stimulating electrode was placed in the inferior blade of the GCL to make the DG output cells fire, (2) we targeted CA3b PCs, which have the highest E/I ratio across CA3 in slices and still receive strong connections from mossy fibers32, (3) we used input sets with 30 Hz input trains to promote the depression of inhibitory transmission31, (4) we maintained the membrane potential baseline between −60 and −70 mV, and (5) we had to add 100 nM gabazine to the bath of standard aCSF in order to slightly decrease IPSC amplitude (recordings started at least 15 min after the slice was placed in the bath, to ensure equilibrium was reached). Gabazine (SR-95531, Sigma-Aldrich) at 100 nM has been shown to correspond to a 70% availability of GABA-A receptors81 and, in our conditions, consistently decreased the mean amplitude of spontaneous IPSCs by 30% in voltage-clamp recordings of GCs and CA3 PCs held at −40 mV. These conditions allowed us to record for the first time the spiking output of CA3 PCs in response to complex input spiketrains while preserving some inhibition in the network31. PC output firing rates were on average below 10 Hz (Figs S7 and 8), but close to mean rates observed in vivo during behavior13,14,26.

Analysis of the output spiketrains

For each recording set, the similarity between pairs of spiketrains was computed as the Pearson’s correlation coefficient between the spiketrain rasters binned at a τw timescale. Sweeps without spikes were excluded from further analysis.
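
As an illustration of this similarity measure, the following minimal Python sketch bins two spiketrains at τw and computes their Pearson’s correlation coefficient. It is not the MATLAB code used for the study: the function names, the use of millisecond spike times and the convention of returning None for a pair containing an empty sweep are assumptions made for the example.

```python
import math

def bin_spiketrain(spike_times_ms, duration_ms=2000.0, tau_w_ms=10.0):
    """Count spikes in consecutive bins of width tau_w_ms (a binned raster)."""
    n_bins = int(math.ceil(duration_ms / tau_w_ms))
    counts = [0] * n_bins
    for t in spike_times_ms:
        if 0.0 <= t < duration_ms:
            counts[min(int(t // tau_w_ms), n_bins - 1)] += 1
    return counts

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length vectors.
    Raises ZeroDivisionError if one vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spiketrain_similarity(train_a, train_b, duration_ms=2000.0, tau_w_ms=10.0):
    """Similarity of two spiketrains; None if either sweep has no spikes
    (assumed convention mirroring the exclusion of empty sweeps)."""
    if not train_a or not train_b:
        return None
    return pearson(bin_spiketrain(train_a, duration_ms, tau_w_ms),
                   bin_spiketrain(train_b, duration_ms, tau_w_ms))
```

Changing tau_w_ms in this sketch changes the timescale at which two trains are considered similar, which is why τw is reported alongside each correlation value.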

We did not use separate protocols to assess the firing rates, probability of bursting and spike-wise noise parameters (spike probability, delay, jitter), but computed them directly for each recording set of spiketrains from a pattern separation experiment. The mean firing rate was computed as the average firing rate across all fifty output spiketrains. A burst was defined as the occurrence of more than one output spike in the interval between two consecutive input spikes (see Fig. S8).
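
The two quantities defined above can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, and treating the period after the last input spike as a final open-ended interval is an assumption that the text does not specify.

```python
from bisect import bisect_left

def mean_firing_rate_hz(sweeps, duration_s=2.0):
    """Average firing rate across sweeps (each sweep is a list of spike times)."""
    return sum(len(s) for s in sweeps) / (len(sweeps) * duration_s)

def count_bursts(input_times, output_times):
    """Number of inter-input intervals [t_i, t_i+1) that contain more than
    one output spike. The interval after the last input spike is treated
    as open-ended (assumption)."""
    inputs = sorted(input_times)
    outputs = sorted(output_times)
    bursts = 0
    for i, start in enumerate(inputs):
        end = inputs[i + 1] if i + 1 < len(inputs) else float("inf")
        n_spikes = bisect_left(outputs, end) - bisect_left(outputs, start)
        if n_spikes > 1:
            bursts += 1
    return bursts
```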

We define the spike-wise neural noise as the probability of spiking at least once after an input spike, the delay of an output spike after an input spike and its average jitter. To assess these parameters, we computed the cross-occurrence between input spikes and output spikes in a [−15 ms, 50 ms] interval with 1 ms bins. The resulting histogram of counts of output spikes occurring in the vicinity of an input spike was fitted with a Gaussian distribution N(µ,σ, baseline), where µ is the mean delay of an output spike and σ is the jitter of this delay. The baseline corresponds to the background firing, occurring by chance or caused by neighboring inputs. After subtracting the baseline and extracting the probability of spiking by dividing the counts of output spikes by the total number of input spikes, we defined the spike probability (SP) as the sum of probabilities of an output spike in the predefined time interval around an input spike (Figs 7b and S2a). For FS and HMC, spike-wise noise parameters were computed on a non-bursty dataset (nbFS and nbHMC) where all but the first spike in a burst were excluded. Spike-wise noise parameters were not computed for CA3 PCs and their GC control because the input frequency was too high to allow a good fit of the distribution.
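
The cross-occurrence histogram and the resulting spike probability can be sketched in Python as below. The Gaussian fit itself is not reproduced here: the baseline is passed in as a number that would come from that fit, and all function and argument names are assumptions made for this example.

```python
def cross_occurrence_histogram(input_times, output_times,
                               lo_ms=-15.0, hi_ms=50.0, bin_ms=1.0):
    """Counts of output spikes at each lag relative to every input spike,
    over the window [lo_ms, hi_ms) in bins of bin_ms."""
    n_bins = int(round((hi_ms - lo_ms) / bin_ms))
    counts = [0] * n_bins
    for t_in in input_times:
        for t_out in output_times:
            lag = t_out - t_in
            if lo_ms <= lag < hi_ms:
                idx = min(int((lag - lo_ms) // bin_ms), n_bins - 1)
                counts[idx] += 1
    return counts

def spike_probability(counts, baseline_per_bin, n_input_spikes):
    """Subtract the fitted baseline from each bin, divide by the number of
    input spikes, and sum over the window."""
    return sum(c - baseline_per_bin for c in counts) / n_input_spikes
```

In this sketch the mean delay µ and the jitter σ would be read from a Gaussian fitted to the returned counts, with the fitted offset supplying baseline_per_bin.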

Computational models

To assess the role of spike-wise neural noise in pattern separation, we generated two data sets. First, we simulated output spiketrains in response to our input sets (for each input set, we simulated ten output sets of fifty synthetic spiketrains). This simulation was entirely based on the average spike-wise noise parameters computed from the original GC recordings (see above): the matrix of input spike times was replicated ten times, and for each of the fifty resulting sweeps, spikes were deleted randomly following a binomial distribution B(Nspk, F), where Nspk is the number of input spikes in a sweep, and F the probability of not spiking (F = 1 − mean SP = 1 − 0.42). A random delay, sampled from a Gaussian distribution N(µ, σ), was added to each resulting spike time, with µ and σ being respectively the mean delay and mean jitter in the original recordings. The noise statistics of the resulting simulated data set are shown in Fig. S2c. Second, we created a surrogate data set by randomly shuffling the output spikes of the original GC recordings: the delay of each spike was conserved but it was relocated to follow a randomly selected input spike in the same input train (from a uniform distribution). This manipulation yielded a dataset with noise statistics very close to the original data (Fig. S2d). Using this strategy, we performed spike shuffling in each GC recording set a hundred times, yielding a dataset of 10,200 simulated recording sets, or, in other words, 100 datasets of 102 simulated recording sets directly paired to the original 102 GC recording sets.
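
Both procedures can be illustrated with a short Python sketch. It rests on several assumptions that are not taken from the original analysis code: spikes are deleted independently with probability F (which makes the number of deletions binomial), the default values of µ and σ are placeholders, the delay of an output spike is measured from the closest preceding input spike, and output spikes occurring before the first input spike are left in place.

```python
import random
from bisect import bisect_right

def simulate_output(input_times, sp=0.42, mu_ms=5.0, sigma_ms=1.0, rng=random):
    """Keep each input spike with probability SP, then add a Gaussian delay.
    mu_ms and sigma_ms are placeholder values, not those of the recordings."""
    out = []
    for t in input_times:
        if rng.random() < sp:  # deleted with probability F = 1 - SP
            out.append(t + rng.gauss(mu_ms, sigma_ms))
    return sorted(out)

def shuffle_output(input_times, output_times, rng=random):
    """Relocate each output spike after a uniformly chosen input spike of the
    same train while conserving its delay."""
    inputs = sorted(input_times)
    shuffled = []
    for t_out in output_times:
        i = bisect_right(inputs, t_out) - 1
        if i < 0:
            shuffled.append(t_out)  # no preceding input spike: left unchanged
            continue
        delay = t_out - inputs[i]
        shuffled.append(rng.choice(inputs) + delay)
    return sorted(shuffled)
```

Passing a seeded random.Random instance as rng makes a given surrogate reproducible, which is convenient when many shuffles of the same recording set are compared.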

To test whether probabilistic synaptic dynamics is a potential mechanism of temporal pattern separation, we used an Izhikevich model of a regular spiking neuron82 with an adapted Tsodyks-Markram (TM) synapse model83 designed to capture short-term plasticity of stochastic neurotransmission at the LPP-GC synapse. The TM synapse model consists of a system of two ordinary differential equations describing the dynamics of X and U. X is the probability of a presynaptic vesicle being available for release, the decrease of which leads to synaptic depression, and U is the probability of release for an available vesicle, the increase of which models facilitation:

$$\frac{dX(t)}{dt}=-U(t)\cdot X(t)\cdot \delta (t-{t}_{AP})+\frac{1-X(t)}{{\tau }_{rec}}$$
(1)
$$\frac{dU(t)}{dt}={U}_{SE}\cdot (1-U(t))\cdot \delta (t-{t}_{AP})-\frac{U(t)}{{\tau }_{facil}}$$
(2)

with X(0) = 1 and U(0) = 0, where τrec is the time constant of vesicle recovery after release, τfacil is the decay of U controlling synaptic facilitation, USE a factor determining the probability of release Pr at the time of the first spike, δ the Dirac delta function, and tAP the time of arrival of an input spike. The system of ODEs was solved using the ode23 MATLAB solver.
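
For illustration, the dynamics of X and U can also be followed spike by spike without a numerical ODE solver, because both variables relax exponentially between input spikes. The Python sketch below uses this event-driven shortcut instead of ode23. It assumes that, at each spike, U is incremented first and that the release probability U·X is then removed from X, so that release depletes available vesicles and the first spike has Pr = USE. The function name is hypothetical, and the parameter values in the usage note are those listed for model 2 later in this section.

```python
import math

def tm_release_probabilities(spike_times_ms, tau_rec_ms, tau_facil_ms, u_se):
    """Release probability Pr = U * X at each input spike (event-driven)."""
    x, u = 1.0, 0.0  # initial conditions X(0) = 1, U(0) = 0
    t_prev = None
    pr = []
    for t in spike_times_ms:
        if t_prev is not None:
            dt = t - t_prev
            # Between spikes: X recovers toward 1, U decays toward 0
            x = 1.0 - (1.0 - x) * math.exp(-dt / tau_rec_ms)
            u = u * math.exp(-dt / tau_facil_ms)
        u += u_se * (1.0 - u)  # facilitation jump at the spike (assumed first)
        p = u * x              # probability of release for this spike
        x -= p                 # depletion of available vesicles
        pr.append(p)
        t_prev = t
    return pr
```

For example, tm_release_probabilities([0, 10, 20], tau_rec_ms=100, tau_facil_ms=500, u_se=0.01) returns an increasing sequence, as expected for a facilitating synapse.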

Based on the quantal theory of neurotransmission, the model assumes that, at each time point t, k vesicles are released, drawn from a binomial distribution B(N, Pr), where N is the number of release sites and Pr is the probability of release defined as the product of X and U. The current I is then the product of k with the quantal size q. Based on previous data from our lab, we considered q = 20 pA62. The version of the TM model we implemented only focuses on the dynamics of the presynaptic vesicles, ignoring postsynaptic dynamics. To match the decay of real EPSCs recorded at the soma, I was convolved with a template EPSC shape:

$$-A\cdot ({e}^{-T/{\tau }_{rise}}-{e}^{-T/{\tau }_{decay}})$$
(3)

where T = 50 ms, τrise = 0.5 ms, τdecay = 5 ms and A is a factor normalizing the area of the resulting shape to 10. In addition, a synaptic delay of 4 ms was introduced. The resulting current (Fig. 9b,c) is then passed to a non-dimensional regular spiking Izhikevich neuron, an acceptable model for GCs84. We used the following parameters: a = 0.02, b = 0.2, c = −65 mV, d = 6, time resolution = 0.1 ms, max voltage = 30 mV, initial values V(0) = −70 mV and x(0) = b.V(0).
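
The chain from release probability to output spikes can be sketched as follows. This Python illustration is not the original MATLAB implementation and makes several labeled assumptions: the template area is taken as its time integral, the convolution is done by adding a scaled template at each delayed spike time, the neuron is integrated with a forward Euler scheme using the standard Izhikevich equations, and the current is divided by a scaling constant (k_scale) before entering the neuron. All function names are hypothetical.

```python
import math
import random

DT = 0.1  # ms, time resolution given in the text

def epsc_template(t_max_ms=50.0, tau_rise=0.5, tau_decay=5.0, area=10.0):
    """Template EPSC shape, scaled so that its time integral equals `area`
    (interpreting "area" as the integral is an assumption)."""
    n = int(round(t_max_ms / DT)) + 1
    shape = [-(math.exp(-i * DT / tau_rise) - math.exp(-i * DT / tau_decay))
             for i in range(n)]
    a = area / (sum(shape) * DT)
    return [a * s for s in shape]

def synaptic_current(spike_times_ms, release_probs, duration_ms,
                     n_sites=40, q_pa=20.0, delay_ms=4.0, rng=random):
    """Sum of template EPSCs, each scaled by k * q with k ~ B(N, Pr)."""
    n = int(round(duration_ms / DT))
    current = [0.0] * n
    template = epsc_template()
    for t, pr in zip(spike_times_ms, release_probs):
        k = sum(rng.random() < pr for _ in range(n_sites))  # binomial draw
        start = int(round((t + delay_ms) / DT))
        for j, w in enumerate(template):
            if start + j >= n:
                break
            current[start + j] += k * q_pa * w
    return current

def izhikevich_spike_times(current, k_scale, a=0.02, b=0.2, c=-65.0, d=6.0,
                           v_max=30.0, v0=-70.0):
    """Regular spiking Izhikevich neuron driven by current / k_scale."""
    v, u = v0, b * v0
    spikes = []
    for i, i_syn in enumerate(current):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + i_syn / k_scale
        du = a * (b * v - u)
        v += DT * dv
        u += DT * du
        if v >= v_max:  # spike: record time, then reset
            spikes.append(i * DT)
            v = c
            u += d
    return spikes
```

With these parameters the neuron rests at −70 mV in the absence of input, so any output spike in the sketch is caused by the stochastic synaptic current.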

To optimize the non-fixed parameters of the TM-Izhikevich model, we simulated output spiketrains in response to ten repetitions of a single input set (10 Hz, Rinput = 0.76) and compared to the current dynamics and spiking behavior of a representative GC that was recorded both in voltage-clamp and current-clamp in response to the same input set. For the TM model, the following parameters were adjusted to match the dynamics and the peak of the current averaged over ten repetitions, as well as the variability in the peak amplitude of the first EPSC: τrec = 500 ms, τfacil = 9 ms, USE = 13, N = 40 release sites. Finally, because the Izhikevich model integrates non-dimensional currents, the input current had to be scaled using the dividing constant K = 62, such that the average firing rate of the simulation matched the average firing rate of the representative GC to the same input set (3.5 Hz). K can be considered as a constant modelling a tonus of inhibition: the higher K is, the more difficult it is for an EPSC to make the Izhikevich neuron fire. In the end, the standard deviation of the firing rate was ~0.6 Hz for simulations, compared to 0.8 Hz for the original recording. The only source of variability was from presynaptic dynamics of vesicle release modelled as a binomial process.

To determine the influence of synaptic transmission parameters on pattern separation, we compared two TM-Izhikevich models with different parameters. Model 1 has exactly the same parameters as described above except that, in order to model the presence of gabazine, the inhibition constant K was set at 40: with this value, Model 1 responses to 30 Hz input trains have a FR ~ 7 Hz, corresponding to the mean FR of real GCs recorded under partial inhibitory block (Fig. S7). For model 2, q = 29 pA, τrec = 100 ms, τfacil = 500 ms, USE = 0.01, K = 57, and all other parameters were the same as for model 1. USE, q and N (kept at 40) were chosen based on estimates from the literature on mossy fiber boutons, the giant GC-CA3 PC synapses31,44. τrec, τfacil and K were adjusted to model short-term synaptic facilitation observed at mossy fiber boutons and to match the average firing rate that we observed in our CA3 current-clamp recordings made under gabazine (~7 Hz, Fig. S7). In order to specifically study the impact of pre-synaptic dynamics, post-synaptic (i.e. EPSC shape) and Izhikevich parameters were kept the same for all models. Note that none of our models is intended to closely match spiking behaviors observed in real GCs or CA3 PCs.

Software and statistics

Data analysis was performed using custom-written routines in MATLAB (2017a), including functions from toolboxes cited above. Sample sizes were chosen based on the literature and estimations of the variance and effect size from preliminary data. All values are reported as mean ± S.E.M. unless otherwise noted. The one-sample Kolmogorov-Smirnov test was used to verify the normality of data distributions. Parametric or non-parametric statistical tests were used as appropriate to assess significance (p-value < 0.05). Assumptions on equal variances between groups were avoided when necessary. All T and U tests were two-tailed. To determine whether two distributions of data points are significantly different (e.g. Routput as a function of Rinput, for GC compared to FS, see Figs 3–6 and S5, S8), we performed an analysis of covariance (ANCOVA) using linear regressions on the two data sets as well as on the combined data set, and assessed significance via an F-test comparing the goodness of fits85. Because Rinput can also be considered as a categorical variable, we performed a two-way ANOVA before using post-hoc tests correcting for multiple comparisons in order to determine at which Rinput groups two conditions were significantly different. When comparing FSs, HMCs or CA3 with GCs, different GCs were used as control and were recorded under different protocols, which is why we did not need to control for multiple comparisons across cell types. In order to determine whether distributions were significantly different in the case of our spike shuffling analysis (Fig. 7), we designed Monte-Carlo exact hypothesis tests86. Table S4 provides details on all statistical tests conducted in this study.
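
As an illustration of the regression-based comparison, the Python sketch below computes an F statistic that contrasts one line fitted to the combined data with two lines fitted separately. It uses the common extra-sum-of-squares form with 2 and n1 + n2 − 4 degrees of freedom, which is an assumption about the exact test that was run. The function names are hypothetical, and converting F to a p-value requires the F distribution, which is not included here.

```python
def linear_fit_rss(x, y):
    """Residual sum of squares of the least-squares line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

def regression_comparison_f(x1, y1, x2, y2):
    """F statistic comparing a single shared line (combined data) with two
    separate lines; returns (F, df_numerator, df_denominator)."""
    rss_separate = linear_fit_rss(x1, y1) + linear_fit_rss(x2, y2)
    rss_combined = linear_fit_rss(list(x1) + list(x2), list(y1) + list(y2))
    df_num = 2                       # two extra parameters in the separate fits
    df_den = len(x1) + len(x2) - 4   # residual df of the separate fits
    f_stat = ((rss_combined - rss_separate) / df_num) / (rss_separate / df_den)
    return f_stat, df_num, df_den
```

A large F indicates that fitting the two groups separately reduces the residual error far more than expected by chance, that is, that the two relationships between Rinput and Routput differ.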