## Abstract

State-of-the-art techniques allow researchers to record large numbers of spike trains in parallel for many hours. With enough such data, we should be able to infer the connectivity among neurons. Here we develop a method for reconstructing neuronal circuitry by applying a generalized linear model (GLM) to spike cross-correlations. Our method estimates connections between neurons in units of postsynaptic potentials and the amount of spike recordings needed to verify connections. The performance of inference is optimized by counting the estimation errors using synthetic data. This method is superior to other established methods in correctly estimating connectivity. By applying our method to rat hippocampal data, we show that the types of estimated connections match the results inferred from other physiological cues. Thus our method provides the means to build a circuit diagram from recorded spike trains, thereby providing a basis for elucidating the differences in information processing in different brain regions.

## Introduction

Over the past decade it has become possible to record from much larger numbers of neurons than in the past^{1,2,3,4,5}, even though this number is still a mere shadow of the total number of neurons present. The premise behind collecting these large data sets is that this could lead to improvements in correlating neuronal activity with specific sensations, motion, or memory, and possibly lead to improvements in adaptation and learning as well^{6,7,8,9,10}.

Having such large data sets leads to difficulties in handling the data and interpreting the results. There are two main approaches to handle large amounts of recording data. In the first approach, researchers have developed methods to reduce dimensionality while minimizing the loss of information^{11,12,13}.

The second approach, which we take here, is to use all of the data to carry out mesoscopic neuroanatomy, that is, to reveal the fine neuronal circuitry in which neural circuit computation is carried out. From these high channel count recordings, one should be able to estimate neuronal connectivity by quantifying the degree to which firing from a given neuron is influenced by the firing of neurons from which the index neuron is receiving input^{14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30}. For this purpose, we develop an analytical tool that estimates neuronal connectivity in measurement units of postsynaptic potentials (PSPs). In this study we also investigate how much data are needed to reliably estimate the connections between pairs of neurons. Because reconstructing connectivity is not guaranteed to reflect anatomical connectivity^{31,32,33}, we evaluate the accuracy of estimation by directly comparing the estimated connections with the true connections, using synthetic data generated by simulating a network of Hodgkin–Huxley (HH)-type neurons or a large network of leaky integrate-and-fire (LIF) neurons. Finally, we apply this method to spike trains recorded from rat hippocampus. For the experimental data, we compare our estimates of whether an innervating connection is excitatory or inhibitory with the results obtained by manually analyzing other physiological information such as spike waveforms, autocorrelograms, and mean firing rate.

## Results

### Estimating neuronal connections

To estimate neuronal connectivity between each pair of neurons, we obtain the cross-correlation (CC) by collecting spike times of a neuron measured relative to every spike of a reference neuron (Fig. 1a). We search the CC for evidence of a monosynaptic impact lasting a few milliseconds using the generalized linear model (GLM). Here, neuronal connectivity is detected by fitting a coupling filter, while slow, large-scale wavy fluctuations that are often present in recorded spike trains are absorbed by adapting the slow part of the GLM. We call our method “GLMCC” (METHODS).
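As an illustration of how a cross-correlogram can be assembled from two spike trains, the following minimal sketch bins the spike times of one neuron relative to every spike of a reference neuron. The function name, the 50 ms half-width, and the 1 ms bin width are assumptions made for this example and are not taken from the authors' implementation.

```python
from bisect import bisect_left, bisect_right


def cross_correlogram(ref_spikes, target_spikes, window=0.05, bin_width=0.001):
    """Count target spikes at each lag relative to every reference spike.

    Spike times are in seconds and both lists are assumed to be sorted.
    `window` is the half-width of the correlogram (hypothetical default).
    """
    n_bins = int(round(2 * window / bin_width))
    counts = [0] * n_bins
    for t_ref in ref_spikes:
        # Restrict the search to target spikes within +/- window of t_ref.
        lo = bisect_left(target_spikes, t_ref - window)
        hi = bisect_right(target_spikes, t_ref + window)
        for t in target_spikes[lo:hi]:
            idx = int((t - t_ref + window) / bin_width)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    return counts


# Two reference spikes, each followed 2.5 ms later by a target spike.
counts = cross_correlogram([1.0, 2.0], [1.0025, 1.9, 2.0025])
```

A short-latency excitatory connection would appear as a narrow peak just to the right of the central bin, and the GLM described in METHODS is then fitted to these relative spike times.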

### Criterion for the presence of connections

A neuronal connection is considered significant when the estimated parameter falls outside the confidence interval of a given significance level for the null hypothesis that the connection is absent. If the parameter remains within the confidence interval, the state of the connection is undetermined (METHODS).

The number of pairs considered to be connected will depend on the significance level \(\alpha\) and on the strength of the correlation. Estimation methods presume connections as if they were all direct ones, causing strong indirect influences to be purported direct connections. Neurophysiologists often try to avoid these false positives (FPs) by shifting the significance level to small values, that is, by moving \(\alpha\) to very stringent levels. However, being conservative about FPs means that existing connections important for information processing will be missed, thereby producing many false negatives (FNs).

To capture the manner in which the numbers of FPs and FNs change with the level of conservatism used for estimating connections, we applied our inference model to spike trains obtained from a network of HH neurons, in which the true anatomical connectivity is known. With this knowledge, we searched for the optimal level of conservatism or the significance level that may balance the conflicting demands for reducing FPs and FNs.

Our simulation used a network of 1000 HH neurons consisting of 800 excitatory and 200 inhibitory neurons (cf. Fig. 1b). In the simulation, excitatory neurons innervated 12.5% of other neurons with excitatory postsynaptic potentials (EPSPs). These excitatory connections were log-normally distributed^{34,35,36,37} (Fig. 1c). Inhibitory neurons randomly innervated 25% of other neurons with inhibitory postsynaptic potentials (IPSPs). These inhibitory connections were normally (Gaussian) distributed^{38}. We simulated the network for a period representing 5400 s (90 min) with step sizes of 0.01 and 0.001 ms for excitatory and inhibitory neurons, respectively (METHODS). Our simulation reproduced irregular neuronal firing and a skewed distribution of firing rates, which are consistent with balanced state network models^{39} (Supplementary Fig. 1).

To illustrate the performance of estimating connections, we sampled 20 neurons out of the entire population. Figure 2a shows the estimated connection matrices obtained using different significance levels, in reference to the true connectivity. Here we have not considered weak excitatory connections whose EPSPs are smaller than 1 mV, because the amount of spike recording is insufficient for identifying connections of this level. The connection matrix is divided into four quadrants representing connections between inhibitory–excitatory, excitatory–excitatory, excitatory–inhibitory, and inhibitory–inhibitory neurons. True connections for the second and third quadrants are excitatory, and those of the fourth and first quadrants are inhibitory. For \(\alpha =0.01\), too many false connections were assigned to pairs of neurons; there were 15 false connections (4.3%) in this sample. At the other extreme, all FPs can be excluded by decreasing the significance level (down to \(\alpha =10^{-24}\)). In the latter case most existing connections are lost, and a large number of FNs arise; 22 among 32 existing connections (69%) are missed in this example. The numbers of FPs and FNs for excitatory and inhibitory categories are shown below the connection matrices, indicating that the total number of FPs and FNs may be minimized between these extreme cases.

To balance the FPs and FNs simultaneously, we selected the significance level that maximized the Matthews correlation coefficient (MCC)^{27,40}. The significance level was set to \(\alpha\, =\,0.001\) (Fig. 2b). Although false connections remain, the neuronal circuit was most accurately reconstructed with \(\alpha \,=\,0.001\). We adopted \(\alpha\, =\,0.001\) throughout the following analyses.

### Duration of spike recording

The necessary duration of spike recording can be estimated even without fitting the statistical model to the spike trains. This is because the distribution of the connection parameter for the null hypothesis is obtained solely in terms of the observation interval (\(T\)) and the firing rates of the pre and postsynaptic neurons (\({\lambda }_{{\rm{pre}}}\) and \({\lambda }_{{\rm{post}}}\)) (METHODS). The confidence interval of the connection parameter (\(J\)) is

\[
0\pm c\,{(\tau T{\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}})}^{-1/2}, \tag{1}
\]

where \(\tau\) is the time scale of synaptic impact, which is chosen by maximizing the model likelihood: \(\tau\, =\,0.004\) s for the simulation data and \(\tau \,=\,0.001\) s for the rat hippocampal data. The coefficient \(c\) is given as 5.16 for \(\alpha \,=\,0.001\).

We assume that the connection parameter \(J\) is proportional to the PSP, \(w\) mV^{41}:

\[
J=aw. \tag{2}
\]

The coefficient \(a\) is determined using synthetic data as \(a\,=\,0.39\) for the EPSP and \(a\,=\,1.57\) for the IPSP. By combining this with Eq. (1), the duration of spike recording needed to determine the likely presence of a connection with PSP \(w\) is given as

\[
T\, > \,\frac{{c}^{2}}{\tau {a}^{2}{w}^{2}{\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}}}. \tag{3}
\]

According to the coefficient \(a\), which is larger for IPSP than for EPSP, the inhibitory connection is detected more easily than the excitatory connection, given the same PSP \(|w|\). This is in conflict with the results of some other studies^{16,42,43}. The disagreement is due to the difference in simulation models; in our simulation model, the time scale of the inhibitory synapse is chosen to be longer than that of the excitatory synapse on the basis of physiological experiments^{44,45}. Accordingly, the inhibitory response is slower and has a larger integrated effect than the excitatory response. Our GLMCC should be able to properly detect the overall integrated effect (Supplementary Fig. 2).

To make a reliable inference, in addition to the above relation, it is also necessary to have collected a sufficiently large number of spikes during the interaction time window on the order of a few milliseconds. Here we require (METHODS):

\[
T\, > \,\frac{10}{\tau {\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}}}. \tag{4}
\]

Table 1 shows the results for several combinations of firing rates and assumed PSPs using \(\alpha =0.001\). Unsurprisingly, detecting a weak connection for a low-firing neuron requires gathering data for a long period of time. Figure 3a shows the connections estimated with different observation time windows, illustrating how weak connections become visible as the recording duration increases.
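The two requirements on the recording duration can be evaluated directly. The sketch below assumes that inequality (3) has the form \(T > {c}^{2}/(\tau {a}^{2}{w}^{2}{\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}})\), as follows from combining Eq. (1) with \(J=aw\), and that inequality (4) reads \(T > 10/(\tau {\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}})\), as stated in METHODS. The function name and argument names are hypothetical.

```python
def required_duration(rate_pre, rate_post, psp_mv, a, tau, c=5.16):
    """Lower bound on the recording duration T in seconds.

    rate_pre, rate_post: firing rates in Hz; psp_mv: assumed PSP |w| in mV;
    a: PSP-to-J coefficient; tau: synaptic time scale in seconds;
    c: confidence coefficient (5.16 corresponds to alpha = 0.001).
    """
    rate_product = rate_pre * rate_post
    # Assumed form of inequality (3): the connection must exceed the null interval.
    t_detect = c ** 2 / (tau * a ** 2 * psp_mv ** 2 * rate_product)
    # Assumed form of inequality (4): more than 10 spikes in the interaction window.
    t_count = 10.0 / (tau * rate_product)
    return max(t_detect, t_count)


# Example: 1 mV EPSP between two neurons firing at 1 Hz (simulation parameters).
t_epsp = required_duration(1.0, 1.0, 1.0, a=0.39, tau=0.004)
```

Because the bound scales with the inverse of the product of the firing rates, doubling both rates reduces the required duration to a quarter.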

### Estimating PSPs

We believe that our method is of particular interest because it couches the connections in terms of PSPs for the individual neuronal pairs. Figure 3b compares the estimated PSPs (\(\hat{w}\)) against the true values (\(w\)) from the numerical simulation. Here we represent \(\hat{w}\,=\,0\) if the connection is undetermined, i.e., not significant. Thus, unconnected links (\(w\,=\,0\)) that were classified as undetermined (true negatives) are placed at the origin. Points lying on the nonzero \(x\)-axis are existing connections that were not detected. Points lying on the nonzero \(y\)-axis are the functional or virtual connections that were estimated for unconnected pairs. The points in the first and third quadrants represent true positives, or existing connections whose signs were correctly inferred as excitatory or inhibitory, respectively. The points in the second and fourth quadrants are existing connections whose signs were misclassified.

The number of nonzero connections increases with the recording duration. Existing connections with large PSP amplitude tend to be detected with the signs correctly identified (points in the first and third quadrants). There are also virtual connections assigned for unconnected pairs (nonzero \(y\)-axis). The number of such FPs is larger than the expected number of statistical errors (Fig. 2a). This implies that the false connections may not be mere statistical fluctuations, but rather that they may reflect functional connectivity mediated indirectly by other unobserved neurons.

Figure 3c demonstrates the way individual connections emerge by increasing the recording duration. Here the abscissa is the observation window (\(T\)) multiplied by the firing rates of the pre and postsynaptic neurons (\({\lambda }_{{\rm{pre}}}\) and \({\lambda }_{{\rm{post}}}\)) so that all data are organized into a unified formula (inequality (3)). The values of \(T{\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}}\) for the excitatory connections tended to be smaller than those of inhibitory connections, because the firing rates of excitatory neurons were typically lower than those of inhibitory neurons.

### Excitatory–inhibitory (E–I) dominance index

The probability of misassigning individual connectivity for unconnected pairs tends to be higher than the statistical significance level, because their firing is generally correlated with each other due to indirect interactions through unobserved neurons. Nevertheless, excitatory and inhibitory characteristics of individual neurons can be inferred with a lower error rate, because we can refer to multiple connections for each neuron.

We define an excitatory–inhibitory (E–I) dominance index as

\[
{d}_{{\rm{ei}}}=\frac{{n}_{{\rm{e}}}-{n}_{{\rm{i}}}}{{n}_{{\rm{e}}}+{n}_{{\rm{i}}}}, \tag{5}
\]

where \({n}_{{\rm{e}}}\) and \({n}_{{\rm{i}}}\) represent the numbers of identified excitatory and inhibitory connections projecting from each neuron, respectively. The E–I dominance indexes computed for 2 networks of 80 neurons each are plotted against firing rates of neurons (Fig. 4a). In this case, excitatory and inhibitory characteristics of individual neurons were well-identified based on E–I dominance indexes. Inhibitory neurons typically exhibited higher firing rates in comparison to excitatory neurons. The firing irregularity measured using the local variation (\(Lv\)) of interspike intervals^{46,47} is plotted against firing rate. Spiking of inhibitory neurons tended to be more regular (smaller \(Lv\)) than that of excitatory neurons.
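A minimal sketch of the index computation is given here, assuming that the index is the normalized difference \(({n}_{{\rm{e}}}-{n}_{{\rm{i}}})/({n}_{{\rm{e}}}+{n}_{{\rm{i}}})\), which is positive for putative excitatory neurons and negative for putative inhibitory neurons. Returning `None` for a neuron with no identified outgoing connections is a choice made for this example, not something specified in the text.

```python
def ei_dominance_index(n_exc, n_inh):
    """E-I dominance index from counts of identified outgoing connections.

    Returns None when no outgoing connection was identified, because the
    index is undefined in that case (a convention assumed for this sketch).
    """
    total = n_exc + n_inh
    if total == 0:
        return None
    return (n_exc - n_inh) / total


# A neuron with three identified excitatory and one inhibitory projection.
d_ei = ei_dominance_index(3, 1)
```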

If we can record many spike trains in parallel for a long time, many excitatory and inhibitory neurons may be correctly identified according to \({d}_{{\rm{ei}}}\,> \,0\) and \({d}_{{\rm{ei}}}\,<\,0\), respectively. Figure 4b illustrates the manner in which the ratio of such correct identification depends on the total number of spike trains and the duration of observation.

### Real spike trains

We apply our method to spike trains recorded from the hippocampal CA1 area of a rat while it was exploring a square open field (hc-3 data sets in Collaborative Research in Computational Neuroscience (CRCNS))^{48}. Figure 5a displays the connections obtained with different observation time windows, demonstrating that more connections become visible as the recording duration increases, similar to the results seen with synthetic data. The connection matrix is divided into four quadrants according to the putative classification performed by manually analyzing waveforms, autocorrelograms, and mean firing rates^{49,50,51}. We observe that connections in the third, fourth, and first quadrants of the connectivity matrix, representing the excitatory–inhibitory, inhibitory–inhibitory, and inhibitory–excitatory zones, respectively, are detected in a relatively short observation window. This is consistent with inequality (3), given that inhibitory neurons typically fire at high rates, though inhibitory neurons are not necessarily a uniform population^{52}. Connections in the second quadrant, representing the excitatory–excitatory zone, only appear after increasing the observation time window, and the estimated connection pattern remains sparse; more connections might have been identified if the observation period had been even longer. However, the estimated connection pattern is consistent with the finding using intracellular recording in vitro that inter-pyramidal connections in the hippocampus CA1 are sparse^{53}.

Figure 5b shows CCs of several neuron pairs (see Supplementary Fig. 3 for all the detected pairs). Here, we have excluded spike records at an interval of \(\pm 1\) ms in the cross-correlogram, because near-synchronous spikes were not detected in the experiment due to the shadowing effect^{54}. The CCs become less noisy as the observation time increases, and some connections resolved (8–7, 13–3, 14–7, and 15–8). Some real spike trains exhibited large-scale wavy fluctuations (13–11), which may suggest that these neurons are under the influence of brain activity with lagged phases or perhaps they were responding to some unidentified external stimulus. Our method absorbs these fluctuations by adapting the slow part of the GLM (demonstrated as light green lines in Fig. 5b), and succeeds in detecting a tiny impact by fitting coupling filters (lines colored magenta, cyan, and gray, respectively represent excitatory, inhibitory, and undetermined connections in Fig. 5b).

In Fig. 5c, we plotted the E–I dominance index (\({d}_{{\rm{ei}}}\)) and the firing irregularity (\(Lv\)) against the firing rate. The E–I dominance index is roughly consistent with the putative excitatory and inhibitory neurons. The irregularity of the putative excitatory neurons tended to be higher (larger \(Lv\)) than that of the inhibitory neurons, similar to what we observed with the simulation data. The good separation of the putative excitatory and inhibitory neurons in these plots implies that we can classify recorded cells into excitatory and inhibitory neurons reliably without having to rely on their waveforms, as the E–I dominance index, firing irregularity, and firing rate are obtained solely from the spike times.
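Since the firing irregularity is computed from spike times alone, it can be sketched in a few lines. The example below assumes the standard definition of the local variation, \(Lv=\frac{3}{n-1}{\sum }_{i=1}^{n-1}{\left(\frac{{I}_{i}-{I}_{i+1}}{{I}_{i}+{I}_{i+1}}\right)}^{2}\), where \({I}_{i}\) are consecutive interspike intervals and \(n\) is their number; the function name is hypothetical.

```python
def local_variation(spike_times):
    """Local variation Lv of interspike intervals (ISIs).

    Lv is 0 for perfectly regular firing and is expected to be close to 1
    for a Poisson process; larger values indicate burstier firing.
    """
    isis = [t2 - t1 for t1, t2 in zip(spike_times, spike_times[1:])]
    n = len(isis)
    if n < 2:
        raise ValueError("at least three spikes are needed to compute Lv")
    # Compare each ISI with the next one, normalized by their sum.
    total = sum(((i1 - i2) / (i1 + i2)) ** 2 for i1, i2 in zip(isis, isis[1:]))
    return 3.0 * total / (n - 1)


lv_regular = local_variation([0.0, 1.0, 2.0, 3.0, 4.0])
lv_alternating = local_variation([0.0, 1.0, 4.0, 5.0, 8.0])
```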

We also analyzed a set of spike trains recorded simultaneously from multiple regions, including CA1 and the entorhinal cortex (EC). Figure 5d shows a matrix of estimated connections among excitatory and inhibitory neurons in CA1 and EC. Though the number of inter-regional connections was small in this sample data, our analysis method is generally applicable to any set of spike trains, irrespective of the recorded areas.

### Comparison with other methods

We compared our method with the conventional CC method^{16} and the jittering method^{25} by applying these methods to synthetic and biological data. With the synthetic data, we can compare the inferred connectivity with the true connectivity (Fig. 6a). Here, as in Fig. 2a, we have not shown excitatory connections smaller than 1 mV in the true connectivity matrix, because they are unlikely to be detected in a 90 min recording. The relative performance of the analysis methods is unchanged even if the smaller EPSPs are included. The conventional CC analysis tended to produce a number of FPs, revealing a vulnerability to fluctuations in cross-correlograms. In contrast, the jittering method avoided making FPs, but missed many existing connections, in particular inhibitory connections. This result may have occurred because the decrease in the firing rate induced by an inhibitory interaction is slower than an impulsive response to an excitatory stimulus; the jittering method counts spikes in each bin and tends to overlook a slower modulation in the firing rate. The number of false connections was 88, 27, and 13 for the conventional CC method, the jittering method, and the GLMCC method, respectively, indicating the superiority of the present method. We also examined the manner in which the number of errors varies with the firing rate of neurons, and found that the estimation error increases with the firing rates (Supplementary Fig. 4).

We also compared the connections estimated from the real biological data recorded from the hippocampus of a rat (Fig. 6b). The conventional CC method and the jittering method suggested many (false) excitatory connections from putative inhibitory neurons to other neurons. With the GLMCC, most of the detected inhibitory connections in the hippocampal data are from inhibitory to inhibitory or from inhibitory to excitatory neurons, consistent with the low numbers of FPs and FNs that this method produced for inhibitory connections in synthetic data.

### Testing with large-scale simulations

We have tuned the GLMCC method using synthetic data of a network of 1000 HH neurons and assessed the estimation performance. We have also tested the method with simulation data of different inhibitory connectivities and those generated by LIF neurons^{29}, and confirmed that the method estimates the connectivity accurately for these data as well (Supplementary Fig. 5). In the original simulation, 1000 HH neurons are densely connected, with excitatory neurons projecting EPSPs to 12.5% of other neurons. However, the effective connectivity is rather sparse, because the EPSPs are log-normally distributed and the majority of them are weak. Accordingly, the number of effective connections each neuron receives is not large in a network of this size.

Considering the realistic situation in which each neuron is receiving strong connections from a number of neurons, we carried out simulations of a larger scale network consisting of 10,000 LIF neurons using the NEST simulator^{55} (Supplementary Note 1 and Supplementary Tables 1, 2, and 3). By performing simulations of different connection densities, we examined the manner in which the number of false estimates varies with the number of connections. Figure 7 demonstrates the proportions of FPs and FNs counted for each pair of neurons, indicating the stable estimation of the GLMCC method and its superiority to other existing methods. Sample connectivity matrices are presented in Supplementary Fig. 6.

## Discussion

We have presented a method for reconstructing neuronal circuitry from multichannel extracellular neuronal recordings. This method, based on a combination of the GLM and CC, can balance the antagonistic demands for reducing FPs and FNs when estimating neuronal connectivity. Our method is tolerant of the large variations in firing activity that often occur in vivo. As a critical part of the method, we show a framework for estimating the necessary duration of the spike recordings so that any likely neuronal connections are detected. The duration is presented in terms of the firing rates of the pre- and postsynaptic neurons, and the presumed PSP.

It would be ideal to be able to estimate individual connections using intracellular or patch clamp recordings where the postsynaptic current caused by presynaptic neuronal firing can be measured, as is done with recordings from the rat cortex^{34,37}. While those methods can reliably detect synaptic connections, they are limited because only a few neurons can be recorded simultaneously.

With the recent increase in parallel high channel count extracellular recordings from anaesthetized and behaving animal subjects^{1,2}, it is possible to estimate the connection strength between a number of neurons^{20,28}. Several strong analytical methods for estimating connections from spike trains have been developed, including the CC analysis^{14,15,18,21} and the GLM^{8,19,23,27,29}. While CCs have been used to estimate neuronal connectivity, this classical CC analysis becomes unreliable when there are large fluctuations in the data. One approach to solving this problem has been to jitter the time stamps of spikes^{25,28}. We tested the performance of the conventional CC method and the jittering method in estimating connectivity using synthetic data, and found that our GLMCC performed better than conventional methods (Fig. 6).

Another approach has been to apply GLM to parallel spike trains. However, the size of the computation increases as the recording time increases. Because the number of neuronal pairs increases by the square of the number of spike trains (e.g., 10,000 pairs should be examined for 100 parallel spike trains), computation for estimating individual connections of each pair should be modest. Our analysis can be conducted with a reasonable computation time with amounts of data that can reasonably be collected, as our GLM analyses the CC for a time window of 100 ms rather than the entire spike trains. Our GLMCC may also adapt to wavy fluctuations in CC, making it tolerant to large-scale fluctuations that are often attendant on real spike trains in vivo (cf. Fig. 5b). There could also be fluctuations on an even longer time scale. There are several methods for processing such nonstationarity, including the state-space models^{56,57} or the Gaussian process^{58}. Such slow fluctuations may induce variation in the CC amplitude, but they would not appear in the averaged cross-correlogram \(c(t)\) in an interval of 100 ms in our framework.

In general, biological data are accompanied by large nonstationary fluctuations; the neuronal firing rate may change according to behavioral context, and individual neurons may even appear or disappear because of unstable recording. To examine whether our method provides consistent estimates of neuronal connections, we split the recordings in half and compared the connections estimated from each half (Supplementary Fig. 7). We found that the estimated connections exhibited significant overlap between the first and second halves. Thus, our inference method provides consistent results, not only for synthetic data, but also for experimental data. It would be interesting to test our estimates against biological connectivity obtained by the latest experimental techniques such as intracellular current injection^{59} or optogenetic control^{60}.

Because recording time is limited, a possible restriction on inferring connectivity could be that there is not enough data. Here we made estimates on the duration of spike recordings needed so that any likely neuronal connections would be detected (cf. Table 1). It should be noted that the limit given in Eq. (3) or Table 1 is not due to a limitation of our method, but it is an essential limitation caused by the sparse firing itself. Even if a given neuron fired several times with each spike occurring shortly after the firing of an index neuron, such evidence may not be sufficient to confirm the presence of a synaptic connection. Thus, enough data are needed so that spike co-occurrence becomes statistically significant^{28}.

When we applied our method to data recorded from the rat hippocampus we identified connections for four types of pairs including excitatory–excitatory, excitatory–inhibitory, inhibitory–inhibitory, and inhibitory–excitatory. These numbers were consistent with those identified physiologically^{51}, supporting the efficacy of our method. Typically, the pyramidal neurons have low background firing rates and interneurons have higher firing rates. Our analysis (cf. inequality (3)) indicates that the necessary recording duration is inversely proportional to the product of the firing rates of the pre and postsynaptic neurons. Thus, connections between neurons firing at high frequencies can be detected with a relatively short observation duration. In contrast, for neurons with low firing rates, data will have to be collected for much longer periods, and we expect that excitatory–excitatory connections will be detected only if there is a relatively long recording period. The consequences of this have been seen with experimental data; for instance, synapses that connect with inhibitory interneurons were frequently detected, and connections between excitatory neurons were rarely detected^{20,61}. The hippocampal data analyzed in this study (Fig. 5a) conform to this pattern, and our analysis provides insight into how this happens. Our approach and method provide a means for estimating a map of neuronal connections from high channel count simultaneous recordings. We presume, based on anatomical differences, that these maps will have different structures in different functional brain regions. Having a reliable technique for estimating the maps offers the opportunity to identify these different structures, thereby providing a basis for understanding the variations in information processing that arise from differences in anatomy and connected structures.

## Methods

### Estimating neuronal connectivity

Here we describe our GLM analysis, the basis of validating connections and selecting the significance level, and the method of estimating the PSP.

#### GLMCC

To discover neuronal connections between a pair of neurons, we devise a GLM that detects short-term synaptic impacts in the CC (as schematically depicted in Fig. 1a and as real cross-correlograms of rat hippocampal data in Fig. 5b). We designed the GLMCC as

\[
c(t)=\exp \left(a(t)+{J}_{12}f(t)+{J}_{21}f(-t)\right), \tag{6}
\]

where \(t\) is the time from the spikes of the reference neuron, and \(a(t)\) represents large-scale fluctuations produced outside of the pair of neurons. \({J}_{ij}\) represents neuronal connection from the \(j\)th neuron to the \(i\)th neuron. The time profile of the synaptic interaction is modeled as \(f(t)\,=\,\exp (-\frac{t\,-\,d}{\tau })\) for \(t\ > \ d\) and \(f(t)\,=\,0\) otherwise, where \(\tau\) is the typical time scale of synaptic impact and \(d\) is the transmission delay. The connection parameter \({J}_{ij}\) of our GLMCC can be derived from a model of the original interaction process between neurons (Supplementary Note 2).
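To make the role of each term concrete, the following sketch evaluates the model intensity under the assumption that the GLMCC takes the form \(c(t)=\exp (a(t)+{J}_{12}f(t)+{J}_{21}f(-t))\), with \(f\) as defined above. The function names and the 1 ms delay are placeholders chosen for illustration; \(\tau =0.004\) s is the value quoted for the simulation data.

```python
import math


def synaptic_kernel(t, tau, delay):
    """Time profile f(t): exponential decay for t > delay, zero otherwise."""
    return math.exp(-(t - delay) / tau) if t > delay else 0.0


def glmcc_rate(t, a_t, j12, j21, tau=0.004, delay=0.001):
    """Assumed GLMCC intensity c(t) at lag t (seconds).

    a_t is the value of the slow component a(t) at this lag; j12 acts at
    positive lags and j21 at negative lags through the mirrored kernel.
    """
    drive = j12 * synaptic_kernel(t, tau, delay) + j21 * synaptic_kernel(-t, tau, delay)
    return math.exp(a_t + drive)


# With a(t) = 0, the rate at zero lag is 1 because both kernels vanish there.
rate_at_zero = glmcc_rate(0.0, 0.0, 1.0, 1.0)
```

Because the coupling enters through the exponent, a positive \(J\) multiplies the baseline rate by a factor larger than one shortly after the delay, and a negative \(J\) by a factor smaller than one.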

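Given an underlying rate \(c(t)\), the probability for spikes to occur at \(\{{t}_{k}\}\,=\,\{{t}_{1},{t}_{2},\cdots \ ,{t}_{N}\}\) is obtained theoretically as^{62},

\[
p(\{{t}_{k}\}|\theta )\propto \left[\prod _{k=1}^{N}c({t}_{k})\right]\exp \left[-\int c(t)\,dt\right], \tag{7}
\]

where \(\theta \,=\,\{{J}_{12},{J}_{21},a(t)\}\) represents the set of parameters that characterize \(c(t)\), and the integral is taken over the time window of the CC.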

To detect short-term synaptic impacts of a few ms hidden in large-scale fluctuations in the CC, we make \(a(t)\) adapt to the slow part of the fluctuations. This may be done by providing a prior distribution that penalizes a large gradient of \(a(t)\):

\[
p(\theta )\propto \exp \left[-\frac{1}{2\gamma }\int {\left(\frac{da(t)}{dt}\right)}^{2}dt\right], \tag{8}
\]

where \(\gamma\) is a hyperparameter representing the flatness of \(a(t)\); \(a(t)\) is nearly constant if \(\gamma\) is small, or is otherwise rapidly fluctuating. We selected the hyperparameter using the ABIC (Akaike Bayesian Information Criterion)^{63} so that \(c(t)\) fits the experimental CCs, and adopted the mean value: \(\gamma \,=\,5\times 10^{-4}\) ms\(^{-1}\). For the connection parameters \({J}_{12}\) and \({J}_{21}\), we have assumed uniform priors.

The posterior distribution of a set of parameters \(\theta \,=\,\{{J}_{12},{J}_{21},a(t)\}\), given the spike data \(\{{t}_{k}\}\), is obtained from Bayes’ rule as

\[
p(\theta |\{{t}_{k}\})=\frac{p(\{{t}_{k}\}|\theta )\,p(\theta )}{p(\{{t}_{k}\})}. \tag{9}
\]

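The parameters are determined with the maximum *a posteriori* (MAP) estimate, that is, by maximizing the posterior distribution or its logarithm:

\[
\log p(\theta |\{{t}_{k}\})=\log p(\{{t}_{k}\}|\theta )+\log p(\theta )+{\rm{const}}. \tag{10}
\]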

The MAP inference for \(\theta \,=\,\{{J}_{12},{J}_{21},a(t)\}\) was performed efficiently using the Levenberg–Marquardt method (Supplementary Note 3).
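The objective that the MAP step maximizes can be sketched on a discretized grid. The example below is an illustration under stated assumptions, not the authors' implementation: \(a(t)\) is taken to be piecewise constant on equal bins, the point-process likelihood is \(\sum _{k}\log c({t}_{k})-\int c(t)\,dt\) with the integral approximated by a midpoint sum, and the smoothness prior contributes \(-\frac{1}{2\gamma }\int {(da/dt)}^{2}dt\). Times are in milliseconds; the 50 ms half-window, 1 ms delay, and all names are hypothetical, and the optimizer itself (Levenberg–Marquardt in the paper) is not reproduced.

```python
import math


def kernel(t, tau, delay):
    """Synaptic time profile f(t): exponential decay after the delay."""
    return math.exp(-(t - delay) / tau) if t > delay else 0.0


def log_posterior(lags_ms, a, j12, j21, half_window=50.0, tau=4.0,
                  delay=1.0, gamma=5e-4):
    """Unnormalized log posterior for a piecewise-constant a(t).

    lags_ms: spike lags in the CC (ms); a: list of bin values of a(t).
    """
    n_bins = len(a)
    width = 2.0 * half_window / n_bins

    def log_rate(t):
        idx = min(int((t + half_window) / width), n_bins - 1)
        return a[idx] + j12 * kernel(t, tau, delay) + j21 * kernel(-t, tau, delay)

    # Log-likelihood: sum of log c(t_k) minus the integral of c(t).
    log_lik = sum(log_rate(t) for t in lags_ms)
    centers = [-half_window + (i + 0.5) * width for i in range(n_bins)]
    log_lik -= sum(math.exp(log_rate(t)) for t in centers) * width

    # Smoothness prior: squared finite-difference gradient of a(t).
    roughness = sum((a[i + 1] - a[i]) ** 2 for i in range(n_bins - 1)) / width
    return log_lik - roughness / (2.0 * gamma)


flat = log_posterior([], [0.0] * 100, 0.0, 0.0)
rough = log_posterior([], [0.0, 1.0] * 50, 0.0, 0.0)
```

With the small value of \(\gamma\) quoted above, a rapidly fluctuating \(a(t)\) is penalized heavily, which is what forces the slow component to absorb only the large-scale structure of the CC.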

#### Statistical test for determining connectivity

We determine the presence of a neuronal connection by disproving the null hypothesis that a connection is absent. Namely, we conclude that a connection is likely present if the estimated parameter is outside the confidence interval for the null hypothesis; otherwise, the presence of a connection is undetermined. The null hypothesis is that two neurons generate spikes at their baseline firing rates independently of each other. According to Poisson statistics, the variance of the number of spikes generated in a time interval \(\Delta\) after the spike of a reference neuron is equal to its mean. The mean spike number is obtained by multiplying the intensity \(c(0)\) by an interval \(\Delta\),

\[
n=c(0)\Delta . \tag{11}
\]

Assuming that the connection \(J\) is small, the average number of spikes caused by a neuronal connection during an interval \(\Delta\) is approximated as

\[
\delta n\approx c(0)\,J\,\tau \left(1-{e}^{-\Delta /\tau }\right). \tag{12}
\]

The condition that the synaptic interaction produces a significant impact on the CC is \(|\delta n|> {z}_{\alpha }\sqrt{n}\), where \({z}_{\alpha }\) is a threshold for the normal distribution (\({z}_{\alpha }=2.58\) for \(\alpha =0.01\) and \({z}_{\alpha }=3.29\) for \(\alpha =0.001\)). In terms of the estimated connection parameter \(\hat{J}\), this condition is given as

\[
|\hat{J}|\, > \,\frac{{z}_{\alpha }}{\sqrt{c(0)}}\,\frac{{\Delta }^{1/2}}{\tau (1-{e}^{-\Delta /\tau })}. \tag{13}
\]

Here, \({\Delta }^{1/2}/(\tau (1-{e}^{-\Delta /\tau }))\) on the right-hand side of this inequality is dependent on \(\Delta\), but it takes the lowest value \(1.57{\tau }^{-1/2}\) at \(\Delta =1.26\tau\). Thus we have the following inequality:

\[
|\hat{J}|\, > \,1.57\,{z}_{\alpha }{(\tau c(0))}^{-1/2}. \tag{14}
\]

The typical duration of spike recording needed for the connectivity inference (inequality (3)) is obtained from Eq. (14) by approximating \(c(0)=T{\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}}\), where \(T\) is the total duration of recording.

Another requirement is that spike trains should contain a sufficiently large number of spikes to make a reliable inference. A typical number of spikes contained in the CC in the interaction time window is \(T{\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}}\tau\). By requiring this to be >10, we obtain the inequality (4).
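
The numbers quoted above can be checked with a short script. The following is a minimal sketch, not the authors' code: it assumes the small-\(J\) approximation \(\delta n\approx c(0)J\tau (1-{e}^{-\Delta /\tau })\), so that the \(\Delta\)-dependent factor is \(\sqrt{x}/(1-{e}^{-x})\) with \(x=\Delta /\tau\), and it takes \(c(0)=T{\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}}\) as in the text. The function names and the value \(\tau =0.004\) s used in the example call are illustrative assumptions.

```python
import math

def window_factor(x):
    """Delta-dependent factor sqrt(x) / (1 - exp(-x)), with x = Delta / tau."""
    return math.sqrt(x) / (1.0 - math.exp(-x))

def best_window(step=1e-4, x_max=5.0):
    """Grid search for the x = Delta / tau that minimizes window_factor."""
    n = int(x_max / step)
    xs = [step * k for k in range(1, n + 1)]
    x_best = min(xs, key=window_factor)
    return x_best, window_factor(x_best)

def min_detectable_J(c0, tau, z_alpha=3.29):
    """Smallest |J| that passes the test at the optimal window."""
    return 1.57 * z_alpha / math.sqrt(tau * c0)

def required_duration(J, rate_pre, rate_post, tau, z_alpha=3.29, min_spikes=10.0):
    """Recording duration T (same time unit as tau) meeting both requirements.

    Uses c(0) = T * rate_pre * rate_post; J must be nonzero.
    """
    pair_rate = rate_pre * rate_post
    t_test = (1.57 * z_alpha / J) ** 2 / (tau * pair_rate)   # significance test
    t_count = min_spikes / (pair_rate * tau)                 # enough spikes in the CC
    return max(t_test, t_count)

x_best, f_best = best_window()
print(f"optimal Delta/tau = {x_best:.2f}, factor = {f_best:.2f}")
# Hypothetical example: J = 1, both neurons firing at 1 Hz, tau = 4 ms.
print(f"required T = {required_duration(1.0, 1.0, 1.0, tau=0.004):.0f} s")
```

The required duration scales as \(1/{J}^{2}\) and inversely with the product of the two firing rates, so weak connections between slowly firing neurons dominate the recording time needed.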

#### Selecting the significance level

Although the confidence interval of the connection parameter \({J}_{ij}\) was obtained above for a given significance level, the probability of assigning spurious connectivity to anatomically disconnected pairs exceeds that nominal level, because spike trains are correlated. Such spurious connections, or false positives (FPs), may be reduced by decreasing the significance level. However, doing so may cause the vast majority of existing connections to be missed, producing a large number of false negatives (FNs). The significance level should therefore be chosen so that these conflicting demands (reducing FPs and reducing FNs) are optimally balanced.

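As we can directly count FPs and FNs in simulation data, we may select a significance level such that the performance of the inference is maximized. As a measure for assessing the performance of connectivity inference, we adopt the Matthews correlation coefficient (MCC)^{40}, defined as

\[MCC\,=\,\frac{{N}_{{\rm{T}}{\rm{P}}}{N}_{{\rm{T}}{\rm{N}}}-{N}_{{\rm{F}}{\rm{P}}}{N}_{{\rm{F}}{\rm{N}}}}{\sqrt{({N}_{{\rm{T}}{\rm{P}}}+{N}_{{\rm{F}}{\rm{P}}})({N}_{{\rm{T}}{\rm{P}}}+{N}_{{\rm{F}}{\rm{N}}})({N}_{{\rm{T}}{\rm{N}}}+{N}_{{\rm{F}}{\rm{P}}})({N}_{{\rm{T}}{\rm{N}}}+{N}_{{\rm{F}}{\rm{N}}})}},\]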

where \({N}_{{\rm{T}}{\rm{P}}}\), \({N}_{{\rm{T}}{\rm{N}}}\), \({N}_{{\rm{F}}{\rm{P}}}\), and \({N}_{{\rm{F}}{\rm{N}}}\) represent the numbers of true positive, true negative, FP, and FN connections, respectively.

Because there are excitatory and inhibitory connections, we obtain a separate coefficient for each category. To evaluate the quality of inference with a single measure, we take the macro-average MCC, which gives equal importance to the two categories^{64}:

\[MCC\,=\,\frac{MC{C}_{E}+MC{C}_{I}}{2}.\]

In computing the coefficient for the excitatory category \(MC{C}_{E}\), we classify connections as excitatory or other (disconnected and inhibitory); for the inhibitory category \(MC{C}_{I}\), we classify connections as inhibitory or other (disconnected and excitatory). Here we evaluate \(MC{C}_{E}\) by considering only excitatory connections of reasonable strength (EPSP \(\,> \,\) 1 mV), because EPSPs are log-normally distributed and there are many weak connections that are hard to detect within several hours of recording.
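
The one-versus-rest counting described above can be sketched as follows. This is an illustrative implementation, not the authors' code: labels are assumed to be coded as `+1` (excitatory), `-1` (inhibitory), and `0` (no connection), the function names are hypothetical, and returning 0 when the denominator vanishes is a common convention adopted here as an assumption. The restriction of \(MC{C}_{E}\) to connections with EPSP > 1 mV is not applied in this sketch.

```python
import math

def mcc(n_tp, n_tn, n_fp, n_fn):
    """Matthews correlation coefficient from the four confusion counts."""
    denom = math.sqrt((n_tp + n_fp) * (n_tp + n_fn) * (n_tn + n_fp) * (n_tn + n_fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column of the table is empty
    return (n_tp * n_tn - n_fp * n_fn) / denom

def one_vs_rest_counts(true_labels, est_labels, target):
    """Count TP, TN, FP, FN for `target` against all other labels."""
    tp = tn = fp = fn = 0
    for t, e in zip(true_labels, est_labels):
        if e == target and t == target:
            tp += 1
        elif e == target:
            fp += 1
        elif t == target:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def macro_mcc(true_labels, est_labels):
    """Return (macro-average MCC, MCC_E, MCC_I)."""
    mcc_e = mcc(*one_vs_rest_counts(true_labels, est_labels, +1))
    mcc_i = mcc(*one_vs_rest_counts(true_labels, est_labels, -1))
    return 0.5 * (mcc_e + mcc_i), mcc_e, mcc_i

# Toy example with eight neuron pairs (hypothetical labels).
true_conn = [+1, +1, 0, 0, -1, -1, 0, 0]
est_conn = [+1, 0, 0, 0, -1, -1, -1, 0]
print(macro_mcc(true_conn, est_conn))
```

Sweeping the significance level, re-estimating the connections, and keeping the level with the largest macro-average value is one way to realize the selection described in this subsection.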

#### Estimating PSPs from GLM connection parameters

We translate the GLM connection parameters \({J}_{ij}\) into biological PSPs \({w}_{ij}\) (in mV). This relation is obtained by numerically simulating a network of neurons interacting through known connections \(\{{w}_{ij}\}\) and by applying the GLM to their spike trains to estimate the connection parameters \(\{{J}_{ij}\}\). For synaptic connections \({w}_{ij}\) for which \({J}_{ij}\) was estimated with the correct sign, we assume a proportional relation as in Eq. (2):

\[{w}_{ij}\,=\,a{J}_{ij}.\]

The coefficient \(a\) is determined by applying regression analysis to the synthetic data. We obtained \(a=0.39\) for EPSPs and \(a=1.57\) for IPSPs.

When we newly estimate connection parameters \({\hat{J}}_{ij}\) from spike trains, they can be translated into PSPs using the relation:

\[{\hat{w}}_{ij}\,=\,a{\hat{J}}_{ij}.\]

Figure 3b compares the estimated PSPs \({\hat{w}}_{ij}\) with the original PSP values \({w}_{ij}\) of a model neural network.
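
As a small illustration of this conversion, the sketch below fits a proportional relation and then applies the reported coefficients. It makes several assumptions that are not stated in the text: the regression is taken to be least squares through the origin, the sign of \(\hat{J}\) is used to choose between the EPSP and IPSP coefficients, and IPSPs are represented as negative values. All function names are hypothetical.

```python
A_EPSP = 0.39  # coefficient reported in the text for EPSPs
A_IPSP = 1.57  # coefficient reported in the text for IPSPs

def fit_slope(j_values, w_values):
    """Least-squares slope a of w = a * J with no intercept (assumed regression form)."""
    num = sum(j * w for j, w in zip(j_values, w_values))
    den = sum(j * j for j in j_values)
    return num / den

def estimate_psp(j_hat):
    """Translate an estimated connection parameter into a PSP estimate.

    Assumption: positive j_hat is excitatory, negative j_hat is inhibitory,
    and the returned IPSP estimate carries a negative sign.
    """
    a = A_EPSP if j_hat >= 0 else A_IPSP
    return a * j_hat

# Hypothetical values, for illustration only.
print(fit_slope([1.0, 2.0, 4.0], [0.39, 0.78, 1.56]))
print(estimate_psp(2.0), estimate_psp(-1.0))
```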

In our numerical simulation, synaptic connectivity is given in terms of conductance. Thus we have to translate conductance into PSP. The translation rule is described in Supplementary Note 4 and Supplementary Fig. 8.

### Details of existing methods

Here we describe the details of the conventional CC method and the jittering method, which were compared with the present GLMCC method in estimating synaptic connectivity.

The CC method estimates the deviation in the cross-correlogram at short time-lags^{16}. The synaptic connection is detected if the spike count is outside the confidence interval for a null hypothesis that two spike trains are independent stationary Poisson processes. The cross-correlogram was constructed by counting the number of spikes in an interval [−50, \(+\)50] ms with a bin size of \(\Delta =1\) ms. The confidence interval is given by \([{\bar{n}}_{{\rm{cc}}}-{z}_{\alpha }\sqrt{{\bar{n}}_{{\rm{cc}}}},{\bar{n}}_{{\rm{cc}}}+{z}_{\alpha }\sqrt{{\bar{n}}_{{\rm{cc}}}}]\), where \({\bar{n}}_{{\rm{cc}}}={\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}}T\Delta\) is the expected number of spikes; \({\lambda }_{{\rm{pre}}}\) and \({\lambda }_{{\rm{post}}}\) are the firing rates of the pre- and postsynaptic neurons, respectively; and \({z}_{\alpha }\) is the threshold for the normal distribution. We have chosen the significance level \(\alpha =0.01\).
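
A minimal sketch of this test is given below, using only the quantities defined in the paragraph above (window of ±50 ms, 1-ms bins, \({\bar{n}}_{{\rm{cc}}}={\lambda }_{{\rm{pre}}}{\lambda }_{{\rm{post}}}T\Delta\), \({z}_{\alpha }=2.58\)). Spike times are assumed to be in seconds, the function names are hypothetical, and which short time-lag bins are inspected is left to the caller because the text does not specify them.

```python
import bisect
import math

def cross_correlogram(pre, post, window=0.050, bin_width=0.001):
    """Histogram of lags (post - pre) within [-window, +window), in seconds."""
    post = sorted(post)
    n_bins = int(round(2 * window / bin_width))
    counts = [0] * n_bins
    for t in pre:
        lo = bisect.bisect_left(post, t - window)
        hi = bisect.bisect_left(post, t + window)
        for s in post[lo:hi]:
            k = int((s - t + window) / bin_width)
            if 0 <= k < n_bins:
                counts[k] += 1
    return counts

def confidence_interval(rate_pre, rate_post, duration, bin_width=0.001, z_alpha=2.58):
    """Interval for the count in one bin under independent stationary Poisson firing."""
    n_bar = rate_pre * rate_post * duration * bin_width
    half_width = z_alpha * math.sqrt(n_bar)
    return n_bar - half_width, n_bar + half_width

def significant_bins(counts, interval):
    """Indices of bins whose count lies outside the confidence interval."""
    lo, hi = interval
    return [k for k, c in enumerate(counts) if c < lo or c > hi]

# Hypothetical example: two 5-Hz neurons recorded for one hour.
interval = confidence_interval(5.0, 5.0, 3600.0)
print(interval)
print(significant_bins([90, 120, 60], interval))
```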

The jittering method was introduced to avoid false detection caused by large fluctuations in the background cross-correlogram^{20,25}. Here we adopted the parameters in the original method. Namely, we generated surrogate data sets by randomly perturbing or jittering the original data in a uniform interval of [−5,\(+\)5] ms to estimate a global band at an acceptance level of 99%. An excitatory or inhibitory monosynaptic connection was identified if the original cross-correlogram at a bin size of 1 ms protruded from the band anywhere in the region [1, 4] ms.
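
The procedure can be sketched as below. This is a rough illustration under stated assumptions rather than a reproduction of the original implementation: both spike trains are jittered independently, the global band is built from the per-surrogate maximum and minimum of the correlogram over the whole ±50 ms window with the 1% rejection probability split equally between the two tails, and the number of surrogates (1000 by default) is a placeholder. Spike times are in seconds and all function names are hypothetical.

```python
import bisect
import math
import random

def cross_correlogram(pre, post, window=0.050, bin_width=0.001):
    """Histogram of lags (post - pre) within [-window, +window), in seconds."""
    post = sorted(post)
    n_bins = int(round(2 * window / bin_width))
    counts = [0] * n_bins
    for t in pre:
        lo = bisect.bisect_left(post, t - window)
        hi = bisect.bisect_left(post, t + window)
        for s in post[lo:hi]:
            k = int((s - t + window) / bin_width)
            if 0 <= k < n_bins:
                counts[k] += 1
    return counts

def jitter(spikes, width, rng):
    """Shift every spike independently by a uniform offset in [-width, +width]."""
    return sorted(t + rng.uniform(-width, width) for t in spikes)

def global_band(pre, post, n_surrogates=1000, acceptance=0.99, width=0.005, rng=None):
    """Lower and upper global band from surrogate extrema (assumed construction)."""
    rng = rng or random.Random()
    maxima, minima = [], []
    for _ in range(n_surrogates):
        counts = cross_correlogram(jitter(pre, width, rng), jitter(post, width, rng))
        maxima.append(max(counts))
        minima.append(min(counts))
    maxima.sort()
    minima.sort()
    tail = (1.0 - acceptance) / 2.0
    upper = maxima[min(n_surrogates - 1, math.ceil((1.0 - tail) * n_surrogates) - 1)]
    lower = minima[math.floor(tail * n_surrogates)]
    return lower, upper

def detect(pre, post, band, window=0.050, bin_width=0.001, lag_range=(0.001, 0.004)):
    """Classify a pair by whether the original correlogram leaves the band at 1-4 ms."""
    lower, upper = band
    counts = cross_correlogram(pre, post, window, bin_width)
    first = int(round((lag_range[0] + window) / bin_width))
    last = int(round((lag_range[1] + window) / bin_width))
    segment = counts[first:last]
    if any(c > upper for c in segment):
        return "excitatory"
    if any(c < lower for c in segment):
        return "inhibitory"
    return "undetermined"
```

A call such as `detect(pre, post, global_band(pre, post))` then returns one of the three labels for a given pair of spike trains.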

### A network of HH-type neurons

We ran a numerical simulation of a network of 1000 HH-type neurons interacting through fixed synapses. Of these, 800 excitatory neurons each innervate 12.5% of the other neurons with EPSPs that are log-normally distributed^{34,35,37}, whereas 200 inhibitory neurons each innervate a randomly chosen 25% of the other neurons with IPSPs that are normally distributed. Simulated spike trains and the connectivity matrix (EPSPs and IPSPs) are available on figshare^{65}.

#### Neuron models

For excitatory pyramidal cells, we adopted the HH-type model developed by Destexhe and Paré^{66}. The membrane potential \(V\) obeys the equation:

where \({C}_{{\rm{m}}}^{{\rm{pyr}}}\) is the membrane capacitance, \({I}_{{\rm{L}}}={g}_{{\rm{L}}}^{{\rm{pyr}}}(V-{E}_{{\rm{L}}}^{{\rm{pyr}}})\) is the leak current, \({I}_{{\rm{Na}}}\,=\,{g}_{{\rm{Na}}}^{{\rm{pyr}}}{m}^{3}h(V\,-\,{E}_{{\rm{Na}}}^{{\rm{pyr}}})\) is the Na\({}^{+}\) current, \({I}_{{\rm{K}}}\,=\,{g}_{{\rm{K}}}^{{\rm{pyr}}}{n}^{4}(V\,-\,{E}_{{\rm{K}}}^{{\rm{pyr}}})\) is the delayed-rectifier K\({}^{+}\) current, \({I}_{{\rm{M}}}\,=\,{g}_{{\rm{M}}}^{{\rm{pyr}}}p(V\,-\,{E}_{{\rm{K}}}^{{\rm{pyr}}})\) is the muscarinic potassium current, and \({I}_{{\rm{tot}}}\) is the total input current from the other neurons. The gating variables \(x\in \{m,h,n,p\}\) are described by the kinetic equation:

\[\frac{dx}{dt}\,=\,{\alpha }_{x}(V)(1-x)-{\beta }_{x}(V)x,\tag{18}\]

where \({\alpha }_{x}\) and \({\beta }_{x}\) are the activation and inactivation functions, respectively. The activation and inactivation functions and the parameter values are summarized in Table 2.
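
To make the gating dynamics concrete, the sketch below advances a single gating variable, assuming the standard Hodgkin–Huxley form \(dx/dt={\alpha }_{x}(1-x)-{\beta }_{x}x\). It uses the exponential Euler update, which is exact when the rates are held constant over a step; the integration scheme actually used in the simulations is not stated in the text, and the rate values below are placeholders rather than entries of Table 2.

```python
import math

def gate_step(x, alpha, beta, dt):
    """Advance a gating variable by dt with alpha and beta held fixed over the step."""
    x_inf = alpha / (alpha + beta)   # steady-state value
    tau_x = 1.0 / (alpha + beta)     # relaxation time constant
    return x_inf + (x - x_inf) * math.exp(-dt / tau_x)

# Placeholder rates (per ms) at some fixed membrane potential.
alpha, beta = 0.5, 1.5
x = 0.0
for _ in range(1000):          # 1000 steps of 0.01 ms
    x = gate_step(x, alpha, beta, dt=0.01)
print(x, alpha / (alpha + beta))
```

In a full simulation the rates would be re-evaluated from the current membrane potential before every step.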

For inhibitory interneurons, we adopted the HH-type models developed by Erisir et al.^{67}. The membrane potential \(V\) obeys the equation:

where \({C}_{{\rm{m}}}^{{\rm{inh}}}\) is the membrane capacitance, \({I}_{{\rm{L}}}\,=\,{g}_{{\rm{L}}}^{{\rm{inh}}}(V\,-\,{E}_{{\rm{L}}}^{{\rm{inh}}})\) is the leak current, \({I}_{{\rm{Na}}}\,=\,{g}_{{\rm{Na}}}^{{\rm{inh}}}{m}^{3}h(V\,-\,{E}_{{\rm{Na}}}^{{\rm{inh}}})\) is the Na\({}^{+}\) current, \({I}_{{{\rm{K}}}_{1}}\,=\,{g}_{{{\rm{K}}}_{1}}^{{\rm{inh}}}{n}_{1}^{4}(V\,-\,{E}_{{\rm{K}}}^{{\rm{inh}}})\) and \({I}_{{{\rm{K}}}_{2}}={g}_{{{\rm{K}}}_{2}}^{{\rm{inh}}}{n}_{2}^{2}(V-{E}_{{\rm{K}}}^{{\rm{inh}}})\) are the delayed-rectifier K\({}^{+}\) currents due to the Kv1.3 and Kv3.1–Kv3.2 conductances, respectively, and \({I}_{{\rm{tot}}}\) is the total input current. The gating variables \(x\in \{m,h,{n}_{1},{n}_{2}\}\) follow the kinetic equation (18), with the activation and inactivation functions given in the original paper^{67}. The parameter values are summarized in Table 2.

#### Synaptic connections

Each neuron receives synaptic currents induced by the firing of other neurons. Excitatory synaptic currents are mediated by 2-amino-3-(5-methyl-3-oxo-1,2-oxazol-4-yl) propanoic acid (AMPA) and N-methyl-D-aspartate (NMDA) receptors, whereas inhibitory synaptic currents are mediated by \(\gamma\)-aminobutyric acid (GABA)-A receptors. The total input current to the \(i\)th neuron is given by

where \({I}_{{\rm{AMPA}}}^{ij}\), \({I}_{{\rm{NMDA}}}^{ij}\), and \({I}_{{\rm{GABA}}}^{ij}\), respectively represent the synaptic currents given by the AMPA, NMDA, and GABA receptors, and \({I}_{{\rm{bg}}}\) represents the background current.

For AMPA-mediated current, we adopted the depressing synapse model proposed by Tsodyks and Markram^{44}

where \({g}_{{\rm{AMPA}}}^{ij}\) is the maximal AMPA conductance, \({V}_{i}\) is the membrane potential of the postsynaptic neuron, \({t}_{k}^{j}\) is the \(k\)th spike time of the presynaptic neuron, and \({d}_{{\rm{AMPA}}}\) is the synaptic conduction delay. For each connection, the conduction delay is drawn from a uniform distribution between 0 and 2 ms. \({w}_{j}\) and \({r}_{j}\) represent the fraction of synaptic resources in the effective and recovered states, respectively. The AMPA parameter values are summarized in Table 3.

For NMDA-mediated current, we adopted the first-order kinetic equation proposed by Destexhe et al.^{68}

where [Mg\({}^{2+}\)] = 1.0 mM is the extracellular magnesium concentration, \({t}_{{\rm{pre}}}\) is the last spike time of the presynaptic neuron, \({d}_{{\rm{NMDA}}}\) is the conduction delay drawn from a uniform distribution between 0 and 2 ms, and \(T(t)\) represents the transmitter concentration in the cleft. When a spike occurs in a presynaptic neuron, a transmitter pulse is induced such that \(T(t)\,=\,1\) mM for a short period (1 ms) and the concentration returns to \(T(t)\,=\,0\). The NMDA parameter values are summarized in Table 3.

For GABA-A-mediated current, we adopted the depressing synapse model proposed by Tsodyks and Markram^{44}

where \({d}_{{\rm{GABA}}}\) is the conduction delay drawn from a uniform distribution between 1 and 3 ms. The GABA parameter values are summarized in Table 3.

We ran a simulation of a network consisting of 800 pyramidal neurons and 200 interneurons interconnected with a fixed strength. Each neuron receives 100 excitatory inputs randomly selected from 800 pyramidal neurons and 50 inhibitory inputs selected from 200 interneurons.

The AMPA conductance (\({g}_{{\rm{AMPA}}}^{ij}\)) is drawn independently from a log-normal distribution^{34,35}

\[P(x)\,=\,\frac{1}{\sqrt{2\pi }\sigma x}\exp \left(-\frac{{(\log x-\mu )}^{2}}{2{\sigma }^{2}}\right),\]

where \(\mu \,=\,-3.37\) and \(\sigma \,=\,1.3\) are the mean and SD of the natural logarithm of the AMPA conductance. The NMDA and GABA conductances (\({g}_{{\rm{NMDA}}}^{ij}\) and \({g}_{{\rm{GABA}}}^{ij}\)) are sampled from the normal distribution

\[P(x)\,=\,\frac{1}{\sqrt{2\pi }\sigma }\exp \left(-\frac{{(x-\mu )}^{2}}{2{\sigma }^{2}}\right),\]

where \(\mu\) and \(\sigma\) are the mean and SD of the conductances. Parameters are \({\mu }_{{\rm{NMDA}}}\,=\,8.5\times 1{0}^{-4}\ {\rm{mS}}\ {{\rm{cm}}}^{-2}\), \({\sigma }_{{\rm{NMDA}}}\,=\,8.5\times 1{0}^{-5}\ {\rm{mS}}\ {{\rm{cm}}}^{-2}\) and \({\mu }_{{\rm{GABA}}}\,=\,0.34\ {\rm{mS}}\ {{\rm{cm}}}^{-2}\), \({\sigma }_{{\rm{GABA}}}\,=\,0.27\ {\rm{mS}}\ {{\rm{cm}}}^{-2}\) for the NMDA and GABA conductance, respectively. If the sampled value is less than zero, the conductance is resampled from the same distribution.
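
The wiring and sampling rules above can be sketched for a single postsynaptic neuron as follows. This is an illustration, not the simulation code: the index layout (0–799 pyramidal, 800–999 interneurons), the exclusion of self-connections, and the choice to attach both an AMPA and an NMDA conductance to every excitatory connection are assumptions, and the function names are hypothetical. The distribution parameters are those quoted in the text.

```python
import random

N_EXC, N_INH = 800, 200            # pyramidal neurons and interneurons
K_EXC, K_INH = 100, 50             # excitatory and inhibitory inputs per neuron
MU_AMPA, SIGMA_AMPA = -3.37, 1.3   # mean and SD of the log AMPA conductance
MU_NMDA, SIGMA_NMDA = 8.5e-4, 8.5e-5   # mS cm^-2
MU_GABA, SIGMA_GABA = 0.34, 0.27       # mS cm^-2

def sample_nonnegative_normal(mu, sigma, rng):
    """Draw from a normal distribution, resampling whenever the value is negative."""
    while True:
        g = rng.gauss(mu, sigma)
        if g >= 0.0:
            return g

def draw_inputs(post, rng):
    """Pick presynaptic partners for neuron `post` and sample their conductances."""
    exc_pool = [j for j in range(N_EXC) if j != post]
    inh_pool = [j for j in range(N_EXC, N_EXC + N_INH) if j != post]
    inputs = {}
    for j in rng.sample(exc_pool, K_EXC):
        inputs[j] = {
            "AMPA": rng.lognormvariate(MU_AMPA, SIGMA_AMPA),
            "NMDA": sample_nonnegative_normal(MU_NMDA, SIGMA_NMDA, rng),
        }
    for j in rng.sample(inh_pool, K_INH):
        inputs[j] = {"GABA": sample_nonnegative_normal(MU_GABA, SIGMA_GABA, rng)}
    return inputs

rng = random.Random(1)
inputs = draw_inputs(post=0, rng=rng)
print(len(inputs), "inputs drawn for neuron 0")
```

Because \({\sigma }_{{\rm{GABA}}}\) is comparable to \({\mu }_{{\rm{GABA}}}\), the resampling step is triggered fairly often for GABA conductances and the resulting distribution is a truncated normal.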

Because our model network is smaller than real cortical networks, in which each neuron receives inputs from on the order of 1000 neurons, we added a background current to represent inputs from many neurons, as previously done by Destexhe et al.^{69}. The background current is given as the sum of excitatory and inhibitory inputs:

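where the total excitatory and inhibitory conductances \({g}_{{\rm{e,i}}}(t)\) obey the Ornstein–Uhlenbeck process^{70}, representing random bombardment by a large number of neurons:

\[\frac{d{g}_{x}}{dt}\,=\,-\frac{{g}_{x}-{g}_{x,0}}{{\tau }_{x}}+\sqrt{\frac{2{\sigma }_{x}^{2}}{{\tau }_{x}}}\,\xi (t),\]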

where \(x\) represents excitatory (e) or inhibitory (i), \({g}_{x,0}\) and \({\sigma }_{x}\) are the asymptotic mean and SD of the conductance, \({\tau }_{x}\) is the synaptic time constant, and \(\xi (t)\) is the Gaussian white noise with zero mean and unit variance. Parameters for the background inputs are summarized in Table 3.
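
A conductance trace of this kind can be generated with the exact one-step update of the Ornstein–Uhlenbeck process, as sketched below. The sketch assumes the form \(d{g}_{x}/dt=-({g}_{x}-{g}_{x,0})/{\tau }_{x}+\sqrt{2{\sigma }_{x}^{2}/{\tau }_{x}}\,\xi (t)\), which has stationary mean \({g}_{x,0}\) and SD \({\sigma }_{x}\). The numerical values are placeholders, not the entries of Table 3, and the function name is hypothetical.

```python
import math
import random

def simulate_ou(g0, sigma, tau, dt, n_steps, rng):
    """Sample an Ornstein-Uhlenbeck trace with stationary mean g0 and SD sigma."""
    decay = math.exp(-dt / tau)
    noise_sd = sigma * math.sqrt(1.0 - decay * decay)
    g = g0
    trace = []
    for _ in range(n_steps):
        # Exact update over one step of length dt
        g = g0 + (g - g0) * decay + noise_sd * rng.gauss(0.0, 1.0)
        trace.append(g)
    return trace

# Placeholder parameters: time in ms, conductance in arbitrary units.
rng = random.Random(0)
trace = simulate_ou(g0=0.012, sigma=0.003, tau=2.7, dt=0.1, n_steps=200000, rng=rng)
mean = sum(trace) / len(trace)
sd = math.sqrt(sum((g - mean) ** 2 for g in trace) / len(trace))
print(mean, sd)
```

The process is Gaussian, so individual samples can become negative when \({\sigma }_{x}\) is not small compared with \({g}_{x,0}\); this sketch does not clip them.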

Simulation codes were written in C++ and parallelized with the OpenMP framework. Simulations were conducted on a computer with Intel Xeon E5-2650v2 processors. The time step was 0.01 ms for excitatory (pyramidal) neurons and 0.001 ms for inhibitory interneurons. The neural activity was simulated for up to 10,000 s.

### Experimental data

Spike trains were recorded from the hippocampal area of a rat while it was exploring an open square field. Experimental procedures, data collection, and spike sorting are described in detail in Mizuseki et al.^{51}. All protocols were approved by the Institutional Animal Care and Use Committees of Rutgers University and New York University. Hippocampal principal cells and interneurons were separated on the basis of their waveforms, autocorrelograms, and mean firing rates^{49,50,51}.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

## Data availability

The source data underlying Figs. 2–7 are provided as a Source Data file. Simulated data generated by a network of 1000 Hodgkin–Huxley neurons have been deposited in figshare^{65} (https://doi.org/10.6084/m9.figshare.9637904). All experimental data used in this paper can be found in the hc-3 data sets at CRCNS^{48} (CRCNS.org. https://doi.org/10.6080/K09G5JRZ).

## Code availability

A ready-to-use version of the web application, the source code, and example data sets are available at our website, http://www.ton.scphys.kyoto-u.ac.jp/%7Eshino/GLMCC, and are also hosted publicly on GitHub at https://github.com/NII-Kobayashi. Simulation codes for the large network of LIF neurons are available on ModelDB (https://senselab.med.yale.edu/modeldb/ShowModel.cshtml?model=258807). Simulation codes for the network of HH neurons are available upon request from the corresponding author.

## References

1. Buzsáki, G. Large-scale recording of neuronal ensembles. *Nat. Neurosci.* **7**, 446 (2004).
2. Jun, J. J. et al. Fully integrated silicon probes for high-density recording of neural activity. *Nature* **551**, 232 (2017).
3. Mitz, A. R. et al. High channel count single-unit recordings from nonhuman primate frontal cortex. *J. Neurosci. Methods* **289**, 39–47 (2017).
4. Pachitariu, M. et al. Suite2p: beyond 10,000 neurons with standard two-photon microscopy. Preprint at https://www.biorxiv.org/content/10.1101/061507v2.abstract (2017).
5. Stringer, C., Pachitariu, M., Steinmetz, N., Carandini, M. & Harris, K. D. High-dimensional geometry of population responses in visual cortex. *Nature* **571**, 361–365 (2019).
6. Brown, E. N., Kass, R. E. & Mitra, P. P. Multiple neural spike train data analysis: state-of-the-art and future challenges. *Nat. Neurosci.* **7**, 456 (2004).
7. Hatsopoulos, N., Joshi, J. & O’Leary, J. G. Decoding continuous and discrete motor behaviors using motor and premotor cortical ensembles. *J. Neurophysiol.* **92**, 1165–1174 (2004).
8. Pillow, J. W. et al. Spatio-temporal correlations and visual signalling in a complete neuronal population. *Nature* **454**, 995 (2008).
9. Ohiorhenuan, I. E. et al. Sparse coding and high-order correlations in fine-scale cortical networks. *Nature* **466**, 617 (2010).
10. Stevenson, I. H. & Kording, K. P. How advances in neural recording affect data analysis. *Nat. Neurosci.* **14**, 139 (2011).
11. Churchland, M. M. et al. Neural population dynamics during reaching. *Nature* **487**, 51 (2012).
12. Cunningham, J. P. & Byron, M. Y. Dimensionality reduction for large-scale neural recordings. *Nat. Neurosci.* **17**, 1500 (2014).
13. Kobak, D. et al. Demixed principal component analysis of neural population data. *Elife* **5**, e10989 (2016).
14. Perkel, D. H., Gerstein, G. L. & Moore, G. P. Neuronal spike trains and stochastic point processes: II. simultaneous spike trains. *Biophys. J.* **7**, 419–440 (1967).
15. Toyama, K., Kimura, M. & Tanaka, K. Organization of cat visual cortex as investigated by cross-correlation technique. *J. Neurophysiol.* **46**, 202–214 (1981).
16. Aertsen, A. M. & Gerstein, G. L. Evaluation of neuronal connectivity: sensitivity of cross-correlation. *Brain Res.* **340**, 341–354 (1985).
17. Reid, R. C. & Alonso, J.-M. Specificity of monosynaptic connections from thalamus to visual cortex. *Nature* **378**, 281 (1995).
18. Sakurai, Y. Hippocampal and neocortical cell assemblies encode memory processes for different types of stimuli in the rat. *J. Neurosci.* **16**, 2809–2819 (1996).
19. Okatan, M., Wilson, M. A. & Brown, E. N. Analyzing functional connectivity using a network likelihood model of ensemble neural spiking activity. *Neural Comput.* **17**, 1927–1961 (2005).
20. Fujisawa, S., Amarasingham, A., Harrison, M. T. & Buzsáki, G. Behavior-dependent short-term assembly dynamics in the medial prefrontal cortex. *Nat. Neurosci.* **11**, 823 (2008).
21. Grun, S. Data-driven significance estimation for precise spike correlation. *J. Neurophysiol.* **101**, 1126–1140 (2009).
22. Stevenson, I. H. et al. Bayesian inference of functional connectivity and network structure from spikes. *IEEE Trans. Neural Syst. Rehabil. Eng.* **17**, 203–213 (2009).
23. Chen, Z., Putrino, D. F., Ghosh, S., Barbieri, R. & Brown, E. N. Statistical inference for assessing functional connectivity of neuronal ensembles with sparse spiking data. *IEEE Trans. Neural Syst. Rehabil. Eng.* **19**, 121–135 (2011).
24. Ito, S. et al. Extending transfer entropy improves identification of effective connectivity in a spiking cortical network model. *PLoS One* **6**, e27431 (2011).
25. Amarasingham, A., Harrison, M. T., Hatsopoulos, N. G. & Geman, S. Conditional modeling and the jitter method of spike resampling. *J. Neurophysiol.* **107**, 517–531 (2012).
26. Stetter, O., Battaglia, D., Soriano, J. & Geisel, T. Model-free reconstruction of excitatory neuronal connectivity from calcium imaging signals. *PLoS Comput. Biol.* **8**, e1002653 (2012).
27. Kobayashi, R. & Kitano, K. Impact of network topology on inference of synaptic connectivity from multi-neuronal spike data simulated by a large-scale cortical network model. *J. Comput. Neurosci.* **35**, 109–124 (2013).
28. Schwindel, C. D., Ali, K., McNaughton, B. L. & Tatsuno, M. Long-term recordings improve the detection of weak excitatory-excitatory connections in rat prefrontal cortex. *J. Neurosci.* **34**, 5454–5467 (2014).
29. Zaytsev, Y. V., Morrison, A. & Deger, M. Reconstruction of recurrent synaptic connectivity of thousands of neurons from simulated spiking activity. *J. Comput. Neurosci.* **39**, 77–103 (2015).
30. Cai, Z., Neveu, C. L., Baxter, D. A., Byrne, J. H. & Aazhang, B. Inferring neuronal network functional connectivity with directed information. *J. Neurophysiol.* **118**, 1055–1069 (2017).
31. Brody, C. D. Correlations without synchrony. *Neural Comput.* **11**, 1537–1551 (1999).
32. Gerstein, G. L., Bedenbaugh, P. & Aertsen, A. M. Neuronal assemblies. *IEEE Trans. Biomed. Eng.* **36**, 4–14 (1989).
33. Stevenson, I. H., Rebesco, J. M., Miller, L. E. & Körding, K. P. Inferring functional connections between neurons. *Curr. Opin. Neurobiol.* **18**, 582–588 (2008).
34. Song, S., Sjöström, P. J., Reigl, M., Nelson, S. & Chklovskii, D. B. Highly nonrandom features of synaptic connectivity in local cortical circuits. *PLoS Biol.* **3**, e68 (2005).
35. Teramae, J.-N., Tsubo, Y. & Fukai, T. Optimal spike-based communication in excitable networks with strong-sparse and weak-dense links. *Sci. Rep.* **2**, 485 (2012).
36. Ikegaya, Y. et al. Interpyramid spike transmission stabilizes the sparseness of recurrent network activity. *Cereb. Cortex* **23**, 293–304 (2013).
37. Buzsáki, G. & Mizuseki, K. The log-dynamic brain: how skewed distributions affect network operations. *Nat. Rev. Neurosci.* **15**, 264 (2014).
38. Hoffmann, J. H. et al. Synaptic conductance estimates of the connection between local inhibitor interneurons and pyramidal neurons in layer 2/3 of a cortical column. *Cereb. Cortex* **25**, 4415–4429 (2015).
39. Potjans, T. C. & Diesmann, M. The cell-type specific cortical microcircuit: relating structure and activity in a full-scale spiking network model. *Cereb. Cortex* **24**, 785–806 (2014).
40. Matthews, B. W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. *Biochim. Biophys. Acta* **405**, 442–451 (1975).
41. Fetz, E. E. & Gustafsson, B. Relation between shapes of post-synaptic potentials and changes in firing probability of cat motoneurones. *J. Physiol.* **341**, 387–410 (1983).
42. Volgushev, M., Ilin, V. & Stevenson, I. H. Identifying and tracking simulated synaptic inputs from neuronal firing: insights from in vitro experiments. *PLoS Comput. Biol.* **11**, e1004167 (2015).
43. Melssen, W. & Epping, W. Detection and estimation of neural connectivity based on crosscorrelation analysis. *Biol. Cybern.* **57**, 403–414 (1987).
44. Tsodyks, M. V. & Markram, H. The neural code between neocortical pyramidal neurons depends on neurotransmitter release probability. *Proc. Natl Acad. Sci. USA* **94**, 719–723 (1997).
45. Gupta, A., Wang, Y. & Markram, H. Organizing principles for a diversity of GABAergic interneurons and synapses in the neocortex. *Science* **287**, 273–278 (2000).
46. Shinomoto, S., Shima, K. & Tanji, J. Differences in spiking patterns among cortical neurons. *Neural Comput.* **15**, 2823–2842 (2003).
47. Mochizuki, Y. et al. Similarity in neuronal firing regimes across mammalian species. *J. Neurosci.* **36**, 5736–5747 (2016).
48. Mizuseki, K., Sirota, A., Pastalkova, E., Diba, K. & Buzsáki, G. Multiple single unit recordings from different rat hippocampal and entorhinal regions while the animals were performing multiple behavioral tasks. (CRCNS Org, 2013).
49. Skaggs, W. E., McNaughton, B. L., Wilson, M. A. & Barnes, C. A. Theta phase precession in hippocampal neuronal populations and the compression of temporal sequences. *Hippocampus* **6**, 149–172 (1996).
50. Csicsvari, J., Hirase, H., Czurko, A. & Buzsáki, G. Reliability and state dependence of pyramidal cell-interneuron synapses in the hippocampus: an ensemble approach in the behaving rat. *Neuron* **21**, 179–189 (1998).
51. Mizuseki, K., Sirota, A., Pastalkova, E. & Buzsáki, G. Theta oscillations provide temporal windows for local circuit computation in the entorhinal-hippocampal loop. *Neuron* **64**, 267–280 (2009).
52. Freund, T. F. & Buzsáki, G. Interneurons of the hippocampus. *Hippocampus* **6**, 347–470 (1996).
53. Deuchars, J. & Thomson, A. CA1 pyramid-pyramid connections in rat hippocampus in vitro: dual intracellular recordings with biocytin filling. *Neuroscience* **74**, 1009–1018 (1996).
54. Pillow, J. W., Shlens, J., Chichilnisky, E. & Simoncelli, E. P. A model-based spike sorting algorithm for removing correlation artifacts in multi-neuron recordings. *PLoS One* **8**, e62123 (2013).
55. Gewaltig, M.-O. & Diesmann, M. NEST (neural simulation tool). *Scholarpedia* **2**, 1430 (2007).
56. Koyama, S., Castellanos Pérez-Bolde, L., Shalizi, C. R. & Kass, R. E. Approximate methods for state-space models. *J. Amer. Stat. Assoc.* **105**, 170–180 (2010).
57. Chen, Z. & Brown, E. N. State space model. *Scholarpedia* **8**, 30868 (2013).
58. Zhou, B., Moorman, D. E., Behseta, S., Ombao, H. & Shahbaba, B. A dynamic Bayesian model for characterizing cross-neuronal interactions during decision-making. *J. Amer. Stat. Assoc.* **111**, 459–471 (2016).
59. Marshall, L. et al. Hippocampal pyramidal cell-interneuron spike transmission is frequency dependent and responsible for place modulation of interneuron discharge. *J. Neurosci.* **22**, RC197 (2002).
60. English, D. F. et al. Pyramidal cell-interneuron circuit architecture and dynamics in hippocampal networks. *Neuron* **96**, 505–520 (2017).
61. Barthó, P. et al. Characterization of neocortical principal cells and interneurons by network interactions and extracellular features. *J. Neurophysiol.* **92**, 600–608 (2004).
62. Daley, D. J. & Vere-Jones, D. *An introduction to the theory of point processes* (Springer-Verlag, New York, 2003).
63. Akaike, H. Likelihood and the Bayes procedure. in *Selected papers of Hirotugu Akaike* 309–332 (Springer, 1998).
64. Sun, A. & Lim, E.-P. Hierarchical text classification and evaluation. in *Proceedings of ICDM 2001* 521–538 (IEEE, 2001).
65. Kobayashi, R. et al. Synthetic spike data generated by a network of 1000 Hodgkin–Huxley type neurons. Figshare https://doi.org/10.6084/m9.figshare.9637904 (2019).
66. Destexhe, A. & Paré, D. Impact of network activity on the integrative properties of neocortical pyramidal neurons in vivo. *J. Neurophysiol.* **81**, 1531–1547 (1999).
67. Erisir, A., Lau, D., Rudy, B. & Leonard, C. Function of specific K\({}^{+}\) channels in sustained high-frequency firing of fast-spiking neocortical interneurons. *J. Neurophysiol.* **82**, 2476–2489 (1999).
68. Destexhe, A., Mainen, Z. F. & Sejnowski, T. J. Kinetic models of synaptic transmission. *Methods Neuronal Model.* **2**, 1–25 (1998).
69. Destexhe, A., Rudolph, M., Fellous, J.-M. & Sejnowski, T. J. Fluctuating synaptic conductances recreate in vivo-like activity in neocortical neurons. *Neuroscience* **107**, 13–24 (2001).
70. Tuckwell, H. C. *Introduction to theoretical neurobiology: nonlinear and stochastic theories* Vol. 2 (Cambridge University Press, Cambridge, 1988).

## Acknowledgements

We thank Yuzuru Yamanaka, Tatsuya Goto, Kazuki Fujita, Daisuke Endo, and Masahiro Naito for their constructive comments on this manuscript. Furthermore, this paper was greatly improved by the comments of anonymous reviewers. R.K. is supported by JSPS KAKENHI grant numbers JP17H03279, JP18K11560, and JP19H01133, JST ACT-I Grant Number JPMJPR16UC, the Okawa Foundation for Information and Telecommunications, and the Open Collaborative Research and MOU grant at the National Institute of Informatics in Japan. S.K. is supported by JST ACT-I Grant Number JPMJPR17U8. A.K. and M.D. received funding from the European Union’s Horizon 2020 Framework Programme for Research and Innovation under Specific Grant Agreement No. 785907 (Human Brain Project SGA2). K.M. is supported by JSPS KAKENHI grant numbers JP16H04656 and JP17K19462. B.J.R. is supported by US NIMH Intramural Program with report number ZIAMH002619-27. S.S. is supported by JSPS KAKENHI Grant numbers JP26280007 and JP17H06028, and the New Energy and Industrial Technology Development Organization (NEDO).

## Author information

### Contributions

S.S. conceived the project. R.K. and S.S. developed methodology for reconstructing neuronal connectivity. S.K. and K.K. performed the network simulation of HH neurons. K.M. performed the experiment. A.K. and M.D. performed the large-scale simulation of LIF neurons. S.S. and B.J.R. wrote the manuscript based on input from R.K. All authors commented on the manuscript. S.S. supervised the project.

### Corresponding author

Correspondence to Shigeru Shinomoto.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Additional information

**Peer review information** *Nature Communications* thanks Zhe Chen, Marius Pachitariu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
