Spike sorting for large, dense electrode arrays

Journal name:
Nature Neuroscience
Year published:
Published online


Developments in microfabrication technology have enabled the production of neural electrode arrays with hundreds of closely spaced recording sites, and electrodes with thousands of sites are under development. These probes in principle allow the simultaneous recording of very large numbers of neurons. However, use of this technology requires the development of techniques for decoding the spike times of the recorded neurons from the raw data captured from the probes. Here we present a set of tools to solve this problem, implemented in a suite of practical, user-friendly, open-source software. We validate these methods on data from the cortex, hippocampus and thalamus of rat, mouse, macaque and marmoset, demonstrating error rates as low as 5%.

At a glance


  1. Figure 1: High-count silicon probe recording.

    (a) Layout of the 32-site electrode array used to collect test data. (b) Short segment of data recorded in rat neocortex with this array. Color of traces indicates recording from the correspondingly colored site in a. Black rectangles highlight action potential waveforms; note the frequent occurrence of temporally overlapping spikes on separate recording channels.

  2. Figure 2: Local spike-detection algorithm.

    (a) Adjacency graph for the 32-channel probe. (b) Segment of raw data showing two simultaneous action potentials on spatially separated channels. Scale bars indicate 0.5 mV and 10 samples. (c) High-pass filtered data shown in pseudocolor format (units of s.d.). Vertical lines on the color bar indicate strong and weak thresholds, θs and θw (respectively 4 and 2 times s.d.). (d) Grayscale representation showing samples that cross the weak threshold (gray) and the strong threshold (white). (e) Results of two-threshold flood fill algorithm, showing connected components corresponding to the two spikes in orange and brown. Isolated weak threshold crossings resulting from noise are removed. White lines indicate alignment times of the two spikes. (f) Pseudocolor representation of feature vectors for the two detected spikes (top and bottom). Each set of three dots represents three principal components computed for the corresponding channel (arbitrary units). Note the similarity of the feature vectors for these two simultaneous spikes (top and bottom). (g) Mask vectors obtained for the two detected spikes (top and bottom; 0 represents completely masked, 1 completely unmasked). Unlike the feature vectors, the mask vectors for the two spikes differ. Each set of three dots represents the three identical components of the mask vector for the corresponding channel.

  3. Figure 3: Evaluation of spike detection performance.

    (a) Waveforms of the ten donor cells used to test spike detection performance, in order of increasing peak amplitude (left to right). (b) Fraction of correctly detected spikes as a function of strong threshold θs (left), weak threshold θw (center) and power parameter p (right). Colored lines indicate performance for the correspondingly colored donor cell waveform shown in a; black line indicates mean over all donor cells. (c–e) Dependence of the total number of detected events, timing jitter and mask accuracy on the same three parameters.

  4. Figure 4: Evaluation of automatic clustering performance.

    (a) ROC curve showing the performance of the masked EM algorithm (blue) and classical EM algorithm (red) on one of the ten hybrid data sets; each dot represents performance for a different value of the penalty parameter. The cyan curve shows a theoretical upper bound for performance, the BEER measure obtained by cross-validated supervised learning. (b) Mean and s.e.m. of the total error (false discovery plus false positive) over all ten hybrid data sets for theoretical optimum (BEER measure), masked EM and classical EM algorithms. For each data set and measure, the parameter setting leading to best performance was used. (c) Effect of varying the penalty parameter, as a multiple of the Akaike information criterion (AIC), on the total error for both algorithms. The dotted line indicates the parameter value corresponding to the Bayes information criterion. Note that the masked EM algorithm performed well for all penalty values. (d) The number of clusters returned by the masked EM algorithm as a function of the penalty parameter.

  5. Figure 5: The wizard for computer-guided manual correction.

    (a) Illustration of the measure used to quantify cluster similarity: pij represents the posterior probability with which the EM algorithm would assign the mean of cluster i to cluster j. (b) To test this measure, the clusters corresponding to hybrid spikes were artificially cut into halves of high and low amplitude. In each case, the similarity measure identified the second half as the closest merge candidate. (c) The wizard identifies the best unsorted cluster as the one with highest quality (top) and finds the closest match to it using the similarity matrix. (d) The wizard algorithm. The best unsorted cluster and closest match are identified. The operator can choose to merge the closest match into the best unsorted, ignore the closest match or delete it by marking it as multiunit activity or noise; the wizard then presents the next closest match to the operator (blue arrows). After a sufficient number of matches have been presented, the operator can decide that no further potential matches could have come from the same neuron and either accept the best unsorted cluster as a well-isolated neuron or delete it as multiunit activity or noise. The wizard then finds the next best unsorted cluster to present to the operator (orange arrows).

  6. Figure 6: Screenshot of the KlustaViewa graphical user interface.

    In making the decisions presented by the wizard, the operator has access to information including waveforms (center panel; gray waveforms correspond to masked channels), principal component features (top right), auto- and cross-correlograms (bottom right) and an automatically computed similarity metric for each pair of clusters (inset). To enable rapid navigation, all views are integrated; for example, clicking on a particular channel in the waveform view will update other views to show the selected channels or clusters.

  7. Figure 7: Consistency of manual curation across operators.

    (a) Performance of eight human operators (five experts, three novices) on a drifty hybrid cell requiring manual curation (see Supplementary Figure 13b). A tick indicates correct merging of the split hybrid cell; a cross indicates this merge was not performed. (b–d) Consistency of assignments of multiple operators over all cells in this data set. Each submatrix shows the conditional probability of the first operator's cluster assignments given the assignments of the second operator (color scale at bottom of d). (b) Consistency of cluster assignments for spikes marked as well-isolated by all operators. (c) Consistency of cluster assignments for spikes marked as well-isolated by at least one operator. (d) Consistency of whether spikes were marked as well-isolated by different operators. (e–g) Operator consistency for the analyses in b–d was quantified using the Fowlkes-Mallows index, for which 1 represents complete agreement and 0 complete disagreement. While cluster assignments were highly consistent between all expert operators, the operators were often inconsistent in their judgments of which units were well-isolated.
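
The Fowlkes-Mallows index named in the Figure 7 caption can be computed from pair counts: it is the number of spike pairs that both operators place in the same cluster, divided by the geometric mean of the number of same-cluster pairs under each operator alone. The sketch below is a minimal illustration, not the authors' analysis code. The function name and the convention of returning 0 when either labelling has no same-cluster pairs are assumptions, and it presumes both operators labelled the same set of spikes.

```python
from collections import Counter
from math import sqrt


def fowlkes_mallows(labels_a, labels_b):
    """Fowlkes-Mallows index between two cluster labellings of the same spikes.

    labels_a, labels_b: equal-length sequences of hashable cluster labels,
    one entry per spike (hypothetical input format).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labellings must cover the same spikes")

    def n_pairs(counts):
        # Number of unordered pairs that can be drawn within each group.
        return sum(n * (n - 1) // 2 for n in counts)

    # Pairs that share a cluster under both labellings.
    both = n_pairs(Counter(zip(labels_a, labels_b)).values())
    # Pairs that share a cluster under each labelling separately.
    same_a = n_pairs(Counter(labels_a).values())
    same_b = n_pairs(Counter(labels_b).values())

    if same_a == 0 or same_b == 0:
        return 0.0  # assumed convention for the degenerate case
    return both / sqrt(same_a * same_b)
```

Because the index depends only on which spikes are grouped together, it is unaffected by the two operators using different cluster numbers for the same unit.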

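The two-threshold flood fill described in the Figure 2 caption can be sketched as a breadth-first search over (sample, channel) points. This is an illustrative sketch rather than the SpikeDetekt implementation. It assumes the input is already high-pass filtered, expressed in units of s.d. and oriented so that spikes are positive-going, and that two points are connected when they are consecutive samples on one channel or the same sample on channels joined in the adjacency graph. The function name and the dictionary format for the adjacency graph are hypothetical.

```python
from collections import deque

import numpy as np


def flood_fill_spikes(x, adjacency, theta_w=2.0, theta_s=4.0):
    """Return connected components of weak-threshold crossings that
    contain at least one strong-threshold crossing.

    x: array of shape (n_samples, n_channels), in units of s.d.
    adjacency: dict mapping each channel to a list of adjacent channels.
    """
    weak = x > theta_w
    strong = x > theta_s
    n_samples = x.shape[0]
    visited = np.zeros(x.shape, dtype=bool)
    components = []

    # Seeding only from strong crossings means that components made of
    # weak crossings alone (isolated noise) are never collected.
    for t0, c0 in zip(*np.nonzero(strong)):
        if visited[t0, c0]:
            continue
        visited[t0, c0] = True
        queue = deque([(int(t0), int(c0))])
        members = []
        while queue:
            t, c = queue.popleft()
            members.append((t, c))
            # Assumed connectivity: same channel at neighbouring samples,
            # adjacent channels at the same sample.
            neighbours = [(t - 1, c), (t + 1, c)]
            neighbours += [(t, c2) for c2 in adjacency[c]]
            for t2, c2 in neighbours:
                if 0 <= t2 < n_samples and weak[t2, c2] and not visited[t2, c2]:
                    visited[t2, c2] = True
                    queue.append((t2, c2))
        components.append(members)
    return components
```

Each returned component is a list of (sample, channel) pairs. Two spikes that occur at the same time on spatially separated channels come out as separate components, provided no chain of weak crossings on adjacent channels links them.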

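The cluster similarity measure in the Figure 5 caption, the posterior probability with which the mean of cluster i would be assigned to cluster j, can be illustrated with an ordinary Gaussian mixture. The sketch below is a simplification under stated assumptions: it ignores the masks used by the masked EM algorithm, takes full covariance matrices as given, and normalises each row over all clusters including cluster i itself. The function names are hypothetical.

```python
import numpy as np


def similarity_matrix(weights, means, covs):
    """p[i, j]: posterior probability that the mean of cluster i
    belongs to cluster j under a plain Gaussian mixture model.

    weights: (k,) mixture weights; means: (k, d); covs: (k, d, d).
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    k, d = means.shape

    log_p = np.empty((k, k))
    for j in range(k):
        diff = means - means[j]  # row i holds mu_i - mu_j
        solved = np.linalg.solve(covs[j], diff.T).T
        maha = np.einsum("id,id->i", diff, solved)  # squared Mahalanobis distance
        _, logdet = np.linalg.slogdet(covs[j])
        log_p[:, j] = np.log(weights[j]) - 0.5 * (
            maha + logdet + d * np.log(2 * np.pi)
        )

    # Normalise each row in a numerically stable way.
    log_p -= log_p.max(axis=1, keepdims=True)
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def closest_match(p, i):
    """Index of the cluster most similar to cluster i, excluding i itself."""
    row = p[i].copy()
    row[i] = -np.inf
    return int(np.argmax(row))
```

In a wizard-style loop, `closest_match` would supply the first merge candidate for the best unsorted cluster, with later candidates taken in decreasing order of p[i, j].
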

Author information

  1. These authors contributed equally to this work.

    • Cyrille Rossant &
    • Shabnam N Kadir


  1. UCL Institute of Neurology, London, UK.

    • Cyrille Rossant,
    • Shabnam N Kadir,
    • Maximilian L D Hunter &
    • Kenneth D Harris
  2. Department of Neuroscience, Physiology and Pharmacology, University College London, London, UK.

    • Cyrille Rossant,
    • Shabnam N Kadir,
    • Maximilian L D Hunter &
    • Kenneth D Harris
  3. Department of Electrical and Electronic Engineering, Imperial College, London, UK.

    • Dan F M Goodman
  4. Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, California, USA.

    • John Schulman
  5. UCL Institute of Ophthalmology, London, UK.

    • Aman B Saleem &
    • Matteo Carandini
  6. NYU Neuroscience Institute, Langone Medical Center, New York University, New York, New York, USA.

    • Andres Grosmark,
    • Mariano Belluscio &
    • György Buzsáki
  7. Department of Neuroscience, Baylor College of Medicine, Houston, Texas, USA.

    • George H Denfield,
    • Alexander S Ecker &
    • Andreas S Tolias
  8. UCL Institute of Behavioural Neuroscience, Department of Experimental Psychology, London, UK.

    • Samuel Solomon


C.R., D.F.M.G., S.N.K. and J.S. wrote SpikeDetekt. K.D.H., S.N.K. and D.F.M.G. designed the masked EM algorithm and wrote KlustaKwik. C.R. and M.L.D.H. wrote KlustaViewa. C.R. wrote Galry. S.N.K. analyzed algorithm performance. Rat data were recorded by A.G., M.B. and G.B. Mouse data were recorded by A.B.S. and M.C. Marmoset data were recorded by S.S. The procedure for non-chronic laminar recordings with NeuroNexus Vector probes in awake, behaving macaques was developed by G.H.D., A.S.E. and A.S.T., who also collected the data. K.D.H., S.N.K. and C.R. wrote the manuscript with input from all authors.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

PDF files

  1. Supplementary Text and Figures (4,298 KB)

    Supplementary Figures 1–17 and Supplementary Table 1

  2. Supplementary Methods Checklist (481 KB)