A framework for evaluating the performance of SMLM cluster analysis algorithms

Abstract

Single-molecule localization microscopy (SMLM) generates data in the form of coordinates of localized fluorophores. Cluster analysis is an attractive route for extracting biologically meaningful information from such data and has been widely applied. Despite the range of available cluster analysis algorithms, there is no consensus framework for evaluating their performance. Here, we use a systematic approach based on two metrics to score the success of clustering algorithms in simulated conditions mimicking experimental data. We demonstrate the framework using seven diverse analysis algorithms: DBSCAN, ToMATo, KDE, FOCAL, CAML, ClusterViSu and SR-Tesseler. Given that the best performer depended on the underlying distribution of localizations, we demonstrate an analysis pipeline based on statistical similarity measures that enables the selection of both the most appropriate algorithm and the optimized analysis parameters for real SMLM data. We propose that these standard simulated conditions, metrics and analysis pipeline become the basis for future analysis algorithm development and evaluation.
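
As a minimal sketch of how a point-level clustering score can be computed, the adjusted Rand index (ARI, one of the two metrics named in the Code availability statement) can be written in its standard pair-counting form. This is a Python illustration and not the authors' R implementation; the function name `adjusted_rand_index` and the convention that label 0 marks unclustered localizations are assumptions made for the example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, predicted):
    """Adjusted Rand index between two labelings of the same localizations.

    Each list holds one integer label per localization. Label 0 is assumed
    (hypothetically) to mean "not in any cluster" and is treated like any
    other label.
    """
    if len(truth) != len(predicted):
        raise ValueError("both labelings must cover the same localizations")
    n = len(truth)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs of points grouped together in both, in truth, and in predicted.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    pairs_truth = sum(comb(c, 2) for c in Counter(truth).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(predicted).values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = pairs_truth * pairs_pred / comb(n, 2)
    maximum = (pairs_truth + pairs_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (pairs_both - expected) / (maximum - expected)
```

Because the index counts pairs rather than comparing label values, `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` scores a perfect 1.0 even though the cluster names are swapped, while a labeling that splits every true cluster in half, such as `[0, 1, 0, 1]`, scores below zero.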

Fig. 1: Examples of simulated data conditions and pre-evaluation of suitability.
Fig. 2: Representative example of a multi-parameter scan and performance analysis of DBSCAN, ToMATo and KDE for Scenario 2.
Fig. 3: Evaluation of clustering algorithms against all ground truth scenarios.
Fig. 4: Evaluation of clustering algorithms against all conditions with added multiple blinking behavior.
Fig. 5: Use of the framework for clustering of non-simulated FGFR1 dSTORM data.

Data availability

Both the simulation and the real SMLM data used as the basis for this work are available for download at https://github.com/DJ-Nieves/ARI-and-IoU-cluster-analysis-evaluation without restriction. Source data are provided with this paper.

Code availability

R code for calculating the adjusted Rand index (ARI) and intersection over union (IoU) of clustering results against a ground truth scenario is available for download at https://github.com/DJ-Nieves/ARI-and-IoU-cluster-analysis-evaluation without restriction.
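
Intersection over union (IoU) compares the region covered by the clusters an algorithm reports with the region covered by the ground truth clusters. The sketch below approximates those regions by binning clustered localizations into square grid cells. That binning is an assumption made to keep the example free of geometry libraries and is not taken from the article or its R code. The names `occupied_cells` and `intersection_over_union`, the 10 nm default `cell_size` and the use of label 0 for unclustered points are likewise hypothetical.

```python
from math import floor


def occupied_cells(points, labels, cell_size=10.0):
    """Return the set of grid cells holding at least one clustered point.

    points    : list of (x, y) coordinates, e.g. in nm
    labels    : one integer per point; 0 is assumed to mean "unclustered"
    cell_size : edge length of a square grid cell, same units as points
    """
    return {
        (floor(x / cell_size), floor(y / cell_size))
        for (x, y), label in zip(points, labels)
        if label != 0
    }


def intersection_over_union(cells_a, cells_b):
    """IoU of two sets of grid cells; 1.0 when both sets are empty."""
    union = cells_a | cells_b
    if not union:
        return 1.0
    return len(cells_a & cells_b) / len(union)


# Toy data: three clustered localizations and one background localization.
points = [(5.0, 5.0), (15.0, 5.0), (25.0, 5.0), (95.0, 95.0)]
truth_labels = [1, 1, 1, 0]
result_labels = [1, 1, 0, 0]  # the algorithm missed the third point

truth_cells = occupied_cells(points, truth_labels)
result_cells = occupied_cells(points, result_labels)
iou = intersection_over_union(truth_cells, result_cells)  # 2 shared cells of 3
```

A grid gives only a coarse estimate of area, and the score depends on the chosen cell size; a polygon-based outline of each cluster would avoid that dependence at the cost of a geometry routine.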

Acknowledgements

D.M.O. acknowledges funding from BBSRC grant BB/R007365/1. M.H. acknowledges funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, Project-ID 259130777, SFB 1177; GRK 2566). D.M.O. and M.B. acknowledge funding from the Alan Turing Institute.

Author information

Contributions

D.J.N. wrote simulation and analysis code, produced simulations, performed cluster analyses, acquired dSTORM data and wrote the manuscript. J.A.P. wrote the simulation code. F.L. and D.J.W. performed analyses. M.B. performed dissimilarity measurements. S.O. and A.d.M. produced the FGFR1 nanobody. J.G., D.S., E.A.K.C., J.A.P., J.-B.S. and M.H. contributed ideas and concepts. D.M.O. conceived the work and wrote the manuscript. All authors contributed to the drafting and writing of the manuscript.

Corresponding author

Correspondence to Dylan M. Owen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Marek Cebecauer and the other, anonymous, reviewers for their contribution to the peer review of this work. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–13 and Supplementary Tables 1–38.

Reporting Summary

Peer Review File

Source data

Source Data Fig. 1

Ripley’s K curves for each simulated data scenario.

Source Data Fig. 2

Mean and variance of ARI and IoU scoring for ground truth scenario 2 for parameter scanning of clustering algorithms DBSCAN, ToMATo and KDE.

Source Data Fig. 3

Mean of the maximal ARI and IoU scores for all algorithms for simulation scenarios 2–10.

Source Data Fig. 4

Mean of the maximal ARI and IoU scores for all algorithms for simulation scenarios 2–10 with added multiple blinking.

Source Data Fig. 5

Cluster areas and number of clusters per μm² identified in FGFR1 dSTORM data using framework-optimized DBSCAN parameters.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Nieves, D.J., Pike, J.A., Levet, F. et al. A framework for evaluating the performance of SMLM cluster analysis algorithms. Nat Methods 20, 259–267 (2023). https://doi.org/10.1038/s41592-022-01750-6

  • DOI: https://doi.org/10.1038/s41592-022-01750-6
