Verifying molecular clusters by 2-color localization microscopy and significance testing

While single-molecule localization microscopy (SMLM) offers the invaluable prospect of visualizing cellular structures below the diffraction limit of light microscopy, its potential has not yet been fully realized because of its inherent susceptibility to blinking artifacts. In particular, overcounting of single-molecule localizations has impeded reliable and sensitive detection of biomolecular nanoclusters. Here we introduce a 2-Color Localization microscopy And Significance Testing Approach (2-CLASTA), which provides a parameter-free statistical framework for the qualitative analysis of two-dimensional SMLM data via significance testing. 2-CLASTA yields p-values for the null hypothesis of random biomolecular distributions, independent of the blinking behavior of the chosen fluorescent labels. The method requires neither additional measurements nor grouping of localizations. We validated the method both by computer simulations and experimentally, using protein concatemers to mimic biomolecular clustering. Because the new approach is not affected by overcounting artifacts, it can detect biomolecular clusters of various shapes with high sensitivity, down to the level of dimers.


Figure S1
Single-molecule blinking statistics for SNAP labels. Blinking statistics of individual SNAP488 (a) and SNAP647 (b) molecules were recorded on fixed HeLa cells expressing the monomeric SNAP-GPI protein construct. Cells were labeled at sufficiently low concentrations of the SNAP ligand that well-separated single-molecule signals could be observed. Histograms show the probability that a single label gives rise to N localizations.

Figure S2
Testing realizations of the null hypothesis
Realizations of the null hypothesis of randomly distributed biomolecules were simulated and tested with 2-CLASTA. We simulated 36 different parameter sets, varying the number of molecules (50, 75 and 100 molecules per µm²), the labeling efficiency (40%, 60%, 80% and 100%) and the labeling ratio (3:7, 2:3 and 1:1). All other simulation parameters were held constant. (a) Histogram of the p-values obtained for one exemplary parameter set. (b) Cumulative distribution functions of the p-values for the various parameter sets are shown in light gray. The ideal uniform distribution is indicated by the solid black line for comparison. For each parameter set we performed 1000 independent simulations.
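The check described in this caption can be illustrated with a minimal Python sketch. It enumerates the 36 parameter sets and measures how far an empirical p-value distribution deviates from the uniform distribution expected under the null hypothesis. The function names are hypothetical, the Kolmogorov-Smirnov distance is our choice of summary statistic rather than one named in the source, and uniform random numbers stand in for actual 2-CLASTA p-values.

```python
import itertools
import random

# Parameter grid as listed in the caption: 3 x 4 x 3 = 36 sets.
DENSITIES = [50, 75, 100]                     # molecules per µm²
LABELING_EFFICIENCIES = [0.4, 0.6, 0.8, 1.0]
LABELING_RATIOS = [(3, 7), (2, 3), (1, 1)]

def parameter_sets():
    """Return all combinations of density, labeling efficiency and ratio."""
    return list(itertools.product(DENSITIES, LABELING_EFFICIENCIES, LABELING_RATIOS))

def ks_distance_to_uniform(p_values):
    """Largest gap between the empirical CDF and the uniform CDF on [0, 1]."""
    xs = sorted(p_values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

def false_positive_rate(p_values, alpha=0.05):
    """Fraction of null realizations rejected at significance level alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)

rng = random.Random(0)
# Placeholder (assumption): uniform random numbers replace the 1000 simulated p-values.
p_values = [rng.random() for _ in range(1000)]
print(len(parameter_sets()))
print(ks_distance_to_uniform(p_values), false_positive_rate(p_values))
```

For a well-calibrated test, the distance to the uniform distribution should be small and the false positive rate should be close to the chosen significance level.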

Figure S3
Influence of different fluorescent labels on 2-CLASTA sensitivity. We determined the sensitivity as a function of the number of molecules (a), the labeling efficiency (b), and the labeling ratio (c) for the case of molecular dimers, assuming the "realistic" scenario. Included are experimentally derived blinking statistics for SNAP488 and SNAP647 (black solid line) and for KT3647 and PS-CFP2 (gray dashed line), as well as analytical blinking statistics assuming a log-normal distribution of the number of blinks (×, gray dotted line). Log-normal distributions were simulated with a mean of 2.54 and 25.4 localizations per biomolecule for the red and blue color channel, respectively; standard deviations were set to 2 and 20 localizations, respectively. Unless varied in the respective subpanel, parameters in all simulations were set to a molecular density of 75 molecules/µm², a labeling efficiency of 40%, a labeling ratio of 1:1, and no stage drift. Each data point corresponds to 1000 independent simulations.
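The log-normal blinking statistics can be sketched in a few lines of Python. The sketch assumes that the quoted means and standard deviations refer to the number of localizations itself, not to its logarithm, and converts them into the parameters of the underlying normal distribution. The function names are hypothetical, and how the continuous samples were discretized into integer localization counts is not stated in the source, so that step is left open.

```python
import math
import random

def lognormal_params(mean, sd):
    """Convert mean and standard deviation of a log-normal variable
    into the parameters (mu, sigma) of the underlying normal distribution."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def sample_blinks(mean, sd, n, rng):
    """Draw n continuous blink counts (discretization is not modeled here)."""
    mu, sigma = lognormal_params(mean, sd)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

rng = random.Random(1)
red = sample_blinks(2.54, 2.0, 100_000, rng)    # red channel
blue = sample_blinks(25.4, 20.0, 100_000, rng)  # blue channel
print(sum(red) / len(red), sum(blue) / len(blue))
```

Under this reading, both channels have the same ratio of standard deviation to mean, so they share the same sigma and differ only in mu.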

Figure S4
Influence of experimental errors on the sensitivity
The sensitivity of 2-CLASTA was determined as a function of nonspecifically bound labels (a), localization error (b), and the degree of chromatic aberration (c). (d) Displacement vector field used for simulating the chromatic aberrations shown in (c). In panel (c), the x-axis shows the mean displacement. We simulated dimers, trimers and tetramers, both for the "ideal" (solid line) and the "realistic" scenario (dashed line). Each data point corresponds to 100 independent simulations.
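Because each data point here rests on 100 simulations, it helps to see how a sensitivity value and its statistical uncertainty follow from a set of p-values. The sketch below takes sensitivity as the fraction of simulations with a p-value below the significance level of 0.05 used in these supplementary figures, and attaches a binomial standard error. The function names and the example p-values are hypothetical, and the standard error is our addition, not a quantity reported in the source.

```python
import math

def sensitivity(p_values, alpha=0.05):
    """Fraction of simulations in which the null hypothesis is rejected."""
    return sum(p < alpha for p in p_values) / len(p_values)

def binomial_standard_error(fraction, n):
    """Standard error of a proportion estimated from n independent runs."""
    return math.sqrt(fraction * (1.0 - fraction) / n)

# Hypothetical outcome of 100 simulated runs: 80 rejections, 20 non-rejections.
p_values = [0.01] * 80 + [0.30] * 20
s = sensitivity(p_values)
print(s, binomial_standard_error(s, len(p_values)))
```

With 100 runs the standard error is at most 0.05 (reached at a sensitivity of 0.5), which gives a rough scale for differences between neighboring data points.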

Figure S5
Influence of the analysis parameter on 2-CLASTA sensitivity. (a) Influence of the analysis parameter on 2-CLASTA sensitivity for the detection of biomolecular dimers, both for the "ideal" (solid line) and the "realistic" scenario (dashed line). In addition, we included simulations in which the labeling efficiency was reduced to 15% while all other parameters were kept as before (×). (b) Influence of the analysis parameter on 2-CLASTA sensitivity for the detection of circular nanodomains of 100 nm radius, with 3 clusters per µm² and 20% of molecules inside the nanodomains, both for the "ideal" (solid line) and the "realistic" scenario (dashed line). ∞ denotes the maximum nearest neighbor distance occurring in the analysis. Each data point corresponds to 1000 independent simulations.

Figure S7
Examples of simulated scenarios for nanodomains of biomolecules ("ideal" scenario)
The underlying simulated nanodomains (left), the positions of simulated biomolecules (center), and the resulting localization maps (right) are shown for rare small clusters (a), medium clusters (b), large frequent clusters (c), exclusion areas (d), and a random distribution of biomolecules (e). The resulting p-value for each scenario is indicated on the left. Scale bars 250 nm (inset) and 2 µm.
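As a rough illustration of how such scenarios could be generated, the following Python sketch places circular domains at random in a square region and distributes a given fraction of molecules inside them. It is a simplified stand-in, not the authors' simulation code. The region size, the wrapping of positions at the borders, and the placement of the remaining molecules anywhere in the region are assumptions, and exclusion areas as in panel (d) are not covered.

```python
import math
import random

def simulate_domains(roi_um, density, domains_per_um2, radius_nm, fraction_inside, rng):
    """Return (centers, molecules) for circular domains in a square region.

    Domain centers are uniform in the region; a given fraction of molecules is
    placed uniformly inside randomly chosen domains, the rest uniformly across
    the region. Positions are wrapped at the borders (assumption).
    """
    area = roi_um ** 2
    n_domains = round(domains_per_um2 * area)
    n_molecules = round(density * area)
    n_inside = round(fraction_inside * n_molecules)
    radius_um = radius_nm / 1000.0

    centers = [(rng.uniform(0, roi_um), rng.uniform(0, roi_um)) for _ in range(n_domains)]
    molecules = []
    for _ in range(n_inside):
        cx, cy = rng.choice(centers)
        r = radius_um * math.sqrt(rng.random())  # uniform over the disk area
        phi = rng.uniform(0, 2 * math.pi)
        molecules.append(((cx + r * math.cos(phi)) % roi_um,
                          (cy + r * math.sin(phi)) % roi_um))
    for _ in range(n_molecules - n_inside):
        molecules.append((rng.uniform(0, roi_um), rng.uniform(0, roi_um)))
    return centers, molecules

# Hypothetical example: 5 µm region, 75 molecules/µm², 3 domains/µm²,
# 100 nm radius, 20% of molecules inside the domains.
centers, molecules = simulate_domains(5.0, 75, 3, 100, 0.2, random.Random(2))
print(len(centers), len(molecules))
```

Setting `fraction_inside` to 0 reproduces a purely random distribution, as in panel (e).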

Figure S8
Sensitivity of 2-CLASTA to detect protein enrichment or depletion ("ideal" case). Simulations were performed for circular nanodomains with radii of 20, 40, 60, 80, 100 and 150 nm. The number of domains per µm² was varied between 3 and 25, and the percentage of molecules inside the domains between 20% and 100%. Numbers in individual fields indicate the average number of molecules per domain and the relative enrichment or depletion of molecules compared to a random distribution with identical average density. The gray scale indicates the fraction of scenarios with a p-value below the significance level α = 0.05, reflecting the sensitivity. Each field corresponds to 100 independent simulations. The bold black boxes indicate the scenarios shown in the exemplary images in Fig. S7; letters indicate the subpanels.
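The two numbers shown in each field can be related to the simulation parameters with a short calculation, sketched here in Python. The exact definition of relative enrichment used by the authors is not given in this caption, so the sketch assumes it is the molecular density inside the domains divided by the overall average density, and that domains do not overlap. The function name and the overall density of 75 molecules/µm² in the example are likewise assumptions.

```python
import math

def domain_statistics(density, domains_per_um2, radius_nm, fraction_inside):
    """Average molecules per domain and density inside the domains relative
    to the overall average density (assumes non-overlapping domains)."""
    molecules_per_domain = fraction_inside * density / domains_per_um2
    # Fraction of the total area covered by domains; only meaningful below 1.
    covered_fraction = domains_per_um2 * math.pi * (radius_nm / 1000.0) ** 2
    relative_density = fraction_inside / covered_fraction
    return molecules_per_domain, relative_density

# Hypothetical example: 3 domains/µm², 100 nm radius, 20% of molecules inside.
per_domain, rel = domain_statistics(75, 3, 100, 0.2)
print(per_domain, rel)
```

A relative density above 1 corresponds to enrichment inside the domains and a value below 1 to depletion. For large, frequent domains the covered fraction approaches or exceeds 1, where the non-overlap assumption no longer holds.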

Figure S9
Sensitivity of 2-CLASTA to detect protein enrichment or depletion ("realistic" case). Simulations were performed as described in Fig. S8. Numbers in individual fields indicate the average number of molecules per domain and the relative enrichment or depletion of molecules compared to a random distribution with identical average density. The gray scale indicates the fraction of scenarios with a p-value below the significance level α = 0.05, reflecting the sensitivity. Each field corresponds to 100 independent simulations.

Figure S10
Influence of localization error on the sensitivity of 2-CLASTA to detect protein enrichment or depletion. We determined the sensitivity of 2-CLASTA for varying densities of circular domains and percentages of molecules inside the domains, assuming a localization error of 30 nm (a) and 70 nm (b). Data are shown for a cluster radius of 60 nm for the "ideal" case. Numbers in individual fields indicate the average number of molecules per domain and the relative enrichment or depletion of molecules compared to a random distribution with identical average density. The gray scale indicates the fraction of scenarios with a p-value below the significance level α = 0.05, reflecting the sensitivity. Each field corresponds to 100 independent simulations.

Figure S11
Sensitivity of 2-CLASTA to detect protein enrichment or depletion using a different fluorescent label. We determined the sensitivity of 2-CLASTA for varying densities of circular domains and percentages of molecules inside the domains, assuming the blinking statistics for KT3647 and PS-CFP2 (gray). Data are shown for a cluster radius of 100 nm for the "ideal" case (a) and the "realistic" case (b). Numbers in individual fields indicate the average number of molecules per domain and the relative enrichment or depletion of molecules compared to a random distribution with identical average density. The gray scale indicates the fraction of scenarios with a p-value below the significance level α = 0.05, reflecting the sensitivity. Each field corresponds to 100 independent simulations.