Robust, fiducial-free drift correction for super-resolution imaging

We describe a robust, fiducial-free method of drift correction for use in single molecule localization-based super-resolution methods. The method combines periodic 3D registration of the sample using brightfield images with a fast post-processing algorithm that corrects residual registration errors and drift between registration events. The method is robust to low numbers of collected localizations, requires no specialized hardware, and provides stability and drift correction for an indefinite time period.

. Flow chart of the sequence of steps we recommend to minimize drift. (1) After obtaining a reference brightfield z-stack before any data collection, perform brightfield registration at regular intervals (dataset boundaries). This involves collecting a new brightfield z-stack to compare to the reference z-stack, estimating the offset, then moving the sample, repeatedly performing this sequence until a user tolerance is met. The supplied MATLAB code findStackOffset can be used to estimate the offset. We suggest ∼2,000 frames/dataset for dSTORM data and ∼1,000 frames/dataset for DNA-PAINT data. (2) After data collection, using the datasets established during brightfield registration, perform (a) intra-dataset and then (b) inter-dataset post-processing drift correction, for example, by using the supplied MATLAB code driftCorrectKNN. If a very large dataset size was initially chosen, the MATLAB algorithm under user direction can internally reorganize the datasets into fewer numbers of frames, sometimes improving the results. Please see Fig. 1, and sections Algorithm and Results; Breaking up datasets for more details.

2/21
DNA nanorods actin a b Figure S2. 2D estimated drift plots. Each dataset is represented by a separate line segment. Frames are color coded from blue to red to indicate the passage of time. Within each dataset, passage of time is also indicated by the segment width tapering from large to small (like an arrowhead). (a) 80 nm nanorods with spots 40 nm apart produced by DNA-PAINT. The initial guess used for the inter-dataset optimization procedure was the drift-corrected values from the last frame (2500) of each previous dataset. (b) Actin microfilaments in HeLa cells. The initial guess used for the inter-dataset optimization procedure was zero.

3/21
DNA nanorods actin a b Figure S3. Fourier Ring Correlation plots. (a) 80 nm nanorods with spots 40 nm apart produced by DNA-PAINT. (b) Actin microfilaments in HeLa cells. The blue lines represent raw (uncorrected) data, while the green lines are the results given data drift corrected through the post-processing algorithm. The black horizontal line at FRC = 1 7 is the threshold that when crossed by the FRC curve defines the spatial resolution of the image, as the inverse of the spatial frequency at the crossing point. For (a), the estimated resolutions of the uncorrected data compared to the corrected results were 97.2 ± 0.9 nm versus 28.7 ± 0.1 nm, while for (b), these numbers were 72.5 ± 1.4 nm and 60.9 ± 0.6 nm, respectively. . Fluorophore pairings for intra-dataset drift correction. RMSE (and x and y components) between the true and the estimated drift curves plotted versus the number of pairs of blinking events, N p N e , for noisy 2D uniform randomly distributed emitters with (a,c,e) λ increasing from 0.01-10 and fixed N e = 10 5 , (b,d,f) N e increasing from 10 3 -10 6 and fixed λ = 0.2. The imposed drift was (a,b) linear / (c,d,e,f) quadratic, and the intra-dataset fitting was (c,d) linear / (a,b,e,f) quadratic. This is also indicated by L/Q (linear imposed drift/quadratic intra-dataset fitting), etc. above. Results were averaged over N = 100 simulations. The red dotted lines correspond to RMSE = 1 nm.
. Fluorophore pairings for inter-dataset drift correction. Here, a 2,000 frame data collection was broken up into two datasets. RMSE between the true and the estimated drift curves plotted versus the number of pairs of blinking events, N p N e , for noisy 2D uniform randomly distributed emitters with (a,c) λ increasing from 0.01-1.6 and fixed N e = 10 5 , (b,d) N e increasing from 10 3 -10 6 and fixed λ = 0.2. The imposed drift was (a,b) linear / (c,d) quadratic, and the intra-dataset fitting was linear. This is also indicated by Q/L (quadratic imposed drift/linear intra-dataset fitting), etc. above. Results were averaged over N = 100 simulations. The red dotted lines correspond to RMSE = 1 nm. The dashed lines indicate one standard deviation from the mean (solid lines). The blue lines represent raw (uncorrected) data, while the green lines are the results given data drift corrected through the post-processing algorithm. Similarly, the cyan lines are the results after processing through RCC. The black horizontal line at FRC = 1 7 is the threshold that when crossed by the FRC curve defines the spatial resolution of the image, as the inverse of the spatial frequency at the crossing point. For (a), the estimated resolutions of the uncorrected data compared to the corrected results and data processed through RCC were 95.5 ± 0.6 nm versus 29.1 ± 0.1 nm and 30.7 ± 0.1 nm, while for (b), these numbers were 97.7 ± 1.3 nm, 62.6 ± 0.7 nm and 61.0 ± 0.7 nm, respectively. Note also in (b) that the DC and RCC curves overlay each other. Compare with Supplementary Fig. S3 Table S1. Results of driftCorrectKNN (dCK) and RCC applied to a simulated 64×64 pixel star-shaped data collection containing 5,592 localizations and consisting of 1000 frames. The data collection was broken up into different numbers of frames per dataset (leftmost column), and different spatial bin sizes and ∆ rmax for RCC (third and fourth columns), with results given in subsequent columns for mean x-and y-drift per frame (nm) by dCK and RCC. Dashes (-) indicate RCC was unable to perform the specified correction. Here, 1 pixel = 100 nm. The true drift per frame in the simulations is given in the top row.

Frame connection
A blinking event often produces localizations across a sequence of frames. These localizations can be recognized as a single fluorophore and connected together to produce better precision. Frame connection takes the time-ordered localizations computed from a super-resolution dataset and attempts to combine them via a single emitter model in which the maximum distance and maximum frame gap between two localizations are specified (defaults are 1 pixel and 4 frames). The level of significance (LoS, default is 0.01) represents the minimum probability for which the null hypothesis that the two localizations come from a single emitter is not rejected. If the value computed for the probability is greater than the LoS and the other conditions hold, then the two localizations are combined. This process examines all localizations that satisfy the distance and frame gap constraints. If two localizations have positions and localization errors (x 1 , σ 1 ) and (x 2 , σ 2 ), then the combined position and error is given by is the log-likelihood ratio (minus a constant term) that the two localizations represent a single emitter. The likelihood ratio is given by R = L(D|θ ) L(D|D) , where the numerator is the likelihood L of the data D given the parameters θ described above for the single emitter model, and the denominator is the likelihood of D given D, or one. L comes from the product of two Gaussians: The corresponding p-value (which is compared to the LoS) is where N DoF is the number of degrees of freedom and hence the spatial dimension, N dim , in the situation when two localizations are combined into one (N DoF = N data DoF − N model DoF ). Γ and γ are the gamma function and the lower incomplete gamma function, respectively. For N DoF = 1, 2, 3, Note that in 2D (N DoF = 2), the p-value is exactly R.
Post-processing drift correction algorithm MATLAB pseudocode drift correction algorithm operating on (x, y) or (x, y, z) localizations. The drift corrected coordinates, X, and the matrix of drift corrections indexed by dataset number and frame number, ∆X, are produced. P degree is the degree of the polynomial fitting the intra-dataset drift correction. Note that the constant term is assumed to be zero. N frames is the number of frames per dataset. ∆X i is the drift correction for each frame of the ith dataset. p •, j T j is the product of the j th column of p with the vector T , in which each component is raised to the j th power. X c are drift corrected coordinates.