Analyzing Single Molecule Localization Microscopy Data Using Cubic Splines

The resolution of super-resolution microscopy based on single molecule localization is in part determined by the accuracy of the localization algorithm. In most published approaches to date this localization is done by fitting an analytical function that approximates the point spread function (PSF) of the microscope. However, particularly for localization in 3D, analytical functions such as a Gaussian, which are computationally inexpensive, may not accurately capture the PSF shape leading to reduced fitting accuracy. On the other hand, analytical functions that can accurately capture the PSF shape, such as those based on pupil functions, can be computationally expensive. Here we investigate the use of cubic splines as an alternative fitting approach. We demonstrate that cubic splines can capture the shape of any PSF with high accuracy and that they can be used for fitting the PSF with only a 2–3x increase in computation time as compared to Gaussian fitting. We provide an open-source software package that measures the PSF of any microscope and uses the measured PSF to perform 3D single molecule localization microscopy analysis with reasonable accuracy and speed.

of splines can be calculated rapidly, and it is also easy to compute their derivatives. 2D cubic splines were used in the DAOPHOT astronomy package to provide higher order corrections to a 2D Gaussian for the purpose of fitting the locations and magnitudes of stars 12 , or single molecules 13 , the latter using the DAOSTORM algorithm, the adaptation of DAOPHOT for single molecule localization microscopy. B-splines, of which cubic splines are a sub-set, were used previously in localization microscopy to estimate the maximal localization accuracy in x, y and z of an arbitrarily shaped 3D point spread function (PSF) 14 . B-splines were also used in a general method for 3D localization fitting similar to what is presented here 15 .
This current work builds on previous work by further exploring the use of splines for rapid and accurate analysis of 3D super-resolution microscopy data based on single molecule localization. In particular, we provide an open-source maximum likelihood estimation (MLE) based cubic spline analysis package, side-by-side comparisons of its performance with elliptical Gaussian fitting, and a demonstration of its broad application by fitting simulated 3D super-resolution microscopy data derived from several different PSFs.

Results
A cubic spline approximates a one dimensional function using piecewise 3 rd order polynomials of the form: where each f i is valid on small equal length intervals [t i , t i+1 ]. Any function with a continuous first derivative can be approximated with arbitrary accuracy by choosing an appropriate interval size. Here we follow DAOPHOT/ DAOSTORM and choose an interval size that is ½ of the image pixel size as a good compromise between accurately modeling the PSF and minimizing the total number of intervals and coefficients. In Fig. 1 we provide a simple 1D example of the cubic spline construction process that we use in this work. We start with a hypothetical measurement of the PSF (gray bars in Fig. 1). Next the measured PSF is up-sampled by 2x using 3 rd order spline interpolation (red points in Fig. 1). Finally the coefficients a i , b i , c i , d i of the spline (blue line in Fig. 1) for each interval are calculated with the constraint that the polynomials f i (x) and their first derivatives f i (x)/dx are continuous at the red points, the interval boundaries. We chose to use the additional constraint that the second derivative of the spline is zero at the start of the first interval and at the end of the last interval. We use the approach described in ref. 16 to solve for the coefficients of the (natural) cubic spline.
We used the following procedure to measure the PSF in three dimensions. We placed 4x up-sampled PSFs with high photon counts on a xy grid. Then we scanned the z values of the PSFs to create z scan image stacks. Finally, we downsampled the image stacks by 4x in x and y to create the final PSF measurement image stack. Then, using the known emitter locations, images of isolated emitters were cropped out of these image stacks and up-sampled by 2x in xy using the ndimage.interpolation.zoom() function from the Scipy Python package 17 with 3 rd order spline interpolation. Next, the up-sampled images of these emitters were aligned to their centroid positions using the ndimage.interpolation.shift() function and averaged to obtain a single image for each z position. Finally, images covering 50 nm z ranges were further averaged together to generate the PSF for the z position that is at the center of the 50 nm range and the average PSFs determined from a series 50 nm intervals were used to create a 3D array that represents the 3D PSF. To measure the 3D PSF in real experiments, small fluorescent beads can be attached to a coverslip. Then z scan movies can be taken with a piezo positioner. In this situation the true emitter locations are not known, but they can be estimated with sufficient precision for this purpose by using another fitting algorithm such as 3D-DAOSTORM 18 because the fluorescent beads typical emit a very large number of photons. The extension of the cubic spline construction to 3 dimensions is described below. In three dimensions, the cubic spline is: ] j j 1 in y and [u k , u k+1 ] intervals in z. We start with a 3D PSF that is determined by either simulation or experiments, and then up-sampled such that it has an equal number of elements in x, y and z. This is not strictly necessary but having an equal number of elements for each axis simplifies later computations. We then use a series of 1D cubic spline interpolations in x, y and z to further upsample the PSF, by 4 fold in each dimension, to obtain 64 values of the PSF within each element. These 64 values then allow us to uniquely specify the 3D cubic spline f i,j,k (x, y, z) and determine it's 64 coefficients a i,j,k,m,n,o in the intervals that correspond to each element of the PSF.
We then use this cubic spline approximation of the 3D PSF to fit the image of individual emitters and determine their 3D coordinates. Based on the 3D cubic spline formula it can be seen that the evaluation of f i,j,k (x, y, z) at a single point will require 128 operations, as compared to 10 operations to evaluate the elliptical Gaussian that is often used for 3D PSF fitting: Here we've ignored the calculation of the x m y n z o terms as for fitting the evaluation is always done at the same point in each of the spline intervals. Similarly, we have ignored the calculation of the w x (z) and the w y (z) terms in the Gaussian as these are also the same for each point in the PSF. We wrote a simple C program to time how long it took to perform these calculations, and found that the cubic spline calculation is approximately 2.3x slower. However, we note that this is a best case estimate as the cubic spline calculation is more memory intensive and thus is likely to be somewhat slower in general. However, this rough estimate suggests that the computational expense of the cubic spline fitting will be similar to that of Gaussian fitting. The highest accuracy estimation of the parameters of the localization of an emitter is typically achieved using non-linear least squares or maximum likelihood estimation (MLE) based fitting. This can be done efficiently if both the error function and its derivatives are easily calculated for any point in the PSF. As described in ref. 19, the error function for MLE fitting is: And its derivatives with respect to the fitting parameters (p) are: where we have followed ref. 19 and ignored the second derivative terms in the calculation of equation (6). The cubic spline PSF and its derivatives with respect to the fitting parameters x c , y c , z c (the 3D coordinates of the emitter), h (the height of the PSF), and b (the baseline of the PSF) are easy to calculate, and they are: We minimize χ mle 2 using the above equations and the approach described in ref. 19, which is a modified version of the Levenberg-Marquadt algorithm 20 . Given a reasonable starting point the optimal values for x c , y c , z c , h and b can be found with high accuracy in just a few (~10) iterative updates for isolated emitters. The algorithm executes the following pseudo-code on each frame of a single-molecule imaging movie until either no new peaks are found, or the maximum number of iterations is reached.
Note that this algorithm ignores cross-terms for overlapping localizations. Each localization's fitting parameters are updated in a context where all the other localizations are fixed. This increases the simplicity of the algorithm implementation at the price of a reduced convergence speed for overlapping localizations.
This approach is very similar to that used by 3D-DAOSTORM 18 , differing only at the peak finding and peak fitting steps. In order to do peak finding the image is convolved with the spline representation of the PSF at one or more z values, then local maxima in the convolved images are used as the initial locations for the subsequent peak fitting step. We found this approach to be more robust, particularly with PSFs that do not have a single local maxima, such as the double helix PSF 2 , than using local maxima in the original image. When we performed convolutions at multiple z values each of the convolved images was treated as a single plane in a z stack. Then local maxima were found in the z stack and peak fitting was started at the x, y and z values of the local maxima. In our experience the use of multiple z values was only helpful for analyzing data taken with the double helix PSF 2 , probably because among the PSFs that we tested, the double helix PSF is the only one that is not degenerate in z. By this we mean the PSF shape at one z value cannot be approximately recreated by a linear sum of PSF shapes at other z values. This is in contrast with other PSFs such as the astigmatic PSF whose shape at large z offset values can be approximated quite well by a sum of multiple z = 0 PSFs with differing xy displacements.
To test how well a cubic spline with ½ pixel interval size in x and y can capture the shape of non-trivial PSFs, we used the pupil function approach 10, 11 to generate 3D PSFs for a purely astigmatic PSF (Zernike mode z 22 = 1.3) and for the saddle-point PSF described in ref. 3, both of which have been previously used for 3D super-resolution imaging 1,3 . For each condition we generated noise free images of the PSF at z positions ranging from −1 um to 1 um in 10 nm steps. Then we determined the coefficients of the cubic spline for this PSF as described above. We found that the maximum pixel wise amplitude difference between the astigmatic PSF and the cubic spline was 3.3% (at z = 0 nm) over the entire 2 um z range (Fig. 2). Similarly, we found for the saddle-point PSF that the maximum pixel wise difference was 3.2% (at z = 40 nm). We also tested how well an elliptical Gaussian was able to capture the shape of a purely astigmatic PSF, as an elliptical Gaussian is the analytical function typically used Subtract current background estimate from the image. 5: Find localizations in the background subtracted image and add these to the list of all localizations. 6: Subtract localizations from the image. 7: while iterations < max iterations and not all converged do 8: for each localization do 9: if not converged then 10: Add localization to image 11: Levenberg-Marquadt update of localization fitting parameters 12: Subtract updated localization from image 13: if converged then 14: mark localization as converged 15: for each localization do 16: if distance to a brighter neighbor < minimum distance then 17: discard localization 18: mark all neighbors as unconverged 19: Subtract localizations from the image. 20: Estimate background of the localization subtracted image. 21: Save fit parameters of all the localizations.
in astigmatism-based 3D super-resolution imaging. We calculated the best fit elliptical Gaussian to the astigmatic PSF for each z position in the same 2 um z range and found that the error was some what larger than for cubic spline fitting, with a maximum pixel wise difference of 7.7% (at z = 60 nm).
Next we investigated whether the fact that the cubic spline more accurately captured the shape of a purely astigmatic PSF resulted in improved fitting performance. As described above we first create a high photon count z image stack for measuring the PSF, then we create a lower photon count z image stack for measuring the fitting performance. In addition we note that, in this work, we only test the performance of this cubic spline approach on the localization precision of the emitters, but do not test the performance of this approach on the peak finding step here; in fact, the initial values for peak fitting are chosen as the known locations of the peaks used in the simulation. To measure the PSF, we created a 4x up-sampled representation of an astigmatic PSF using the pupil function approach 10,11 . We used this up-sampled PSF to create high photon count 2D images by placing multiple well separated copies of it on a xy grid. To make PSF measurement movies we scanned the z position of the PSF with a 10 nm interval size and down-sampled the resulting 2D images by 4x to generate the final images. We then used these calibration movies to determine the coefficients of the cubic spline representation of the PSF and also to determine the w x (z) and w y (z) calibration curve for elliptical Gaussian z fitting. Then, to measure the fitting performance, a test movie was created using the same approach but with 4000 photons per localization and a constant background of 100 photons. In the test movie, we chose the z position of the emitters to range from −300 nm to 300 nm, as it has been noted previously that the localization precision of the astigmatism approach depends on the z position of the emitter and that relatively high localization precisions have been achieved experimentally in the z range of −300 to 300 nm 1 . These movies were created using a Poisson noise model, with no camera read noise, 600 nm emission wavelength and a 160 nm pixel size. The test movie was analyzed by fitting the single-emitter images to the cubic spline representation and elliptical Gaussian representation of the astigmatic PSF (Fig. 3). The fitting error in x, y and z for each z value was determined as the average distance in x, y and z between the localizations determined by the fitting and their nearest ground-truth emitter positions. We also include the Cramer-Rao lower bound (CRLB) calculation for an ideal estimator. The CRLB was calculated using the spline representation of the PSF and the approach in ref. 8 equation 18.
The x and y localization error was pretty similar for both approaches, with cubic spline fitting giving ~5% better accuracy at the largest z offsets from the focal plane (±300 nm). The z localization accuracy of cubic spline fitting was noticeable better, with an average improvement of ~15% within ±300 nm of the focal plane. Spline fitting is able to come very close to the CRLB for the x and y locations of an emitter, however at the largest z values tested it is ~20% worse than the CRLB.
We note that in real experiments, the PSF often deviates from the ideal PSF derived from the pupil function due to non-ideal optics and optical alignment errors, which can lead to systematic errors in the determination of the x, y and z positions of the emitters when a simple elliptical Gaussian function is used to fit the PSF 21,22 . Since the cubic spline representation can capture these experimental PSFs with arbitrarily high accuracy given a   sufficiently small interval size, we expect cubic spline fitting to provide a greater improvement in the localization accuracy in experiments.
The average amount of time that it takes to analyze a single image with the cubic spline approach as a function of emitter density is shown in Fig. 4. Timing information for the elliptical Gaussian fitting (using the 3D-DAOSTORM algorithm) is provided for comparison. All timing measurements were performed on a laptop computer with an Intel Core i7 CPU running at 2 GHz. Simulated movies (41 um × 41 um field of view) were generated with emitters randomly located in x and y in a + −500 nm z range at densities of 0.03, 0.1 and 0.3 per um2 (41, 136 and 408 emitters respectively). The PSF that we used in these simulations was an elliptical Gaussian whose width in x and y varied as a function of z. To reduce the variance in the timing results, we increased the number of frames of the movies until they took at least 4 seconds to analyze regardless of the emitter density. As 3D-DAOSTORM and the cubic spline algorithm share most of the other components of their analysis pipeline, the timing differences reported are primarily due to differences in how long the localization fitting step takes. At lower emitter densities the overall analysis time is more dependent on how long the various non-fitting steps, such as reading the image from disk and background estimation, take, therefore the computation time of the two algorithms is similar. At higher emitter densities the fitting step takes a larger fraction of the total time, and hence the cubic spline approach takes a proportionally longer time. These results are also consistent with our estimates for the relative computation cost of the two approaches described earlier.
It is worth noting that the fitting time will depend on the number of pixels that the PSF covers. In the above test, we used a spline that covers 14 × 14 pixels in x and y. This is a reasonable size for most PSFs, but a larger size may be necessary for PSFs with a larger spatial extent or for microscope setups with a smaller pixel size. We found a size of between 12 and 16 pixels (in each dimension) was sufficiently large to work well for the PSFs that are typically used in single-molecule imaging and our pixel size of 160 nm.
An important advantage of the cubic spline approach is that it can be used to fit arbitrarily shaped PSFs, which makes it widely applicable to many different 3D localization methods using various engineered PSF shapes. Here, we demonstrate the versatility of the cubic spline approach by using it to analyze simulated data from two additional types of PSFs that have been used for super-resolution imaging, the double helix PSF 2 and the saddle-point PSF 3 . We generated two kinds of simulated movies using these PSFs, a high photon count movie for measuring the PSF and determining the spline and a low photon count movie similar to typical single-molecule localization conditions. The saddle-point PSFs were generated using the pupil function approach 10,11 , and we rotated the saddle-point PSF by 45 degrees relative to the orientation reported in ref. 3 so that the PSF stretches along the x and y axes. The double helix PSF was obtained from the 2016 SMLM localization challenge web-site 23 and spline interpolation was used to position it at arbitrary x, y and z values. In the movies simulating single molecules, we used 4000 photons per localization, a constant background of 100 photons, and a camera pixel size of 160 nm. The x, y and z errors were again determined as the average distance in x, y and z between the localizations determined by the fitting and their nearest ground-truth emitter positions and shown in Fig. 5. From these error plots, it is evident that the cubic spline approach can fit any of these PSFs with an accuracy that approaches the optimal accuracy as determined using the CRLB.
We note that the localization errors reported in this work are all determined by fitting the relatively ideal, simulated images of single molecules with the same photon numbers for each PSF. In real experiments, the photon loss associated with generating relatively complex PSFs, for example by using spatial light modulators or deformable mirrors, also needs to be taken into account when estimating the final localization error.

Discussion
In this work, we demonstrate that a cubic spline approach provides a 5% improvement in x and y localization accuracy and up to a 15% improvement in z localization accuracy relative to elliptical Gaussian fitting for a purely astigmatic PSF. Although the improvement in localization accuracy is moderate, the cubic spline approach has the additional advantage of being highly versatile, as it is easily adapted to fit any experimental realizable PSF. It is also a practical choice as it is only ~2-3x slower than a Gaussian fitting approach at typical emitter densities encountered in super-resolution imaging experiments. To make it easier for others to explore this approach to single-molecule localization analysis, our implementation of this algorithm, Docker images, instructions and an example of its use are available on Github 24 .