qSR: a quantitative super-resolution analysis tool reveals the cell-cycle dependent organization of RNA Polymerase I in live human cells

We present qSR, an analytical tool for the quantitative analysis of single-molecule-based super-resolution data. The software is created as an open-source platform integrating multiple algorithms for rigorous spatial and temporal characterization of protein clusters in super-resolution data of living cells. First, we illustrate qSR using sample live-cell data of RNA Polymerase II (Pol II) as an example of highly dynamic sub-diffractive clusters. Then we use qSR to investigate the organization and dynamics of endogenous RNA Polymerase I (Pol I) in live human cells throughout the cell cycle. Our analysis reveals a previously uncharacterized transient clustering of Pol I. Both stable and transient populations of Pol I clusters co-exist in individual living cells, and their relative fractions vary during the cell cycle in a manner correlating with global gene expression. Thus, qSR facilitates the study of protein organization and dynamics with very high spatial and temporal resolution directly in live cells.


Supplementary
as an increased rate of detections that begins with a delay after the beginning of imaging, and that abruptly terminates. (c) Illustration of a stable cluster. Stable clusters exhibit a high rate of detections from the beginning of acquisition, and reach an exponential saturation as the population of pre-converted molecules is depleted. Clusters that do not exhibit a delay but do display an abrupt termination (putatively transient clusters that assemble around the start of acquisition), or clusters starting with a delay after acquisition then gradually reaching an exponential saturation (putatively clusters that assembled dynamically after the start of acquisition then remained stable) were very rare in our experiments.

Section A: Installation Instructions
A.1: Installing as a stand-alone program (recommended)
For most users (including those without a MATLAB license), the simplest installation is through the stand-alone program, which can be downloaded from our GitHub code-sharing portal at www.github.com/cisselab/qSR. Users of the software are invited to ask questions, suggest features, and report bugs by logging issues on www.github.com/cisselab/qSR.

2. We provide installation tools, with a dialog box providing step-by-step guidance, for both Windows and Mac OSX operating systems. Download and run the appropriate qSR installer. We also provide sample data, ExampleData_SRL.csv.
Note: We advise copying the sample data from the downloads folder into a new folder. Otherwise, some users may experience issues accessing data in their downloads folder, particularly when running FastJet clustering.
For most users, the stand-alone program is sufficient. These users can proceed directly to Section B. Users who prefer to install from source, or to run and edit qSR within MATLAB, should proceed to Section A.2.
A.2: Installing from source code (alternative)
Alternatively, the source code is made available for users who own a valid MATLAB license. qSR is distributed as an open source platform for the community. Users with expertise are encouraged to contribute to, edit, or modify qSR as they see fit. We request that such upgrades be made available to the community by updating the qSR code directly on GitHub. We will monitor contributed updates and incorporate broadly appealing new features in future releases/upgrades.
The software makes use of the "Statistics and Machine Learning" Toolbox as well as the "Image Processing" Toolbox, so these will need to be installed to run the code. The software is available at www.github.com/cisselab/qSR. Most features of the software can be run after step 1 below. Continue to step 2 if you wish to run pair correlation analysis, and to step 3 if you wish to use FastJet hierarchical clustering for spatial clustering and automated ROI suggestion.
1. Navigate to https://github.com/cisselab/qSR/releases. This paper was drafted with release version v1.1.0. Download the source code zip file and unzip the program. Add the qSR folder, along with all subfolders, to your MATLAB path.

2. For the pair correlation analysis, our software makes use of code developed by Veatch et al. 1 The function get_autocorr is made available in the cited publication as file S1. The code should be downloaded, renamed as get_autocorr.m, and added to the qSR software folder.

3. To perform FastJet Hierarchical Clustering, the FastJet 2-4 code must first be compiled. qSR uses the fjcore distribution of FastJet version 3.2.0, a software package developed by the particle physics community for jet finding and analysis at colliders. FastJet is distributed as C++ source code.

3. Load a data set. A sample data set 7 is provided in the qSR/ExampleData folder. Set the size of the pixels. For the provided data set, each pixel is 160 nm wide.

5. To reproduce the super-resolution image shown in Figure 1c, click "Rendered Image".
6. Spatial clustering can be performed using one of two algorithms: DBSCAN 8 or FastJet 2-4 hierarchical clustering. To run DBSCAN, click "DBSCAN Clustering". The parameters used to generate Figure 1e were a Length Scale of 100 nm and a Minimum Points value of 8. Click "Run" to perform the analysis, and verify the clustering assignment. Clicking "Export Cluster Data" will export the clustering assignment and the calculated summary statistics; all clustering parameters are automatically saved in the metadata associated with the files. The output csv files may be too large to be opened in Excel. Figure 1f was produced using FastJet; for a detailed protocol, see section B.2.b below.
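As a rough illustration of what the DBSCAN step computes, the sketch below is a minimal pure-Python DBSCAN and not qSR's own MATLAB implementation. Here `eps_nm` and `min_pts` stand in for the Length Scale and Minimum Points parameters; the function name, the Euclidean distance, the convention that a point counts as its own neighbor, and the demo coordinates are all assumptions made for this example.

```python
import math


def dbscan(points, eps_nm, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    n = len(points)

    def neighbors(i):
        # Brute-force neighbor search; includes point i itself (assumed convention).
        xi, yi = points[i]
        return [j for j in range(n)
                if math.hypot(points[j][0] - xi, points[j][1] - yi) <= eps_nm]

    labels = [None] * n          # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1       # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # noise reached from a core point: border
            if labels[j] is not None:
                continue                 # already assigned to a cluster
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Hypothetical localizations in nanometers: two dense groups and one stray point.
demo_points = ([(10 * i, 0) for i in range(8)]
               + [(1000 + 10 * i, 1000) for i in range(10)]
               + [(5000, 5000)])
demo_labels = dbscan(demo_points, eps_nm=100, min_pts=8)
```

The brute-force neighbor search is quadratic in the number of localizations, so this form is only suitable for small regions; it is meant to show the role of the two parameters, not to reproduce Figure 1e.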

7. To perform temporal clustering analysis 9,10, click "ROI Selection / Analysis". Clicking "Select New ROI" allows the user to select clusters within the pointillist graph. "Delete Current ROI" will delete the currently selected ROI, and "Select and Delete ROIs" allows the user to draw a rectangle on the pointillist image around the ROIs that they would like to delete. The user can navigate between the ROIs by using the "Previous ROI" and "Next ROI" buttons. Clicking "Subsection Current ROI" allows the user to refine the boundaries of the current ROI by drawing a rectangle within the spatial axes in the top left corner of the Temporal Clustering window. To generate a clustering assignment, set a minimum cluster size, and drag the Dark Time Tolerance slider bar until the highlighted regions contain only those points that are both temporally and spatially clustered. Clicking "Apply to All ROIs" will propagate the specified parameters to all ROIs. Clicking "Export Cluster Data" will export the clustering assignment and the calculated summary statistics; all clustering parameters are automatically saved in the metadata associated with the files. The output csv files may be too large to be opened in Excel.
Supplemental Figure 4: Schematic Representation of the temporal cluster selection interface. Cluster assignments are determined by two parameters: Dark Time Tolerance and minimum cluster size. Clusters are generated by grouping the detections into contiguous regions separated by dark times longer than the specified tolerance. The resulting cluster assignments are shown for three example Dark Time Tolerances.
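The grouping rule described in this caption can be sketched in a few lines of Python. This is an illustrative sketch rather than qSR's code: the function name is hypothetical, and it assumes that the dark time between two successive detections is the number of frames with no detection between them.

```python
def temporal_clusters(frames, dark_time_tolerance, min_size):
    """Group detection frames of one ROI into bursts.

    Returns a list of (first_frame, last_frame, n_detections) for every
    burst that contains at least `min_size` detections.
    """
    bursts = []
    current = []
    for frame in sorted(frames):
        # Assumed definition: dark time = empty frames between two detections.
        if current and frame - current[-1] - 1 > dark_time_tolerance:
            bursts.append(current)   # gap too long: close the current burst
            current = []
        current.append(frame)
    if current:
        bursts.append(current)
    return [(b[0], b[-1], len(b)) for b in bursts if len(b) >= min_size]


# Hypothetical detection frames from a single ROI.
demo_frames = [1, 2, 3, 5, 50, 51, 100, 101, 102, 103, 104]
demo_short = temporal_clusters(demo_frames, dark_time_tolerance=5, min_size=3)
demo_long = temporal_clusters(demo_frames, dark_time_tolerance=50, min_size=3)
```

With the short tolerance the two-detection burst at frames 50 and 51 falls below the minimum size and is discarded, while the long tolerance merges everything into one cluster, which mirrors how moving the slider changes the highlighted regions.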

B.2: Additional Analysis Protocols
B.2.a: Pair Correlation Analysis
Pair correlation analysis 1,5,6 is a method to distinguish between localization patterns arising from single molecules and those arising from clustered proteins. The pair correlation tools in qSR allow the user to filter/preprocess the localization patterns, and generate the pair correlation function of selected regions. For more information on the null model, and how to statistically test for clustering, we refer the user to Sengupta et al. 5

2. Load Data and set the pixel size.

3. Set the Preprocess/Filter Data status to "Spatiotemporal Merging", set the Radius to the average localization precision (assumed 40 nm by default), and set the Dark Time Tolerance to 0 Frames. This will combine multiple consecutive localizations from the same single molecule into one localization to reduce overcounting.

4. Click "Pair Correlation", then click "Select Region", and select a large, homogeneous region (e.g., by avoiding the empty nucleoli in the provided data set).

5. Click "Run Analysis" and then "Export Results". The results of the analysis are saved within the sample data folder as a csv.
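The spatiotemporal merging used in the protocol above can be illustrated with the following sketch. It is one plausible reading of the step, not qSR's implementation: the function name, the greedy nearest-track assignment, the comparison against the last position of each track, and the definition of dark time as the number of empty frames between two localizations are all assumptions.

```python
import math


def merge_localizations(locs, radius_nm, dark_time_tolerance=0):
    """Merge repeated localizations of what is assumed to be one molecule.

    locs: iterable of (frame, x_nm, y_nm).
    Returns a list of (first_frame, mean_x_nm, mean_y_nm, n_merged).
    """
    tracks = []
    for frame, x, y in sorted(locs):
        best = None
        for track in tracks:
            dark = frame - track["last_frame"] - 1
            # dark < 0 means same frame: treated as a different molecule.
            if 0 <= dark <= dark_time_tolerance:
                dist = math.hypot(x - track["x"], y - track["y"])
                if dist <= radius_nm and (best is None or dist < best[0]):
                    best = (dist, track)
        if best is None:
            tracks.append({"first_frame": frame, "last_frame": frame,
                           "x": x, "y": y, "sum_x": x, "sum_y": y, "n": 1})
        else:
            track = best[1]
            track["last_frame"] = frame
            track["x"], track["y"] = x, y   # remember the latest position
            track["sum_x"] += x
            track["sum_y"] += y
            track["n"] += 1
    return [(t["first_frame"], t["sum_x"] / t["n"], t["sum_y"] / t["n"], t["n"])
            for t in tracks]


# Hypothetical localizations (frame, x in nm, y in nm).
demo_locs = [(1, 100, 100), (2, 110, 105), (3, 105, 95),
             (10, 100, 100), (1, 900, 900)]
demo_merged = merge_localizations(demo_locs, radius_nm=40, dark_time_tolerance=0)
```

In this example the three localizations in frames 1 to 3 collapse into one entry, while the localization in frame 10 at the same position stays separate because the intervening dark frames exceed a tolerance of 0.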

B.2.b: FastJet Hierarchical Clustering
FastJet 2 is a software package developed by the particle physics community for the analysis of collision data. This software is used extensively by all four major detectors at the Large Hadron Collider: ALICE, LHCb, ATLAS, and CMS. We make use of the Cambridge-Aachen algorithm 3,4 to generate a pairwise clustering tree. By running the Cambridge-Aachen algorithm with the boost-invariant pt recombination scheme, FastJet effectively computes a hierarchical clustering tree with centroid linkage, with an asymptotic runtime of O(N log N), where N is the number of localizations per cell. Creation of the tree requires no input parameters. Once the tree exists, the user chooses a length scale and a minimum cluster size, and the tree is cut accordingly to yield the clusters. Because building the tree and cutting it are separate steps, the user can assess many different clustering parameters without any substantial computational cost.
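The two-step idea (build the tree once, then cut it for any length scale) can be illustrated with a naive centroid-linkage sketch in Python. This is not FastJet or the Cambridge-Aachen implementation: it uses a brute-force closest-pair search rather than FastJet's N log N strategy, gives every localization equal weight, and all function names and demo coordinates are hypothetical.

```python
import math


def build_merge_tree(points):
    """Repeatedly merge the two closest centroids; record every merge.

    Returns a list of (id_a, id_b, distance, new_id) in merge order.
    """
    # Active clusters: id -> (centroid_x, centroid_y, number_of_points).
    active = {i: (x, y, 1) for i, (x, y) in enumerate(points)}
    merges = []
    next_id = len(points)
    while len(active) > 1:
        ids = sorted(active)
        best = None
        for pos, a in enumerate(ids):
            ax, ay, _ = active[a]
            for b in ids[pos + 1:]:
                bx, by, _ = active[b]
                dist = math.hypot(ax - bx, ay - by)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        ax, ay, an = active.pop(a)
        bx, by, bn = active.pop(b)
        total = an + bn
        # Centroid linkage: the merged cluster sits at the weighted mean position.
        active[next_id] = ((ax * an + bx * bn) / total,
                           (ay * an + by * bn) / total, total)
        merges.append((a, b, dist, next_id))
        next_id += 1
    return merges


def cut_tree(n_points, merges, length_scale, min_size):
    """Replay merges until one exceeds the length scale; keep large clusters."""
    members = {i: [i] for i in range(n_points)}
    for a, b, dist, new_id in merges:
        if dist > length_scale:
            break
        members[new_id] = members.pop(a) + members.pop(b)
    return sorted(sorted(m) for m in members.values() if len(m) >= min_size)


# Hypothetical localizations in nanometers.
demo_points = [(0, 0), (10, 0), (0, 10), (1000, 1000), (1010, 1000), (5000, 5000)]
demo_tree = build_merge_tree(demo_points)          # built once
demo_fine = cut_tree(len(demo_points), demo_tree, length_scale=50, min_size=2)
demo_coarse = cut_tree(len(demo_points), demo_tree, length_scale=2000, min_size=2)
```

Only `cut_tree` is rerun when the length scale or minimum size changes, which is the property that makes it cheap to try several parameter choices in succession.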

2. Load Data and set the pixel size.

3. Click "Hierarchical Clustering"; the clustering tree will be automatically generated.

4. Once the tree is created, different clustering length scales can be tried in quick succession. Set the desired Length Scale and Minimum Size, and click "Find Clusters". For the user's convenience, the software shows clusters for the specified length scale, as well as for a smaller and a larger length scale, to help the user more quickly determine an appropriate Length Scale.

5. Once an appropriate length scale has been set, click "Export Cluster Data" to export the clustering assignment and the calculated summary statistics; all clustering parameters are automatically saved in the metadata associated with the files. The output csv files may be too large to be opened in Excel. Click "Save ROIs" to use the FastJet cluster centers to create Regions of Interest for later temporal clustering. In order to uniquely assign points to ROIs, overlapping ROIs are merged. Densely overlapping ROIs may generate excessively large ROIs. To address this, the user is advised to either try a different set of clustering parameters, or to manually select ROIs in the problematic regions, using the "ROI Selection / Analysis" tool in the main qSR interface.
In single-molecule-based super-resolution microscopy, the dynamic assembly and disassembly of a cluster would appear as a transient increase in the rate of detections localized in a region of interest. Depending on the camera acquisition settings and the fluorophore photophysics, single molecules may give rise to multiple, intermittent localizations. Any analysis of protein clustering dynamics in super-resolution microscopy should therefore start by characterizing the intrinsic behavior of immobile single molecules. This can be achieved by chemical fixation, or by conjugating the fluorophore to an immobile protein, such as histone protein H2B. The single-molecule photophysics depends on the choice of fluorophore (for the provided data sets, we used Dendra2), the illumination intensities, and the local chemical environment, so control experiments should be performed using the same imaging settings as the primary data.
Along with the software, we provide a control dataset obtained by fixation of cells transiently transfected with free Dendra2. The survival plot (1-CDF, i.e. the complement of the cumulative distribution function) of the burst size is shown in Supplemental Figure 7. The measured mean burst size per Dendra2 molecule is 6.3 localizations; any given molecule has some probability of yielding a larger number of detections, with (two) rare events seen to yield as many as 150 localizations. When analyzing clusters, it is beneficial to set a minimum cluster size threshold to exclude individual molecules. For example, setting a minimum size of 24 counts excludes 95% of single molecules. It is important that reports of transient subdiffractive clusters be corroborated with biological controls, such as perturbations that change cluster dynamics in a functionally relevant manner (for Pol II, see serum induction or gene colocalization in Cho et al. 10).
qSR takes as input a list of super-resolution localizations generated by single molecule localization microscopy [11][12][13]. The software assumes that the data will be in four columns (frames, x position, y position, intensity) with the first row containing a header. To verify that the data is in this format, qSR checks that the filename ends in SRL.csv (e.g. cell1_SRL.csv).
Alternatively, data from QuickPALM 14 or ThunderSTORM 15 can also be used by selecting those options before clicking "Load Data" in the qSR user interface. ThunderSTORM data output is assumed to be a csv file with frame number in the first column, x position (in nanometers) in the second column, y position (in nanometers) in the third column, and intensity in the fourth column.
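For readers who want to inspect or prepare input files outside of qSR, the four-column SRL.csv layout described above can be read with a short Python sketch. The function names are hypothetical, and the sketch assumes that x and y are stored in camera pixels and converted to nanometers with the user-supplied pixel size; ThunderSTORM output, which is already in nanometers, would not need that scaling.

```python
import csv


def parse_srl_lines(lines, pixel_size_nm):
    """Parse SRL-style rows (frame, x, y, intensity) with one header row.

    `lines` is any iterable of text lines; x and y are assumed to be in
    pixels and are returned in nanometers.
    """
    reader = csv.reader(lines)
    next(reader)                      # skip the header row
    table = []
    for row in reader:
        if not row:
            continue                  # ignore blank lines
        frame, x_px, y_px, intensity = (float(value) for value in row[:4])
        table.append((int(frame), x_px * pixel_size_nm,
                      y_px * pixel_size_nm, intensity))
    return table


def load_srl(path, pixel_size_nm):
    """Load a localization file, mimicking the SRL.csv file-name check."""
    if not path.endswith("SRL.csv"):
        raise ValueError("expected a file name ending in SRL.csv")
    with open(path, newline="") as handle:
        return parse_srl_lines(handle, pixel_size_nm)


# Hypothetical file content; 160 nm is the pixel size of the provided data set.
demo_lines = ["frame,x,y,intensity", "1,10.5,20.0,350", "2,10.25,19.5,410"]
demo_table = parse_srl_lines(demo_lines, pixel_size_nm=160)
```

Calling `load_srl("cell1_SRL.csv", 160)` would apply the same parsing to a file on disk, and a name that does not end in SRL.csv is rejected before the file is opened.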
Section D: Software File Structure and Data Representation
qSR was built using MATLAB's GUIDE tool. The user interface can be modified by opening qSR.fig with GUIDE. The source code that determines how qSR functions is located in qSR.m. Callback functions that execute upon pressing buttons in the UI are grouped in qSR.m by the panel in which they are located. Data is stored in the GUI as fields in the handles structure. The raw x position data, for example, can be accessed by calling handles.XposRaw. All functions from panel 3 (Visualize) onward access only the filtered data, e.g. handles.fXpos, which is obtained by scaling to nanometers and applying the appropriate filters.
Buttons that call popup interfaces (pcPALM, DBSCAN, Hierarchical clustering, and Temporal Clustering) create new GUIs with separate handles variables. Data from the main GUI is accessible using mainObject and mainHandles.