Super-resolution modularity analysis shows polyhedral caveolin-1 oligomers combine to form scaffolds and caveolae

Caveolin-1 (Cav1), the coat protein for caveolae, also forms non-caveolar Cav1 scaffolds. Single molecule Cav1 super-resolution microscopy analysis previously identified caveolae and three distinct scaffold domains: smaller S1A and S1B scaffolds and larger hemispherical S2 scaffolds. Application here of network modularity analysis of SMLM data for endogenous Cav1 labeling in HeLa cells shows that small scaffolds combine to form larger scaffolds and caveolae. We find modules within Cav1 blobs by maximizing the intra-connectivity between Cav1 molecules within a module and minimizing the inter-connectivity between Cav1 molecules across modules, which is achieved via spectral decomposition of the localization adjacency matrix. Features of modules are then matched with those of intact blobs to measure the similarity between module-blob pairs of group centers. Our results show that smaller S1A and S1B scaffolds are made up of small polygons, that S1B scaffolds correspond to S1A scaffold dimers and that caveolae and hemispherical S2 scaffolds are complex, modular structures formed from S1B and S1A scaffolds, respectively. Polyhedral interactions of Cav1 oligomers, therefore, lead progressively to the formation of larger and more complex scaffold domains and the biogenesis of caveolae.

(2019) 9:9888 | https://doi.org/10.1038/s41598-019-46174-z www.nature.com/scientificreports
in 3D space that can then be used to reconstruct localizations with significantly improved resolution and has been applied to distinguish invaginated and flattened caveolae based on Cav1 density in clusters 21 . An alternate approach to study point distributions is to visualize them as a graph or network. Graphs are mathematical structures used to model interactions between entities for many systems, with the entities represented as graph nodes and the connections between them as edges 22 . Real world graphs are frequently complex networks that have many different subgraphs or modules 23 . Networks with high modularity have dense connections (edges) between the nodes within modules (sub-networks) and sparse connections between nodes in different modules. The optimization problem of finding divisions within a network (i.e. modules or communities) has been solved via various methods such as normalized-cut graph partitioning and spectral algorithms 24-26 . Network and subgraph (module) analysis are therefore ideally suited to define molecular and subgroup organization between labeled molecules within the 3D SMLM point cloud of macromolecular complexes.
Previously, network analysis of SMLM Cav1 data sets in PC3 prostate cancer cells 27 , that express Cav1 but no CAVIN1 and therefore no caveolae 1 , identified two classes of Cav1 scaffolds corresponding to small Cav1 homo-oligomers (S1 scaffolds) that correspond to 8S Cav1 oligomers 14,28 , as well as larger hemispherical S2 scaffolds. The formation of curved Cav1 structures in the absence of CAVIN1 is consistent with Cav1 induction of invaginated h-caveolae in bacteria and supports a role for Cav1 in membrane curvature 5,29 . To identify signatures for caveolae, we compared PC3 prostate cancer cells that lack caveolae with PC3 cells transfected with the CAVIN1 adaptor required for caveolae formation. Larger hollow caveolae were only detected upon transfection of PC3 cells with CAVIN1 (PC3-PTRF cells) 27 and their modular nature supported the polyhedral Cav1 coat structure observed by cryoEM 3,4 .
We now process (using spectral decomposition) the array of distances between localizations to find modules in endogenous Cav1 domains of HeLa cells. To determine the relationship between Cav1 scaffold domains and the multimeric caveolae structure, we leveraged a multi-threshold modularity analysis to extract the modules of the various Cav1 blobs. We then matched the blobs with the sub-modules of the various Cav1 domains based on the similarity (i.e., smaller Euclidean or L2 norm) between their biosignatures. To enhance localization precision, we used an SMLM microscope equipped with real-time nanometer-scale drift correction hardware 30 and included only high-precision localizations to improve localization accuracy 31,32 . With this in-house built SMLM microscope, the localization precision approaches 10 nm 33 , and the drift is limited to 1 nm in the x-y plane and 3 nm in the z axis 30,34 . Modularity analysis and group matching show that S1A scaffolds can dimerize to form S1B scaffolds and oligomerize to form hemispherical S2 scaffolds. S1B scaffolds match the modules that make up the caveolae coat suggesting that the caveolae coat is built progressively by dimerization of S1A scaffolds, composed of the basic polygonal Cav1 units, that then combine to form a polyhedral caveolae coat.

Results and Discussion
Tunable iterative merge algorithm. A major challenge to determining molecular structure by SMLM is defining molecular localizations (i.e. the location of the labeled molecule, in this case Cav1) from the millions of blinks (i.e. 3D spatial coordinates of the labelled Cav1 fluorescent events) generated from the stochastic blinking detected by SMLM. Many blinks derive from the same labeled molecule, particularly when the labeling approach is based on antibody labelling (i.e. dSTORM). The same fluorophore can blink twice in succeeding acquisition frames dependent on the on-off duty cycle 35 and the same molecule can be labeled by different fluorophores, either on the same secondary antibody or on different secondaries bound to the same target protein, introducing error in blink localization relative to the actual antigen. Each of these blink localizations is in addition subject to localization error due to drift and to Gaussian fitting of the PSF. Multiple, distinct blink localizations, therefore, derive from the same molecule and generate a dense non-biological network (NBN) with high degree nodes centred around the actual molecule 27,36 . Network analysis of the biological network, composed of nodes corresponding to predicted molecular localizations of the labeled proteins, requires reduction/consolidation of the NBNs.
Several methods have been proposed to reduce this artifact using temporal or spatial fluorophore information. Annibale et al. 37,38 proposed a method to correct for multiple blinking in PALM by determining the merging time for the mEos2 photoactivatable fluorescent protein. Other methods 39-41 spatially merged nearby localization events. Here, to correct for multiple blinking and estimate molecular localization, we adopt the iterative merging algorithm of Khater et al. 27 , which iteratively merges nearby nodes (blinks) that are within a threshold merging distance until convergence is reached. The process starts with the nodes of high network degree and continues until the distance between every pair of reconstructed nodes, which correspond to predicted or estimated molecular localizations, is at least the threshold merging distance. Nodes in closest proximity are combined first, such that merging is initiated within the dense NBNs and continues progressively until no nodes within the point cloud are closer than the merging proximity threshold (MPT).
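The merging step described above can be illustrated with a minimal sketch. It is not the implementation of Khater et al. 27 : the function name `iterative_merge`, the blink-count-weighted centroid used for the merged node, and the brute-force closest-pair search are all assumptions made for illustration, and the ordering by network degree is omitted (only the closest-pair-first rule and the MPT stopping criterion are kept).

```python
import math

def iterative_merge(points, mpt):
    """Merge the closest pair of nodes until no pair is closer than `mpt`.

    points: iterable of (x, y, z) coordinates in nm (blinks).
    mpt:    merge proximity threshold in nm.
    Assumption: a merged node sits at the centroid of the blinks it
    absorbed (weighted by how many blinks each node already holds).
    """
    # Each node is (position, number_of_blinks_absorbed).
    nodes = [(tuple(p), 1) for p in points]
    while len(nodes) > 1:
        # Brute-force closest pair: O(n^2) per pass, acceptable for a
        # sketch but far too slow for millions of blinks.
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = math.dist(nodes[i][0], nodes[j][0])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d >= mpt:
            break  # converged: every remaining pair is at least mpt apart
        (p, wp), (q, wq) = nodes[i], nodes[j]
        merged = tuple((a * wp + b * wq) / (wp + wq) for a, b in zip(p, q))
        nodes.pop(j)  # j > i, so index i is still valid
        nodes[i] = (merged, wp + wq)
    return [position for position, _ in nodes]

# Toy input: two tight groups of blinks and one isolated blink (made-up values).
blinks = [(0, 0, 0), (5, 0, 0), (100, 0, 0), (104, 0, 0), (300, 0, 0)]
print(iterative_merge(blinks, mpt=19))
```

With an MPT of 19 nm the two tight groups each collapse to a single predicted localization and the isolated blink is left untouched. A production version would need a spatial index (for example a k-d tree) to avoid the quadratic search.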
3D point clouds of Cav1 from 10 dSTORM images of HeLa cells were processed using the 3D SMLM Network Analysis computational pipeline 27 . To address the multiple-blinking artifact that may bias the quantification process, we applied a tunable MPT from 10-20 nm in steps of 1 nm. Importantly, 4 classes were learnt at each MPT from 10-20 nm. Further, tuning the MPT from 10-20 nm minimally impacted classification, size, modularity, characteristic path and hollowness of all 4 classes of blobs (Fig. 1A). Machine learning blob classification is therefore independent of the merge algorithm for MPTs from 10-20 nm. Not unexpectedly, increasing the MPT reduced the predicted molecular localization number per blob. We set the MPT based on the reported 145 Cav1 proteins per caveola 42 . An MPT of 19 nm resulted in an average of 142 localizations for the largest H2 blobs (Fig. 1A), which match the PP2 caveolae blobs from PC3-PTRF cells (see Fig. 2A). Figure 1B shows the 3D point cloud of one of the HeLa cells in our dataset at various stages of the pipeline: 1) The 3D point cloud of a Cav1-labeled HeLa cell generated by real-time drift control SMLM; 2) After iterative merging and denoising filtration. The denoising module visits every Cav1 event and predicts whether it is signal or noise. This prediction is based on examining the network features for every Cav1 event in our data, as well as the corresponding network features of nodes in a random network. If the network features of a Cav1 event are similar to those of the random network's nodes, then that Cav1 event is declared noise and removed. This denoising process retains the Cav1 clusters (blobs) and filters out noisy localizations and monomeric Cav1; 3) After segmentation into separate blobs and extraction of a 28-feature descriptor vector for every blob; 4) After unsupervised machine learning to learn the various Cav1 domains from the extracted blobs and their descriptor features.
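The idea behind the denoising step (compare each event's network features with those of nodes in a random network) can be sketched as follows. This is a heavily simplified illustration, not the pipeline's denoising module: it assumes a single network feature (node degree at a fixed proximity threshold), a uniformly random reference network drawn in the bounding box of the data, and a quantile cutoff. The names `degrees` and `denoise` and the parameters `pt`, `quantile` and `seed` are hypothetical.

```python
import numpy as np

def degrees(points, pt):
    """Number of neighbours within `pt` nm of each point (self excluded)."""
    d = np.linalg.norm(points[:, None] - points[None], axis=-1)
    return (d <= pt).sum(axis=1) - 1

def denoise(points, pt, quantile=0.95, seed=0):
    """Keep events whose degree exceeds what a random network produces.

    Assumption: an event is 'noise' if its degree is no larger than the
    `quantile`-th quantile of degrees in a uniformly random point set
    with the same number of points and the same bounding box.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    lo, hi = points.min(axis=0), points.max(axis=0)
    random_points = rng.uniform(lo, hi, size=points.shape)
    cutoff = np.quantile(degrees(random_points, pt), quantile)
    return points[degrees(points, pt) > cutoff]

# Toy data (made up): a compact 3 x 3 x 3 cluster plus 8 isolated events.
cluster = [(500 + dx, 500 + dy, 500 + dz)
           for dx in (-5, 0, 5) for dy in (-5, 0, 5) for dz in (-5, 0, 5)]
isolated = [(x, y, z) for x in (0, 1000) for y in (0, 1000) for z in (0, 1000)]
kept = denoise(cluster + isolated, pt=50.0)
print(len(kept), "of", len(cluster) + len(isolated), "events kept")
```

In this toy case the isolated events have no neighbours and are discarded, while the clustered events are retained. The actual pipeline examines a set of network features per event rather than degree alone.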
Group matching. Machine learning identified four groups of Cav1 domains (H1, H2, H3, and H4) in HeLa cells (Fig. 1A). We used the Euclidean distance in 28 dimensions to encode similarity of HeLa groups with groups previously identified in PC3 and PC3-PTRF cells 27 , with similarity proportional to the inverse Euclidean distance. As seen in Fig. 2A, for the groups with larger blobs, H2 matches PP2, corresponding to caveolae, and H1 matches PP1, corresponding to the larger hemispherical S2 scaffolds. For the smaller S1 scaffolds, H4 matches PP3 and H3 matches PP4. Distribution of the different classes of blobs in the different cell types shows that HeLa and PC3-PTRF cells present a similar distribution of Cav1 blobs with slightly more caveolae detected in HeLa cells (Fig. 2B).
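Group matching by Euclidean distance between group centres can be sketched as below. The sketch assumes each group is summarized by the mean of its feature vectors and that features are already on comparable scales; the function name `match_groups`, the group labels and the 3-dimensional toy vectors are hypothetical stand-ins for the 28-dimensional descriptors and are not values from this study.

```python
import math

def match_groups(query_centers, reference_centers):
    """Map each query group to the reference group with the nearest centre.

    Both arguments are dicts of {group_name: feature_vector}. Similarity is
    taken as inversely related to Euclidean distance, so the best match is
    the reference centre at minimum distance.
    """
    matches = {}
    for q_name, q_vec in query_centers.items():
        matches[q_name] = min(
            reference_centers,
            key=lambda r_name: math.dist(q_vec, reference_centers[r_name]),
        )
    return matches

# Made-up 3-feature centres for two query and two reference groups.
query = {"G1": (60.0, 0.30, 0.20), "G2": (140.0, 0.45, 0.10)}
reference = {"R1": (55.0, 0.28, 0.22), "R2": (150.0, 0.42, 0.09)}
print(match_groups(query, reference))
```

Because raw Euclidean distance is dominated by features with large numeric ranges, some form of feature scaling would normally precede the matching; whether and how the pipeline scales features is not stated here.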
Feature analysis after group matching shows that the four HeLa groups closely match the four PC3-PTRF groups as well as the S2 and S1A scaffolds present in PC3 cells (Fig. 2C; see also Supp. Fig. S1 for additional data on blob features). Relative to the PC3 data 27 , we observed a doubling in molecular localizations for S1B scaffolds relative to S1A scaffolds and increased modularity of S1B scaffolds in the HeLa data set that we attribute to the improved resolution obtained with the real-time drift control SMLM 30,34 . Increased Cav1 localization number in S1B scaffolds parallels the increased size (X-range) and reduced network density of these clusters relative to S1A scaffolds, reflecting differences between these structures that led to their classification as distinct cluster groups in this and our previous analysis 27 . Indeed, we observe a progressive increase in the number of localizations (Caveolae > S2 scaffolds > S1B scaffolds > S1A scaffolds) associated with increased modularity and decreased network density (Fig. 2C). Caveolae are the most modular structures (modularity >0.4), followed by S2 and then S1B scaffolds. S1A scaffolds have the least tendency to form modules (modularity <0.04).
Small Cav1 S1 scaffolds combine to build larger scaffolds and caveolae. The variable modularity of the different classes of Cav1 blobs led us to extract the blobs' modules and study their features. To make sure that we correctly construct the modules within the blob's network, we used multi-proximity threshold (PT) network analysis (Fig. 3A) to decompose the blobs' networks into modules using spectral analysis. For all the groups, any PT greater than 60 nm renders every blob a single connected component; the average connected component size plateaus and equals the blob size for the different groups at PTs greater than 60 nm. This indicates that at PT >60 nm, every post-merge Cav1 localization in a cluster is linked to the rest of the cluster through a chain of neighbours that are each within the PT of one another. The number of modules and of Cav1 localizations per module are stable across the PT range from 60 to 170 nm. This range is therefore suitable to determine the number of modules and of Cav1 localizations per module. HeLa caveolae were found to be highly modular, containing 6-7 modules of ~29 Cav1 localizations, S2 scaffolds 5 modules of ~14 localizations and S1B scaffolds ~4 modules of 7-8 localizations each (Fig. 3A). S1A scaffolds have the lowest average number of modules, ~2 modules per blob of ~5-6 localizations each.
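The connectivity check underlying the choice of PT range can be sketched as follows: build the network of a blob at each PT and count its connected components, then take the smallest PT at which the blob becomes one component. The sketch assumes an edge is placed between two localizations when their distance is at most the PT; the function names and the toy coordinates are hypothetical.

```python
import math

def count_components(points, pt):
    """Number of connected components of the network built at threshold `pt`."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= pt:
                parent[find(i)] = find(j)  # join the two components
    return len({find(i) for i in range(len(points))})

def smallest_connecting_pt(points, thresholds):
    """Smallest threshold at which the blob is a single component, else None."""
    for pt in sorted(thresholds):
        if count_components(points, pt) == 1:
            return pt
    return None

# Toy blob (made-up coordinates, in nm): two pairs of localizations 50 nm apart.
blob = [(0, 0, 0), (30, 0, 0), (80, 0, 0), (110, 0, 0)]
for pt in (20, 40, 60, 80):
    print(pt, count_components(blob, pt))
```

For this toy blob the network is fragmented at 20 and 40 nm and becomes a single component from 60 nm onward, which mirrors the plateau criterion used to select the PT range for module extraction.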
Visualization of blobs from the identified groups (Fig. 3B) highlights the modular nature of the various Cav1 structures. At 80 nm, each blob forms one connected component network and extracted modules for every blob are shown in different colors. The presence of small modules (~5-8 molecules) within both S1A and S1B scaffolds is indicative of an additional degree of suborganization within these small scaffold domains. 3D cryoEM tomography identified a network of 3-way junctions and polygonal arrangements of Cav1 protein densities within the caveolae coat 3 . Similarly, cryoEM analysis of Cav1-induced vesicles in bacteria (h-caveolae) present distinct polygonal repeating units on the h-caveolae cage 5 . We propose that the sub-modules that we detect in S1A and S1B scaffolds correspond to these polygonal repeating units that comprise the caveolae coat. The fact that S1A scaffolds form one connected component unit and that the number of localizations of S1A scaffolds matches that of Cav1 homo-oligomers (~14-15 Cav1s) 14,28 suggests that interaction between these polygonal sub-modules forms more stable structural units. This is supported by the identification of larger modules in both S2 scaffolds and caveolae (Figs 3A and 4A).
Most interestingly, the decomposed modules from the different Cav1 cluster groups show a much higher degree of similarity in terms of the shape, topology and network features than the clusters from which they originate (Fig. 4A). For instance, while Cav1 blob classes show a progressive reduction in network density from S1A scaffolds to caveolae, modules from the different blob classes show a similar network density. This suggests that differential interaction between modules is responsible for the changes in network density of the different classes of Cav1 blobs and that these modules form fundamental building blocks of larger Cav1 structures.
Indeed, many features of caveolae modules match S1B scaffolds while S2 and S1B modules match S1A scaffolds. For instance, number of localizations, hollowness, characteristic path, modularity, size, and network density of the caveolae modules (blue bars to the right of graphs) are very similar to their corresponding features in the S1B blobs (magenta bars to left) (Fig. 4A). We quantitatively assessed module-blob similarity across all features using the matching matrix of the features for the various blobs and modules group center using Euclidean distance (Fig. 4B). The column-wise (i.e. the modules) similarity shows that: S2 scaffold modules match S1A blobs; caveolae modules match S1B blobs; and S1B modules match S1A blobs. The close matching of S1B modules with S1A blobs and doubling in the number of modules and localizations of S1B modules relative to S1A blobs suggests that S1B scaffolds represent dimers of S1A scaffolds. Further, PC3 cells that lack caveolae have only S1A and S2 scaffolds (Fig. 2B,C) 27 supporting the matching between S1A blobs and S2 modules reported here. The dissimilarity between caveolae and S2 scaffolds and the modules of any other blob types suggests that these are complex structures made up of primitive S1A and S1B scaffolds.
Overall, our data support a model in which Cav1 is organized into smaller units of 5-8 Cav1 localizations that correspond to the polygonal base units observed by cryoEM analysis of the Cav1 caveolar coat 3,5 . These base units combine to form larger stable structures of which the smallest is S1A scaffolds, that we propose correspond to the previously identified ~14-15 Cav1 homo-oligomers 14,28 . We also identify S1B scaffolds, previously classified as distinct from S1A scaffolds 27 as larger structures that may correspond to S1A dimers. Modularity analysis and group matching show that S1A scaffolds combine to form both S1B dimers and the larger hemispherical S2 scaffold structures. Caveolae modules show better matching and correspond in size to S1B and not S1A modules suggesting that caveolae formation may be a two-step process in which S1A scaffolds first combine to form dimers that then interact to form the caveolae coat (Fig. 5, Video S1). Consistent with a role for S1B in caveolae formation, PC3 cells that lack caveolae and CAVIN1 do not contain S1B scaffolds 27 . As cluster size increases, Cav1 domains show a more pronounced reduction in density relative to their constituent modules. This suggests that interaction between smaller S1 scaffolds to form larger structures, including caveolae, is associated with changes in how modules interact and are organized. Importantly, our analysis based on TIRF microscopy argues that all 4 Cav1 domains, from S1A scaffolds to caveolae are present at the plasma membrane. Based on the role of caveolae as membrane buffers that flatten in response to mechanical stretching 43 , we suggest that these modular interactions are dynamic and reversible.
In this work, we applied multi-threshold modularity analysis to networks/blobs constructed from 3D point clouds of Cav1 localizations acquired via SMLM. Classification of endogenous Cav1 domains in HeLa cells matched those previously identified in CAVIN1-transfected PC3 prostate cancer cells 27 . Spectral decomposition allowed us to define the relationship between the different Cav1 blobs and their extracted sub-networks/modules via biosignature similarity and matching across the various domains/groups. This approach is applicable for modular analysis of other oligomeric macromolecular biological structures improving our understanding of their architecture by disassembling them into their basic building components.

Materials and Methods
Cell culture and immunofluorescent labeling. HeLa cells were tested for mycoplasma by PCR (Applied Biomaterial, Vancouver, BC, Canada) and cultured in Dulbecco's Modified Eagle's medium (DMEM; Invitrogen) containing 10% fetal bovine serum (Invitrogen). For SMLM imaging, cells were plated on fibronectin coated coverslips (No. 1.5 H) for 24 h prior to fixation with 3% paraformaldehyde (PFA) for 15 min at room temperature. Fixed cells were rinsed with PBS/CM (phosphate buffered saline complemented with 1 mM MgCl₂ and 0.1 mM CaCl₂), permeabilized with 0.2% Triton X-100 in PBS/CM, and blocked with 10% goat serum and 1% bovine serum albumin (BSA; Sigma-Aldrich Inc.) in PBS/CM before incubation with rabbit anti-caveolin-1 (BD Transduction Inc.) for 12 h at 4 °C and then Alexa Fluor 647-conjugated goat anti-rabbit (Thermo-Fisher Scientific Inc.) for 1 h at room temperature. Primary and secondary antibodies, at saturating concentrations, were diluted in SSC (saline sodium citrate) buffer containing 1% BSA, 2% goat serum and 0.05% Triton X-100. Cells were washed extensively after each antibody incubation with SSC buffer containing 0.05% Triton X-100 and post-fixed using 3% PFA for 15 min followed by extensive washing with PBS/CM. Near-infrared fiducial markers (diameter 100 nm; Thermo Fisher Scientific) were added for real-time drift correction. Immediately prior to imaging, cells were mounted and sealed on glass depression slides in freshly prepared imaging buffer (10% glucose, 0.5 mg/mL glucose oxidase, 40 μg/mL catalase, 50 mM Tris, 10 mM NaCl and 50 mM β-mercaptoethylamine (Sigma-Aldrich Inc.) in double-distilled water) 20,35 .
SMLM Imaging. Imaging of HeLa cells was performed on an in-house built SMLM system equipped with an apochromatic TIRF oil immersion objective lens (60×/1.49; Nikon Instruments) and a real-time drift correction system which limits the lateral drift to ~1 nm and the axial drift to ~3 nm. A 639 nm laser line (Genesis MX639, Coherent Inc., USA) was used to excite Alexa Fluor 647 fluorophores and near-infrared fiducial markers. A 405 nm laser line (Laserglow Technologies) was used to activate Alexa Fluor 647. The detailed optical setup and the imaging acquisition procedure were described previously 30,34 .
The dataset used in this work consists of 10 fields of view (FOV) of Cav1-labeled HeLa cells. Each HeLa FOV is 54 × 54 × 1 µm³, which is 9 times larger than the FOV in the PC3 cell study (18 × 18 × 1 µm³), acquired using a Leica GSDIM microscope equipped with a 160× objective 27 . Each HeLa FOV therefore included multiple cells and this study analyzed a larger number of cells than the PC3 study. We collected 40,000 frames per super-resolution image. The total number of collected localizations that we processed per image ranged from 1.6 to 6.7 million. The 3D SMLM Network Analysis method was able to process the whole FOV. Details of the 3D SMLM Network Analysis approach can be found in 27 .
Figure 4 caption (beginning missing): … shows that some module features are similar to blob features. For example, the right bars that represent the caveolae modules (blue) are very similar to the left bars that represent the S1B blobs (magenta). (B) We extracted 28 features (e.g. shape, topology, hollowness, network) for every blob and module. The table encodes the module-blob similarity between the different Cav1 domains (blobs) and the modules of each type as Euclidean distances between every pair of group centres.
Multi-proximity threshold network modularity analysis. For every blob of 3D localizations, more than one network can be constructed, one for each proximity threshold in the set $\{PT_1, PT_2, \ldots, PT_T\}$; i.e. blob $i$ has $T$ networks $\{G_i^1, G_i^2, \ldots, G_i^T\}$, where $G_i^t = (V_i, E_i^t)$ is composed of a set of nodes $V_i$ and a set of edges $E_i^t$. $V_i$, unaffected by $PT_t$, represents the molecules of blob $i$, and $E_i^t$ is the set of edges connecting all pairs of molecules interacting within $PT_t$ nm.
We leverage a spectral decomposition algorithm to find modules within Cav1 blobs. Given $G_i^t$, the network of blob $i$ at $PT_t$, we find its modules (communities) using the Newman method 24-26 . Specifically, we first calculate an adjacency matrix whose element $(m, n)$ encodes the distance between the $m$-th and $n$-th localizations. A spectral decomposition method calculates the eigenvector representation of this adjacency matrix. This eigendecomposition defines the modules as it maximizes the intra-connectivity between Cav1 molecules within a module and minimizes inter-connectivity between Cav1 molecules across modules.
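One step of a spectral community split can be sketched as follows. The sketch assumes Newman's leading-eigenvector formulation, in which a network is bisected according to the signs of the leading eigenvector of the modularity matrix $B = A - kk^{T}/(2m)$, and it assumes a binary adjacency matrix obtained by thresholding pairwise distances at the PT. The pipeline's own matrix construction and its handling of more than two modules may differ; the function names and the toy coordinates are hypothetical.

```python
import numpy as np

def adjacency(points, pt):
    """Binary adjacency: 1 where two localizations are within `pt` nm."""
    pts = np.asarray(points, dtype=float)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    a = (d <= pt).astype(float)
    np.fill_diagonal(a, 0.0)  # no self-loops
    return a

def spectral_bisection(a):
    """Split a connected network in two via the modularity matrix.

    Returns (s, q): s[i] is +1 or -1 (module membership) and q is the
    modularity of that two-way split, q = s^T B s / (4m).
    """
    k = a.sum(axis=1)              # node degrees
    two_m = k.sum()                # 2m = sum of degrees
    b = a - np.outer(k, k) / two_m
    eigvals, eigvecs = np.linalg.eigh(b)   # eigenvalues in ascending order
    lead = eigvecs[:, -1]                  # leading eigenvector
    s = np.where(lead >= 0, 1.0, -1.0)
    q = float(s @ b @ s) / (2.0 * two_m)
    return s, q

# Toy blob (made-up coordinates, nm): two compact groups joined by two edges.
blob = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (10, 10, 0),
        (45, 0, 0), (55, 0, 0), (45, 10, 0), (55, 10, 0)]
membership, q = spectral_bisection(adjacency(blob, pt=36.0))
print(membership, round(q, 3))
```

Repeating the bisection on each resulting module, and stopping when a further split no longer increases modularity, yields a full decomposition into modules; that recursion is omitted here to keep the example short.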
Given $G_i^t$, the network of blob $i$ at $PT_t$, we find the optimal number of modules (communities) using eigenvectors of the network adjacency matrix. At small PTs, the molecules of a blob might not form one connected network (i.e. the network might consist of more than one connected component). A blob network containing non-dense and non-connected regions cannot be used to extract modules (i.e. as per definition, networks with high modularity have dense connections between the nodes within modules and sparse connections between nodes in different modules). Hence, PTs that generate networks with more than one connected component should be avoided when extracting the modules.
Feature extraction and module-blob similarity. For every segmented Cav1 blob, we extracted 28 features that are then used to group the blobs into classes using the 3D SMLM Network Analysis pipeline 27 . The learned classes from the HeLa dataset are then matched with the previously identified Cav1 domains from the PC3 and PC3-PTRF datasets 27 . The modules of the HeLa Cav1 blobs are then extracted using the multi-proximity threshold network modularity analysis described in the previous subsection. We extracted 28 features for every module. To find the similarity/dissimilarity among the extracted modules and the various blobs, we leveraged the matching analysis to match blob modules with intact blobs using the Euclidean distance of group centers.
Figure 5. Modular interaction of Cav1 S1A scaffolds forms larger scaffolds and caveolae. Based on the module-blob matching results (Fig. 4B), S1A blobs are stable primitive structures (simplex) that are used to build up more complex, modular S1B and S2 scaffolds. S1B scaffolds correspond to S1A dimers and are used to build the caveolae coat complex (see Video S1). The figure also shows the hemispherical shape of S2 blobs and the hollow caveolae blobs.