Introduction

DNA compartmentalization into the nucleus allows tight regulation of gene expression in eukaryotes. Transport between the nucleus and cytoplasm occurs solely through nuclear pore complexes (NPCs), which span the nuclear envelope. The NPC, constructed from about 30 different nucleoporin protein subunits, permits free bi-directional flow of ions and small macromolecules (<45 kDa) by passive diffusion, while larger protein cargos are transported by karyopherin family members, termed importins and exportins. For tight control of gene expression in the nucleus, the chromatin is arranged in specific chromosomal territories1, and several discrete and distinct sub-nuclear domains form to serve distinct functions2,3. Examples of such domains include the nuclear lamina4, the nucleolus5, Cajal bodies6, PML bodies7 and nuclear speckles8,9. Many protein components of sub-nuclear domains have been identified through co-localization studies and whole genome screening for GFP-fusion proteins which form intra-nuclear foci10. While not separated by membranes, the constituents of these domains differ and can be dynamically associated through exchange of components. In this study, we focus on understanding how regulated access to the nucleus affects formation of paraspeckles11.

Paraspeckles are a distinct nuclear domain built around the long non-coding RNA, nuclear paraspeckle assembly transcript 1 (NEAT1), formerly known as nuclear enriched abundant transcript 1. The NEAT1 transcript acts as a scaffold for recruitment and assembly of other paraspeckle components12,13,14,15,16. Three core Drosophila behaviour, human splicing (DBHS) paraspeckle proteins were initially identified11: paraspeckle protein 1 (PSPC1), splicing factor proline/glutamine rich (SFPQ, also named PSF and REP1) and the non-POU-domain-containing, octamer binding protein (NONO, also named NRB54 and P54NRB). The expanding number of proteins identified to localize to paraspeckles10,17 reflects data from a recent study mapping interactions between paraspeckle components18. Such evidence highlights the complex nature of this domain and may be used to understand how paraspeckles are assembled.

The cellular functions of paraspeckles are still being discerned. Thus far they have been shown to influence translation, through nuclear retention of A-to-I edited RNA transcripts19 and by the sequestration of proteins20. The finding that NEAT1−/− mouse embryonic fibroblasts are more sensitive to proteasome inhibitor-induced apoptosis than their wildtype counterparts20 was interpreted as indicative of an influence of paraspeckles on cellular survival. This was supported by further evidence from various types of cancer, including breast21, colorectal22,23, glioma24, leukemia25,26, liver27, lung28,29,30,31 and prostate32 that correlate NEAT1 levels with either patient prognosis or cell behaviour. NEAT1−/− mice lack paraspeckles33 but exhibit limited phenotypic defects restricted to mammary gland development34 and corpus luteum formation, resulting in female subfertility35. These contributions to normal and pathological cell activities highlight the value of learning how paraspeckle formation is governed.

Nucleocytoplasmic trafficking is of central importance to nuclear functions. Active nuclear import and export are facilitated by the karyopherin family proteins, comprising importins and exportins, which bind and transport proteins containing nuclear localization signals (NLSs) or nuclear export signals (NESs), respectively. Both importin αs (IMPαs) and importin βs (IMPβs) facilitate nuclear import. The mouse genome encodes six IMPαs and ~twenty karyopherin β family members, each with individual cargo-binding specificities36,37,38,39. In this study, we use the mouse nomenclature, in which each IMPα is a product of its corresponding KPNA gene (e.g. IMPα2 encoded by KPNA2), as previously described40. IMPβs can form functional transport complexes in the cytoplasm by binding directly to an NLS-containing cargo protein, while IMPαs typically bind to both the cytoplasmic NLS-containing cargo and to IMPβ1, through an importin beta binding (IBB) domain. These complexes move through the NPC via transient interactions between IMPβs and the nucleoporins that line the NPC inner channel. Within the nucleus, high RAN-GTP levels effect cargo release by binding IMPβ to cause complex dissociation. Conversely, exportins require RAN-GTP to bind and transport NES-containing nuclear-localized cargoes; the export complex dissociates in the cytoplasm following RAN-GTP hydrolysis into RAN-GDP. In addition, some instances of cargo binding to the C-terminal acidic region of IMPα, rather than to its NLS binding groove, can mediate cargo retention in the nucleus41,42,43. Such retained cargoes are imported into the nucleus by different IMPs when there is a shift in the intracellular stoichiometry of IMPs41,42. These findings, from analysis of differentiating embryonic stem cells, demonstrate that regulated nucleocytoplasmic transport is a developmental gatekeeper.
Spatiotemporal expression of individual importins and exportins appears to be tightly regulated during development and differentiation of embryonic stem cells41,44,45, muscle46,47 and germline cells40,43,44,48,49,50,51, although the mechanistic basis for this is largely unknown. Thus, an emerging concept in importin biology is that regulated synthesis of nucleocytoplasmic machinery mediates cellular differentiation, with individual IMPs controlling nuclear access of proteins to determine each cell’s transcriptional activity.

PSPC1 (a core DBHS paraspeckle protein) was identified as an IMPα2-interacting cargo protein in the mouse testis at the time of germline sex determination49. This binding relationship is highly likely to be of functional relevance for spermatocytes (meiotic germ cells) and spermatids (haploid germ cells) in the adult testis, as each contains abundant PSPC152 but different amounts of each of IMPα2, IMPα3 and IMPα443. We hypothesized that changes in the stoichiometry of individual IMPα proteins are important for cellular differentiation, including during spermatogenesis, and set out to devise a strategy to address this. Our previous work employed HeLa cells, which have been widely used to study the functional outcomes of manipulating importin levels and functionality. By detecting endogenous PSPC1 using immunofluorescence, we observed that IMPα2 levels directly relate to the number of nuclear foci49. This analysis, performed using manual cell cropping from confocal z-series images, demonstrated that per cell paraspeckle numbers vary within an apparently homogeneous culture, with typically between 5 and 20 foci present per nucleus53. This variation in endogenous paraspeckle numbers limited our capacity to discern significantly different outcomes against the background of normal biological variation. The present study provides a significant advance in which we develop and apply an automated, high throughput image analysis pipeline to quantify paraspeckles in cells with altered IMPα protein levels and functionality. This pipeline was used to rigorously analyse large numbers of cells, allowing us to measure variability in nuclear foci numbers, nuclear foci parameters (size, intensity of staining, etc.) and nuclear accumulation (nuclear/cytoplasmic ratios) of two core paraspeckle markers (PSPC1 [endogenous and exogenous] and SFPQ [endogenous only]).
These parameters were investigated in response to modulating IMPα2, IMPα4 and IMPα6, corresponding to one representative from each of the three IMPα structural clades39,48,54. The results of this analysis demonstrate how the regulation of individual IMPαs alters core paraspeckle protein delivery to paraspeckles, providing the first high-throughput functional analysis of differences in importin protein levels within a single cell population.

Results

A high-throughput semi-automated image analysis pipeline developed to identify cells, nuclei and foci

To investigate how different IMPαs could modulate PSPC1 delivery into the nucleus and into paraspeckles, expression levels and transporter functionality of individual IMPαs within HeLa cells were modulated by two independent approaches. In one, transient transfection was used to introduce expression constructs encoding green fluorescent protein (GFP)-tagged isoforms of IMPα2, IMPα4 and IMPα6, corresponding to either full length or truncated ΔIBB variants (summarized in Fig. 1A). Transient over-expression of a GFP-tagged full-length IMPα protein is expected to increase nuclear accumulation of its cargoes. In contrast, IMPαΔIBB isoforms, which lack the importin beta binding (IBB) domain, exhibit a dominant negative effect on cargo accumulation because these still bind cargo proteins but cannot bind IMPβ1 to form a functional transport complex55; the resulting competitive binding will diminish cargo availability for binding endogenous IMPα and thereby reduce cargo nuclear accumulation. For IMPα2, an additional control construct containing two point mutations in the NLS binding groove (lysine replacement at aa192 and arginine at aa396) was used (GFP-IMPα2-ED). These mutations significantly reduce cargo binding55,56, but have little or no effect on endogenous IMPα2 cargo nuclear transport; this isoform serves as a control for non-nuclear transport-related effects arising from GFP-IMPα2 over-expression. The binding capacity and predicted nuclear transport outcomes from transfection with each IMPα isoform construct are summarized in Fig. 1A. Other control samples included GFP alone, mock-transfected and not-transfected cells. The second approach used to modulate IMPα levels in HeLa cells was siRNA knockdown, targeting IMPα2 and IMPα4.

Figure 1: Overview of experimental and analytical approaches used to identify changes to subnuclear foci in response to modulating the cell's nuclear transport capacity.

Using either transient transfection with plasmids encoding GFP-tagged IMPα2/α4/α6 variants or siRNA knockdown of IMPα2/α4, the capacity of IMPαs to modulate delivery of PSPC1/SFPQ into the nucleus/paraspeckles was investigated. All images are Z-series captured via confocal laser scanning microscopy; scale bars represent 20 μm. (A) GFP-tagged IMPα isoforms and their functional properties. Binding is indicated as true (✓) or false (✗), with an indication of binding strength (+/−). (B) Overview of image analysis pipeline. From this example merged z-series confocal image (Bi), the immunofluorescent signal for endogenous PSPC1 (Bii) was used to identify foci (Bvii), the nuclear marker DAPI (Biii) was used to identify the nuclei (Bvi) and the immunofluorescent signal for endogenous IMPα2 (Biv) was used to identify the cell body (Bv). The other options for paraspeckle marker, nuclear stain and cell or transfection marker for the GFP-tagged IMPα experiments (GFP-Trans) or siRNA knockdown experiments (siRNA KDs) are listed. The full 3D reconstruction of cells, their nuclei and foci (Bviii) is shown for this example image, but it should be noted that each sample in an experiment has 49+ of these images. Data about these cells, their nuclei and foci were exported from Imaris in CSV file formats and reorganised ready for statistical analysis using custom Python scripts (Bix). Additional data manipulations were performed on a per experiment basis (as detailed for each) using custom R scripts (Bx), before final statistical analysis and outputs were generated using the R software environment for Statistical Computing (Bxi). The Python logo is a trademark of the Python Software Foundation (https://www.python.org; v2.7). The R logo (https://www.r-project.org/logo/) is licensed under CC BY-SA 4.0; the license terms can be found on the following link: (https://creativecommons.org/licenses/by-sa/4.0/).

In all experiments, tiled confocal z-series images were collected for 3D visualisation and analysis, allowing the full volume of numerous (between 143 and 813) individual cells to be analysed in each sample (pipeline outlined in Fig. 1B). Briefly, using Imaris software, cells, nuclei and PSPC1/SFPQ nuclear foci were identified. Results were exported from Imaris in CSV formats, processed and compiled into a compatible format using a series of custom Python scripts and then imported into the ‘R environment for statistical computing’ for analysis. This approach facilitated analysis of thousands of cells and quantification of hundreds of thousands of nuclear foci in a consistent and non-subjective manner. All raw data exported from Imaris, along with the custom Python, R and shell scripts which compile and analyse these data, are provided in Supplementary Dataset SD1.
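The compile step described above can be illustrated with a minimal sketch. It is not the authors' script (those are in Supplementary Dataset SD1); the column names (`cell_id`, `focus_id`, `volume_um3`) and the two inline CSV snippets are hypothetical stand-ins for Imaris exports, whose real headers differ. The sketch only shows the idea of linking per-focus rows to their parent cell to obtain one summary row per cell.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for two exported CSV files; real Imaris exports
# use different headers and contain additional columns.
CELLS_CSV = """cell_id,nucleus_mean,cytoplasm_mean
1,210.0,100.0
2,150.0,120.0
"""
FOCI_CSV = """focus_id,cell_id,volume_um3
10,1,0.30
11,1,0.25
12,2,0.40
"""


def compile_per_cell(cells_text, foci_text):
    """Attach per-focus rows to their parent cell and summarise them."""
    foci_by_cell = defaultdict(list)
    for row in csv.DictReader(io.StringIO(foci_text)):
        foci_by_cell[row["cell_id"]].append(float(row["volume_um3"]))

    table = []
    for row in csv.DictReader(io.StringIO(cells_text)):
        volumes = foci_by_cell.get(row["cell_id"], [])
        table.append({
            "cell_id": int(row["cell_id"]),
            "n_foci": len(volumes),                   # foci count per cell
            "total_foci_volume_um3": sum(volumes),    # cumulative foci volume
        })
    return table


per_cell = compile_per_cell(CELLS_CSV, FOCI_CSV)
for record in per_cell:
    print(record)
```

A flat per-cell table of this kind is what a downstream statistics environment such as R can import directly.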

Cell gating was initially set to capture a very low GFP signal level, corresponding to the auto-fluorescence signal level. In this way, the cytoplasm and nucleus of every cell were identified, regardless of whether it was transfected or not. This approach removed the need to use an additional cell body marker, maximizing available fluorescence channels and minimising photo-damage by reducing laser exposure. To ensure that only transfected cells were analysed, a final mean GFP intensity threshold per cell (higher expression level of GFP) was later applied to the data using the R environment for statistical computing (Fig. 1Bx). This allowed the GFP thresholding to be applied to all test and control samples simultaneously, with adjustments made to identify a threshold where the detected cell number approached zero in the control samples. GFP thresholding was also selectively withheld from the control samples to extend analyses to non-transfected and mock-transfected cells using the same base parameters for cell, nuclei and foci detection. Overall, this analysis approach enabled accurate cytoplasmic identification of cells that had relatively low GFP-IMPα expression levels to achieve comprehensive measurements for all cells within each sample.
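One way to express the threshold selection described above is sketched here in Python (the original step was performed in R). The function name, the candidate thresholds, the tolerance parameter and the intensity values are all hypothetical; the sketch simply picks the lowest candidate at which the fraction of control cells passing the gate falls to the allowed level, then applies that gate to a transfected sample.

```python
def choose_gfp_threshold(control_intensities, candidates, max_control_fraction=0.01):
    """Return the lowest candidate threshold at which the fraction of
    control (non-GFP) cells above the threshold is at most
    max_control_fraction, or None if no candidate qualifies."""
    n = len(control_intensities)
    if n == 0:
        raise ValueError("at least one control cell is required")
    for threshold in sorted(candidates):
        passing = sum(1 for value in control_intensities if value > threshold)
        if passing / n <= max_control_fraction:
            return threshold
    return None


# Illustrative mean GFP intensities per cell (arbitrary units, made up).
control = [3.0, 4.5, 5.0, 6.2, 7.9]
transfected = [4.0, 25.0, 60.0, 8.5, 120.0]

threshold = choose_gfp_threshold(control, candidates=[2, 4, 6, 8, 10],
                                 max_control_fraction=0.0)
gfp_positive = [value for value in transfected if value > threshold]
print(threshold, gfp_positive)
```

Because the gate is applied to the exported data rather than during segmentation, the same cell, nucleus and foci detection parameters remain valid for the ungated control populations.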

The nuclear detection threshold was set to ensure the nucleus was identified even in cells with a low level nuclear marker signal. Although this could slightly inflate the detected volume of each nucleus, this approach was chosen to avoid missing parts of some nuclei which could underestimate nuclear foci numbers. Using the R environment for statistical computing, nuclei on the edge of an image in the X, Y or Z image planes, and therefore likely to be incomplete nuclei, were excluded from the data sets. Subsequently, all cells without nuclei were also removed from the data sets (see Fig. 1Bx).
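The exclusion rules in this paragraph can be sketched as a simple filter. This is an illustration in Python rather than the R code actually used, and it assumes each nucleus is described by a voxel bounding box; the field names, image dimensions and example cells are hypothetical. Both rules (edge-touching nuclei, cells without nuclei) are applied here in a single pass.

```python
IMAGE_SHAPE = (512, 512, 40)  # voxels in X, Y, Z (hypothetical)


def touches_edge(bbox_min, bbox_max, image_shape):
    """True if a bounding box reaches any face of the image volume."""
    return any(
        low <= 0 or high >= size - 1
        for low, high, size in zip(bbox_min, bbox_max, image_shape)
    )


def keep_cell(cell, image_shape):
    """Keep a cell only if it has a nucleus that lies fully inside the image."""
    bbox = cell["nucleus_bbox"]
    if bbox is None:  # cell with no detected nucleus
        return False
    return not touches_edge(bbox[0], bbox[1], image_shape)


cells = [
    {"cell_id": 1, "nucleus_bbox": ((100, 120, 5), (160, 180, 30))},
    {"cell_id": 2, "nucleus_bbox": ((0, 200, 8), (40, 260, 28))},      # on X edge
    {"cell_id": 3, "nucleus_bbox": ((300, 310, 12), (350, 370, 39))},  # on Z edge
    {"cell_id": 4, "nucleus_bbox": None},                              # no nucleus
]

kept_ids = [cell["cell_id"] for cell in cells if keep_cell(cell, IMAGE_SHAPE)]
print(kept_ids)
```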

In the GFP-IMPα transient transfection study, the non-transfected (Not-Trans-C), mock-transfected (Mock-C) and GFP-transfected (GFP) control sample parameters for nuclear foci varied (Supplementary Tables S5, S6 and S7). We hypothesize that these differences reflect the physiological state of individual cells from each group in regard to cell cycle or local microenvironment differences at the time of sampling. This would be consistent with reported paraspeckle roles, but spotlights the paucity of knowledge about the inherent variability of paraspeckles within a population, and whether these are dynamically modulated within a cell in response to particular conditions. These results lead us to conclude that comparing outcomes within a single IMPα subtype, in which either cargo or IMPβ1 binding has been manipulated, is appropriate, while comparing between different IMPα subtypes should be undertaken cautiously, and with this information in mind.

Functional IMPα protein levels determine endogenous PSPC1 localization to paraspeckles

To assess the accuracy of the automated analysis pipeline, we initially compared its outcomes with those from our previous analysis using manual selection of individual cells49. HeLa cells were transiently transfected to express GFP-tagged IMPα2 or IMPα6, as each binds PSPC1 in a yeast two hybrid system and in an ELISA-based importin binding assay49. During cell/nucleus/foci detection in Imaris, nuclear PSPC1 foci were identified using the immunofluorescent signal for endogenous PSPC1 with parameters matching our previous study49 to allow a direct comparison. All IMPα2 samples produced results similar to those previously reported49, with GFP-IMPα2-ED control values intermediate to those obtained with the other two IMPα2 isoforms (summary comparison in Fig. 2; detailed comparison in Supplementary Table S1). This congruency demonstrates that automated detection of cells and nuclei is of comparable accuracy to the laborious manual cell image cropping. For all paraspeckle-related endpoints, all but one of the GFP-IMPα2ΔIBB sample values were significantly reduced relative to the GFP-IMPα2-FL values (Table 1 and Fig. 3A). The one exception was the geometric mean (GM) PSPC1 voxel intensity (per foci), which was not significantly reduced (Table 1L).

Figure 2: Comparing outcomes from manual versus automatic cell detection methods.

Comparative outcomes of modulating functional IMPα2 levels on PSPC1 nuclear transport and paraspeckle localization in HeLa cells. Cells were transiently transfected with constructs encoding GFP-tagged IMPα2 variants as indicated (see Fig. 1A for predicted function). Paraspeckles were identified using indirect immunofluorescence to detect endogenous PSPC1. These measures were assessed within groups for the entire cell population (A), per foci positive cell populations (B) or on a per foci basis (C). Most measures are shown as geometric means (GM); error bars represent 95% confidence intervals.
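For readers unfamiliar with the summary statistic named in this legend, a geometric mean and its confidence interval can be obtained by working on the log scale and back-transforming. The sketch below is an assumption-laden illustration, not the authors' R analysis: it uses a normal approximation for the interval (a t-based interval would be wider for small samples) and made-up input values.

```python
import math
from statistics import NormalDist, mean, stdev


def geometric_mean_ci(values, confidence=0.95):
    """Geometric mean with an approximate confidence interval.

    Computed as the arithmetic mean of log-values, plus or minus a
    normal-approximation margin, then exponentiated. Requires at least
    two strictly positive values.
    """
    logs = [math.log(value) for value in values]
    centre = mean(logs)
    standard_error = stdev(logs) / math.sqrt(len(logs))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return (
        math.exp(centre),
        math.exp(centre - z * standard_error),
        math.exp(centre + z * standard_error),
    )


# Made-up per-cell measurements.
gm, lower, upper = geometric_mean_ci([2.0, 8.0, 4.0, 4.0])
print(round(gm, 3), round(lower, 3), round(upper, 3))
```

Back-transformed intervals of this kind are asymmetric around the geometric mean, which is why error bars on such plots are typically longer on the upper side.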

Table 1 Outcomes of modulating IMPα expression and transport function on endogenous PSPC1-positive nuclear foci.

Figure 3: Outcomes of modulating functional IMPα2/α4/α6 levels on paraspeckle marker (endogenous PSPC1/SFPQ or exogenous DsRed2-PSPC1) nuclear transport and paraspeckle localization in HeLa cells.

HeLa cells were transiently transfected with constructs encoding GFP-tagged IMPα2/α4/α6 variants (see Fig. 1A for predicted function) as indicated, with labels at the bottom of each panel and consistent colors used throughout. Paraspeckles were assessed within experimental groups using indirect immunofluorescence with an Alexa Fluor 546 (A546) secondary antibody to detect endogenous PSPC1 (A), using indirect immunofluorescence with an Alexa Fluor 546 (A546) secondary antibody to detect endogenous SFPQ (B) or through exogenous PSPC1 by co-transfecting with a plasmid encoding DsRed2-PSPC1 (C). After analysis using the pipeline outlined in Fig. 1B, the primary measures are presented as bar graphs and scatter plots with overlaid box plots indicating the mean and interquartile ranges for each experimental group. Samples with statistically significant differences within IMPα groups are indicated (*). The primary measures include the percentage of foci positive cells (i), the ratio of the fluorescent signal within the nucleus to that within the cytoplasm (Fn/c) for the paraspeckle marker used (ii), the number of foci detected per cell (iii), the sum foci associated fluorescent signal per cell for the paraspeckle marker used (iv) and the sum foci associated fluorescent signal per foci for the paraspeckle marker used (v). These measures were assessed within groups per entire cell population (i, ii), per foci positive cell populations (iii, iv) or on a per foci basis (v).

To interrogate nuclear accumulation, the mean of the fluorescent signal in the nucleus (Fn) and cytoplasm (Fc) was converted to a ratio (Fn/c) for each cell44,57. Mean PSPC1 Fn/c values for all IMPα2 samples increased with increasing IMPα2 functionality as expected (Table 1E and Fig. 3Aii; ΔIBB [lowest function]: 2.04; ED: 2.08; FL [highest function]: 2.69). The other GFP-IMPα2-FL sample parameters were unchanged or slightly increased compared with those from the GFP-IMPα2-ED control. The only significantly different result was the PSPC1 Fn/c value, indicating that the FL isoform significantly enhances PSPC1 nuclear accumulation.
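The Fn/c calculation described above is a per-cell ratio of two mean intensities, which is then averaged within each group. A minimal sketch follows; the function name and the intensity values are hypothetical, and no background subtraction is applied because none is described in this passage.

```python
from statistics import mean


def fn_c_ratio(nuclear_mean, cytoplasmic_mean):
    """Ratio of mean nuclear to mean cytoplasmic fluorescence for one cell."""
    if cytoplasmic_mean <= 0:
        raise ValueError("cytoplasmic mean intensity must be positive")
    return nuclear_mean / cytoplasmic_mean


# Made-up (Fn, Fc) pairs for the cells in one sample.
cells = [(269.0, 100.0), (300.0, 150.0), (180.0, 60.0)]

ratios = [fn_c_ratio(fn, fc) for fn, fc in cells]
group_mean = mean(ratios)
print([round(r, 2) for r in ratios], round(group_mean, 2))
```

Note that the group value is the mean of per-cell ratios, not the ratio of pooled means; the two can differ when cells vary widely in overall brightness.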

Our previous demonstration of IMPα6 binding to PSPC1 in yeast two hybrid and ELISA assays was extended here by measuring paraspeckle numbers and size in HeLa cells relative to IMPα6 functionality. Significant differences in several parameters were recorded when comparing the FL and ΔIBB variants of IMPα6. The ΔIBB variant exhibited a lower proportion of foci-positive cells, reduced nuclear accumulation of PSPC1 (PSPC1 Fn/c), a lower total volume of foci per cell and a reduction in the total signal from PSPC1 foci per cell when compared to the FL isoform (Table 1 and Fig. 3A). Although the number of foci measured per cell was reduced in the ΔIBB sample (FL:6.00; ΔIBB:4.41), this outcome did not reach significance, which most likely reflects the low proportion of cells containing nuclear foci in these samples (54.8% in FL [n = 91]; 30.8% in ΔIBB [n = 66]). This finding indicates that changing levels of IMPα6 will also influence PSPC1 nuclear accumulation and the characteristics of PSPC1-positive nuclear foci, as recorded for IMPα2.

These data demonstrate IMPα2 and IMPα6 can each modulate endogenous PSPC1 nuclear accumulation and localization to paraspeckles. In addition, the direct comparison to our previous work with IMPα2 validates the automated analysis pipeline as an effective tool for detecting these outcomes.

Functional IMPα protein levels modulate endogenous SFPQ localization to paraspeckles

To determine if changes in IMPα expression levels that altered PSPC1 nuclear accumulation and localization into paraspeckles also affected another core DBHS paraspeckle marker, we examined endogenous SFPQ in HeLa cells transiently transfected to express GFP-tagged IMPα constructs. IMPα2 variants influence SFPQ localization to nuclear foci in a manner similar to that recorded for PSPC1 localization (Table 2 and Fig. 3B). The percentage of cells containing SFPQ nuclear foci is greatly increased in the IMPα2-FL group (83.9%), and slightly decreased in the IMPα2ΔIBB group (57.8%), compared to the IMPα2-ED control sample (58.7%); ED and ΔIBB values are each significantly different (p = 0.0000) from the FL outcome (Table 2D and Fig. 3Bi). The Fn/c for SFPQ was significantly reduced (p = 0.0000) in the ΔIBB (3.21) and ED (2.80) groups in comparison to IMPα2-FL (4.75); the odds ratios when compared to the FL set to 1.0 are 0.675 for ΔIBB and 0.590 for ED (Table 2E and Fig. 3Bii). No other paraspeckle parameters displayed statistically significant differences. These outcomes suggest that SFPQ transport is affected by IMPα2 functionality, but its relationship to paraspeckles is not.

Table 2 Outcomes of modulating IMPα expression and transport function on endogenous SFPQ-positive nuclear foci.

Transfection with IMPα4 isoforms resulted in remarkable and significant differences measured between IMPα4-FL and IMPα4ΔIBB samples, across the population, cell and individual foci parameters (Table 2 and Fig. 3B). Many parameters showed a higher value in the IMPα4-FL sample, including: the percentage of cells with SFPQ nuclear foci (FL:91.3%; ΔIBB:60.2%), the number of foci per cell (FL:8.13; ΔIBB:5.14), the average volume of foci (FL:0.320 μm3; ΔIBB:0.202 μm3) and the sum of the SFPQ staining intensity per foci (FL:1204; ΔIBB:728). This demonstrates that IMPα4 functionality can determine SFPQ localization to nuclear foci.

The IMPα6-FL group contained a significantly higher percentage of cells with nuclear foci (85.7%) than did the IMPα6ΔIBB group (41.2%; p = 0.0000; Table 2D and Fig. 3Bi). A significantly greater Fn/c per cell for SFPQ (FL:5.32; ΔIBB:2.66, p = 0.0000), and number of nuclear foci per cell (FL:6.85; ΔIBB:5.25, p = 0.0046) was measured within the IMPα6-FL group compared to the IMPα6ΔIBB group (Table 2E,H and Fig. 3Bi,Biii). The absence of other statistically significant differences indicates that, while the number of paraspeckles per cell differs depending on IMPα6 functionality, the parameters of individual foci (volumes and SFPQ) do not.

These results show that changes in the functional levels of individual IMPα influence multiple paraspeckle parameters, including the localization of specific, key components. Thus the relative intracellular abundance of individual importins, and their availability for cargo binding, will affect paraspeckle formation.

Functional IMPα protein levels modulate exogenous dsRed2-PSPC1 localization to paraspeckles

We predicted that the changing levels of specific cargos would also alter how IMPαs influence paraspeckle parameters. To test the impact of IMPα functionality when cargo is elevated, exogenous PSPC1 (dsRed2-PSPC1) and GFP-tagged IMPα constructs were co-transfected into HeLa cells.

A greater but not significantly different (p = 0.0614) proportion of cells contained PSPC1 foci in the IMPα2-FL (51.3%) compared to IMPα2-ED samples (43.7%), while this was significantly lower in the IMPα2ΔIBB group (38.2%; p = 0.0042, compared to FL; Table 3D and Fig. 3Ci). Only the DsRed2-PSPC1 Fn/c value was statistically significantly higher in the FL sample relative to the IMPα2-ED (p = 0.0001) and ΔIBB (p = 0.0019) groups (FL:1.86; ΔIBB:1.62; ED:1.60; Table 3E and Fig. 3Cii). We interpret this as indicating that cells have an increased capacity for cargo transport (above endogenous levels) in the presence of increased levels of transport-competent IMPα2. A direct comparison of exogenous versus endogenous PSPC1 data is shown in Supplementary Table S1. As expected, samples containing exogenous PSPC1 have more and larger nuclear foci containing more PSPC1, relative to samples containing only endogenous PSPC1.

Table 3 Outcomes of modulating IMPα expression and transport function on exogenous dsRed2-PSPC1-positive nuclear foci.

No significant difference in the percentage of cells with nuclear DsRed2-PSPC1 foci was recorded between samples expressing FL (38%) or ΔIBB (39.7%) IMPα6 variants. However, the Fn/c (FL:1.56; ΔIBB:1.86, p = 0.0000) and number of paraspeckles per cell (FL:9.04; ΔIBB:15.58, p = 0.0063) differ significantly. Unexpectedly, the ΔIBB construct displays higher Fn/c and paraspeckle number per cell, and the average foci volume per cell trends higher (FL:1.29; ΔIBB:2.59, p = 0.0080, not considered significant with Bonferroni correction).
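The Bonferroni correction mentioned above divides the nominal significance level by the number of comparisons made. The number of comparisons is not stated in this passage, so the sketch below uses a hypothetical value of 7 with a nominal alpha of 0.05, chosen purely because it gives a cut-off that falls between the two p-values reported in the paragraph; it should not be read as the value used in the actual analysis.

```python
def bonferroni_significant(p_values, alpha=0.05, n_comparisons=None):
    """Return the Bonferroni-adjusted cut-off and a significance flag
    for each p-value. If n_comparisons is not given, the number of
    supplied p-values is used."""
    m = n_comparisons if n_comparisons is not None else len(p_values)
    cutoff = alpha / m
    return cutoff, [p <= cutoff for p in p_values]


# p-values quoted in the text; n_comparisons=7 is a hypothetical choice.
cutoff, flags = bonferroni_significant([0.0063, 0.0080], n_comparisons=7)
print(round(cutoff, 5), flags)
```

Under this assumption, p = 0.0063 passes the adjusted cut-off while p = 0.0080 does not, matching the pattern described for foci number and foci volume.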

Analysis of IMPα4 variants revealed significant effects on several outcomes measured for exogenous PSPC1, but only when considered at the level of individual cells. The IMPα4-FL values were higher than ΔIBB levels for: percentage of cells with nuclear foci (FL:59.6%; ΔIBB:37%, p = 0.0000), number of foci per cell (FL:14.94; ΔIBB:7.78, p = 0.0001) and cumulative volume of foci (FL:2.48; ΔIBB:1.00, p = 0.0001). No significant reduction in DsRed2-PSPC1 Fn/c was recorded, which differed from the significant decreases observed with the transport-deficient isoforms of either IMPα2 or IMPα6. This suggests transport of exogenous PSPC1 is not regulated by IMPα4 levels, but that IMPα4 does influence PSPC1 localization into paraspeckles. This aligns with ELISA-based assays that measured IMPα4 binding to PSPC1 only at high IMPα4 concentrations, with weaker binding than was recorded for IMPα2 or IMPα649.

Expression levels of IMPα2 or IMPα4 correlate with PSPC1 nucleocytoplasmic distribution

As an alternative approach to measuring the outcomes of modulating importin function, IMPα2 or IMPα4 knockdown by targeted siRNA was followed by simultaneous detection of either endogenous PSPC1 or SFPQ (each in duplicate experiments) and the relevant IMPα by indirect immunofluorescence (Supplementary Tables S8–S15). The mean intensity of IMPα2 per cell on a population basis was reduced across the four experimental samples by introduction of siRNA targeting IMPα2 when compared to the scrambled siRNA control. The IMPα2 siRNA versus control signals were 0.43 and 0.59 for samples in which PSPC1 was detected, and 0.65 and 0.87 for SFPQ samples (calculated from values in Supplementary Tables S8–S15), demonstrating effective IMPα2-targeting by these siRNAs. This was confirmed by Western blot with cell lysates (data not shown). Although the attempted siRNA knockdown of IMPα4 was not consistently effective, these samples provided cell populations with a range of IMPα4 levels that were used in subsequent analyses.
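The knockdown values quoted above are ratios of the population mean IMPα signal in siRNA-treated cells to that in scrambled-control cells. A minimal sketch of that arithmetic follows; the per-cell intensities are invented and the function name is hypothetical.

```python
from statistics import mean


def relative_signal(treated, control):
    """Mean per-cell intensity in treated cells as a fraction of control."""
    return mean(treated) / mean(control)


# Made-up mean IMPα intensities per cell (arbitrary units).
scrambled = [100.0, 120.0, 80.0]
knockdown = [40.0, 50.0, 39.0]

ratio = relative_signal(knockdown, scrambled)
print(round(ratio, 2))
```

A ratio below 1 indicates reduced signal after knockdown; values close to 1 would indicate an ineffective knockdown.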

A faster approach for image acquisition was trialled, using a resonance scanner to capture confocal z-series images for these samples. While scanning times were reduced to approximately 25% (from 32 days with galvo-scan imaging, to 8 days using the resonance scanner), reduced image quality made robust identification of foci impossible. As a consequence, outputs requiring foci detection are not presented or discussed for these experiments. Fn/c measurements, which require only detection of the cell nucleus and cytoplasm, were reliably determined from these images, allowing the influence of each IMPα on PSPC1 or SFPQ nuclear accumulation to be determined following resonance scanning. The Fn/c values in the IMPα2 siRNA knockdown samples were reduced to ~80% of their scrambled counterparts for PSPC1, with a smaller and less consistent reduction for SFPQ (PSPC1: 0.78 and 0.80; SFPQ: 0.81 and 0.94; calculated from values in Supplementary Tables S8–S15).

To explore the flexibility and power of creating hierarchically linked outputs that describe multiple aspects of each cell, a different analysis approach was applied. Instead of making comparisons between siRNA knockdown groups, these outputs based on fluorescence signal were considered across the whole population of cells, regardless of treatment group. Correlations between PSPC1 Fn/c and the IMPα signal within each cell are presented in Fig. 4. The upward sloping line in Fig. 4Ai indicates that, as IMPα2 levels increase within cells, PSPC1 Fn/c values also increase (correlation coefficients of 0.169 and 0.191 obtained for two independent experiments). The IMPα4 samples generated the opposite result, showing a reciprocal relationship between PSPC1 Fn/c and IMPα4 levels (downwards sloping trend line, Fig. 4Bi; correlation coefficients of −0.294 and −0.350 for each of two experiments). These results provide an additional indication that IMPα2 is a nuclear transporter for endogenous PSPC1 in HeLa cells, and they suggest that IMPα4 is not. An alternative explanation for the lack of correlation with IMPα4 levels may be that the expression across the cell population is relatively low and uniform, yielding a small dynamic range of signal. A similar analysis for SFPQ did not yield consistent results between replicates (Supplementary Figure S1); we interpret this to indicate IMPα2 and IMPα4 are not the only transporters for this paraspeckle protein because knockdown did not alter SFPQ distribution, while over-expression of IMPαs did (Fig. 3).
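The population-wide analysis described above pools cells from every treatment group and correlates each cell's IMPα signal with its Fn/c value. The text does not state which correlation coefficient was used, so the sketch below assumes Pearson's r; the group names and per-cell values are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up (IMPα intensity, PSPC1 Fn/c) pairs per cell, by treatment group.
groups = {
    "scrambled": [(10.0, 2.0), (20.0, 2.4), (15.0, 2.1)],
    "siRNA": [(5.0, 1.8), (8.0, 2.1), (6.0, 1.7)],
}

# Pool all cells regardless of treatment group before correlating.
pooled = [pair for cells in groups.values() for pair in cells]
imp_alpha = [pair[0] for pair in pooled]
fn_c = [pair[1] for pair in pooled]

r = pearson_r(imp_alpha, fn_c)
print(round(r, 3))
```

Pooling treats the knockdown simply as a way of widening the range of IMPα levels in the population, so the sign of r reflects the direction of the per-cell relationship rather than a between-group difference.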

Figure 4: PSPC1 nuclear accumulation correlates with cellular IMPα levels in HeLa cells.

Population wide correlations were observed, where treatment groups (Not transfected, mock transfected, scrambled-10, scrambled-25, IMPα-10, IMPα-25) within two independent experiments (EXP#1 and EXP#2) were pooled and the total number of cells (n) was used to produce correlation coefficients (c) between cellular IMPα intensity and the ratio of PSPC1 nuclear to cytoplasmic intensity (Fn/c).

Finally, non- and mock- transfected cell groups alone were examined to study cell populations with a broad range of endogenous IMPα expression in the absence of any importin manipulations (Fig. 4Aii and Bii). The overall trends observed were similar to those obtained from the complete set of siRNA knockdown samples (Fig. 4Ai and Bi). This result confirms the value of previous studies, in which Fn/c values correlate with IMP-based transport outcomes. Most importantly, the result of analyzing cells which have not been transfected demonstrates how application of a high throughput image analysis system can yield sophisticated and functionally relevant outcomes using only indirect immunofluorescence to detect endogenous cargo(s) and IMP proteins. This provides an exciting avenue for studying nucleocytoplasmic transport within intact tissues, by examining developmental systems in the absence of manipulations.

Discussion

Development and application of an automated image analysis pipeline enabled the rigorous interrogation of how IMPα functionality affects paraspeckle number and size. Imaris software allowed non-subjective and relatively fast batch-processing of hundreds of 3D images to identify cells, nuclei and foci. This was linked into an analysis pipeline using Python and R scripts that extended the flexibility of data manipulation and provided access to a diversity of statistical analysis tools and graphical outputs. To also investigate nuclear transport of two key paraspeckle components, PSPC1 and SFPQ, distinct from their localization for nuclear foci formation, the pipeline calculated the ratio between the fluorescent nuclear and cytoplasmic signals for these proteins (Fn/c). Manual Fn/c measurement is very time-consuming, potentially subjective, and cannot be accurately applied to samples with uneven fluorescent signals that can arise from protein localization to subcellular structures, such as paraspeckles. Because our approach segments the entire nucleus and cytoplasm in 3D, brighter or darker structures in either compartment are accounted for in the measured means. Once appropriate cell/nucleus/vesicle detection parameters have been determined, many images/cells can be analysed easily, with high quality 3D image acquisition times then becoming the primary limiting factor for extending cell analysis numbers. At present, achieving the correct balance between lengthy imaging times and final image quality is a challenging aspect of such high throughput experiments. We trialled the use of resonant confocal scanning to accelerate image acquisition for the IMPα siRNA experiments. The associated loss of image quality made this approach inappropriate for quantification at the sub-organelle feature scale; however, analyses at the organelle feature scale (such as Fn/c outcomes) for whole cell populations provided meaningful measurement of endogenous nucleocytoplasmic transport activity.
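A per-cell Fn/c value of the kind discussed above could be derived from whole-compartment measurements roughly as follows. This is a sketch under stated assumptions, not the pipeline's actual code: it assumes that summed intensities and voxel counts are available for the whole cell and for its nucleus, treats the cytoplasm as the cell minus the nucleus, and omits background subtraction. The function and argument names are hypothetical.

```python
def fn_c(cell_sum, cell_voxels, nucleus_sum, nucleus_voxels):
    """Ratio of mean nuclear to mean cytoplasmic intensity for one cell.

    The cytoplasm is taken as the whole-cell object minus the nucleus
    (an assumption of this sketch).
    """
    cyto_voxels = cell_voxels - nucleus_voxels
    if nucleus_voxels <= 0 or cyto_voxels <= 0:
        raise ValueError("cell must contain a nucleus and some cytoplasm")
    nuclear_mean = nucleus_sum / float(nucleus_voxels)
    cyto_mean = (cell_sum - nucleus_sum) / float(cyto_voxels)
    if cyto_mean <= 0:
        raise ValueError("cytoplasmic mean intensity must be positive")
    return nuclear_mean / cyto_mean


# Invented example: a bright nucleus occupying a quarter of the cell volume.
example = fn_c(cell_sum=5000.0, cell_voxels=100,
               nucleus_sum=3000.0, nucleus_voxels=25)
print("Fn/c = %.2f" % example)
```

Because both means are taken over every voxel in the compartment, bright foci such as paraspeckles contribute to the nuclear mean in proportion to their volume rather than biasing a hand-placed region of interest.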

Using an automated high-throughput image analysis pipeline can generate an overwhelming amount of data across multiple parameters in a relatively short time frame; sifting through this to identify the meaningful results can be both challenging and tedious. To help solve this problem we included principal component analysis (PCA) as part of the analysis pipeline. Through PCA, multiple parameters across groups of each experiment were condensed into two principal components, allowing a simple 2D relationship across all included parameters to be generated (Fig. 5). In addition to providing an accessible summary of the results, PCA also helps identify key outcomes during the initial stages of data analysis, thereby providing strategic directions for subsequent data interrogation.

Figure 5: Modulating functional IMPα2/α4/α6 levels affected a plethora of measurable paraspeckle-related outcomes that can be simultaneously visualised using principal component analysis (PCA).

The results of transiently transfecting HeLa cells with constructs encoding GFP-tagged IMPα2/α4/α6 variants (Tables 1, 2 and 3 and Fig. 2) were used to perform PCA, allowing simultaneous comparisons of multiple parameters and revealing strong patterns between groups. In each experiment PC1 explains >99% of the variance across all parameters, and therefore the distances between groups along the X axis (PC1) should be considered as the primary delineator. Paraspeckles were assessed within experimental groups using indirect immunofluorescence with an Alexa Fluor 546 (A546) secondary antibody to detect endogenous PSPC1 (A), using indirect immunofluorescence with an Alexa Fluor 546 (A546) secondary antibody to detect endogenous SFPQ (B) or through exogenous PSPC1 by co-transfecting with a plasmid encoding DsRed2-PSPC1 (C). Parameters used to compare the geometric means of groups within experiments using a specific paraspeckle marker (PSM; A:PSPC1, B:SFPQ, C:DsRed2-PSPC1) were “% cells positive for foci”, “cytoplasmic PSM intensity”, “nuclear PSM intensity”, “PSM Fn/c per cell”, “PSM intensity per cell”, “number of nuclear foci per cell”, “sum volume of nuclear foci per cell”, “sum nuclear foci PSM intensity per cell”, “nuclear foci volume”, “nuclear foci PSM intensity” and “sum nuclear foci PSM intensity”.
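The fraction of variance explained by PC1, as quoted in this legend, can be illustrated with a small covariance-based PCA. The sketch below is not the analysis used in the study (which was run in R): it uses power iteration in plain Python, does not scale the parameters, and runs on invented group-by-parameter values in which one parameter has a much larger numeric range than the others.

```python
import math


def first_pc(rows, iterations=500):
    """Return (PC1 unit vector, fraction of total variance explained by PC1).

    rows: one list of parameter values per group. Uses power iteration on
    the sample covariance matrix; the uniform start vector is assumed not to
    be orthogonal to the leading eigenvector.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / float(n) for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / float(n - 1)
            for j in range(p)] for i in range(p)]
    total_var = sum(cov[i][i] for i in range(p))
    if total_var == 0:
        raise ValueError("data have no variance")
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue / total_var


# Invented group means: foci per cell, Fn/c, summed foci intensity.
groups = [
    [10.0, 1.0, 50000.0],
    [12.0, 1.2, 52000.0],
    [8.0, 0.9, 30000.0],
    [15.0, 1.5, 80000.0],
]
pc1, explained = first_pc(groups)
print("PC1 explains %.2f%% of the variance" % (100.0 * explained))
```

In this toy data the large-range intensity parameter dominates the covariance matrix, so PC1 captures nearly all of the variance; whether the same mechanism applies to the published analysis is not stated in the text.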

The results in this study collectively demonstrate that modulating functional levels of IMPα2, IMPα4 and IMPα6 impacts nuclear import and delivery of PSPC1 and SFPQ to nuclear paraspeckles, and they also provide evidence that the relative abundance of individual IMPαs and the cargo paraspeckle protein(s) influences these outcomes. In addition to reinforcing the knowledge that PSPC1 is a transport cargo of IMPα249, the manipulation of IMPα6 functionality in HeLa cells provides new evidence that this importin can also effect nuclear transport of this core paraspeckle protein. The transport role of IMPα4 is less clear, because the Fn/c of over-expressed PSPC1 was not significantly different between samples co-transfected with either the fully functional (FL) or the transport-deficient (ΔIBB) isoform. This contrasts with IMPα2 and IMPα6, for which the ΔIBB variants had lower nuclear-localized PSPC1 relative to FL counterparts. The endogenous SFPQ dataset (Fig. 3B and Table 2) differs, with IMPα2, IMPα4 and IMPα6 isoforms each influencing nuclear accumulation (Fn/c) and the percentage of foci-positive cells. Given that SFPQ has not been documented as an IMPα cargo, further investigation would be required to determine if these effects are a result of direct or indirect actions of IMPα. Importantly, all SFPQ paraspeckle parameters are significantly influenced by the IMPα4 isoform (but not by IMPα2 and IMPα6, for which no individual foci parameters were affected). This suggests a unique functional relationship exists between SFPQ and IMPα4 that facilitates SFPQ nuclear import and paraspeckle localization. IMPα4 over-expression does not increase exogenous PSPC1 nuclear accumulation, but increases DsRed2-PSPC1 nuclear foci numbers, indicative of higher paraspeckle numbers in each cell. We hypothesize that IMPα4 over-expression mediates paraspeckle enlargement, potentially through the elevation of SFPQ in paraspeckles, thereby stabilizing NEAT1 RNA17, and enabling higher levels of PSPC1 recruitment and accumulation into paraspeckles.

These findings will be of particular importance in developmental systems in which IMPα levels are dynamically regulated and paraspeckles or components thereof are also present. We previously showed that IMPα2 expression peaks in the embryonic mouse testis (E12.5) and the adult mouse testis at developmental stages overlapping with PSPC1 expression49. NEAT1 transcripts also increase during muscle differentiation from myoblasts into myotubes, when paraspeckles are documented as enlarged and present in greater numbers12. This observation is interesting given that regulated expression of the nuclear transport machinery has also been implicated in muscle differentiation, with increasing IMPα2 linked to myoblast proliferation, myocyte migration and myotube size46.

IMPα2 expression has been identified as a prognostic marker of poor outcome in many cancers58, among them several in which the long non-coding paraspeckle RNA NEAT1 has been independently implicated, including breast59,60,61,62, colon63, liver64 and lung65,66. The link identified here between functional IMPα levels and the nuclear accumulation and localization of PSPC1 and SFPQ to paraspeckles leads us to speculate that enhanced paraspeckle formation and function may affect prognostic outcomes and provide therapeutic targets in oncology. The automated image analysis pipeline allowed for non-subjective, comprehensive examination of subcellular features on a mass scale, with the number of cells analysed extending far beyond what is feasible with manual analysis. This adaptable, high-throughput analysis pipeline could be used to answer other research questions requiring quantification of subtle changes at subcellular levels or larger imaging scales. Within the Imaris cells module, the object named “vesicles” can be used to identify spots or foci, while the “nucleus” and “cell” components will identify larger objects. These three object types do not have to be cells, nuclei or vesicles; they could be anything, micro or macro, that is identifiable by intensity thresholding. Because the parameters from these object types are linked hierarchically within Imaris, the diversity of outputs, and information about their inter-relationships, is extensive. Furthermore, custom parameters can be generated by those with programming knowledge, either by creating Imaris plug-ins (XTensions) or by calculating them from existing Imaris outputs within the R environment for statistical computing. As imaging techniques advance and larger 3D data sets can be acquired in shorter time frames, automated analysis pipelines such as this, which allow subtle subcellular events to be rigorously interrogated across many thousands or millions of cells, will deepen our understanding of fundamental cellular processes.

Materials and Methods

Constructs

GFP-tagged IMPα constructs for mammalian cell expression were generated previously44,50,55 and encoded full length IMPα variants (GFP-IMPα2-FL, GFP-IMPα4-FL, GFP-IMPα6-FL), ED mutants (GFP-IMPα2-ED) and truncated dominant negative IMPαs (GFP-IMPα2ΔIBB, GFP-IMPα4ΔIBB, GFP-IMPα6ΔIBB). The murine PSPC1 sequence (encoding aa 3–523) was amplified by PCR and recombined into a DsRed2-tagged mammalian cell expression vector using the Invitrogen Gateway System, as previously described49.

Cell culture, transfection and indirect immunofluorescent staining

HeLa cells were maintained in Dulbecco’s Modified Eagle Medium with 10% (v/v) fetal calf serum, Penicillin-Streptomycin (Pen-Strep), L-Glutamine and MEM Non-Essential Amino Acids in 5% CO2 at 37 °C. Twenty-four hrs prior to transfection, cells were seeded on round coverslips in medium lacking Pen-Strep, in 12 well plates for siRNA knockdown or 24 well plates for GFP/RFP-tagged construct transfections. Lipofectamine 2000 (Invitrogen) was used to transfect PSPC1 and IMPα2 constructs, following the manufacturer’s method with 2.5 μg of DNA (single plasmid or 1.25 μg of each for co-transfection). The Dharmacon ON-TARGETplus siRNA system (GE Life Sciences) with DharmaFECT 1 transfection reagent was used as per the manufacturer’s instructions. Pre-designed siRNAs targeting IMPα2 (SMARTpool L-004702-00) and IMPα4 (SMARTpool L-017477-00) were used, with a non-targeting (SCRAM siRNA) control pool (D-001810-10) as the siRNA negative control.

At 48 hrs post transfection, cells were fixed in 3.2% paraformaldehyde (in PBS) for 10 min and washed (2 × 5 min, PBS) before proceeding to indirect immunofluorescence staining, as previously described49. To detect endogenous mouse PSPC1 and SFPQ, mouse monoclonal antibodies specific to SFPQ and to the longer PSPC1 isoform were used67. Rabbit anti-IMPα2 (Abcam, cat#ab84440) and goat anti-IMPα4 (Abcam, cat#ab6039) were used to detect IMPα2 and IMPα4, respectively, for immunofluorescence. Primary antibodies (1:100 in 0.5% BSA/PBS) were applied overnight at 4 °C. Secondary antibodies, rabbit anti-mouse Alexa Fluor 546 (Molecular Probes-Invitrogen, cat#A11060) for GFP/RFP-tagged transfections and donkey anti-mouse Alexa Fluor 488 (Molecular Probes-Invitrogen, cat#A21202) plus goat anti-rabbit Alexa Fluor 546 (Molecular Probes-Invitrogen, cat#A11010) or rabbit anti-goat Alexa Fluor 546 (Molecular Probes-Invitrogen, cat#A21085) for siRNA knockdown samples (1:200 in 0.5% BSA/PBS), were applied for 90 min at room temperature.

HeLa cell image acquisition

Imaging was performed using a Leica SP5 laser scanning confocal system (DMI6000 microscope, motorised stage, 63× water/glycerol objective, Monash Micro Imaging Facility). Images were collected as Z-series and tiled in a 7 × 7 field of view grid (coverage of approximately 1.7 mm²), with resonant scanning mode (8000 Hz) used for siRNA samples (coverage of approximately 0.9 mm²).

Imaris-assisted image analysis to detect cells, nucleus and paraspeckles

To assess paraspeckle number, size and PSPC1 intensity within each cell, the Imaris software package “Cells” module (Bitplane, Version 8) was used to batch process identification of cells, their nuclei, and paraspeckles, within the larger image sets described above (as shown in Fig. 1Bi–viii). Throughout the GFP-tagged IMPα transient transfection experiments, Draq5 signal identified the nucleus, GFP signal was used to identify the cell body, and nuclear foci were identified using the particular paraspeckle marker signal under investigation (i.e. PSPC1, SFPQ or DsRed2-PSPC1). For siRNA samples, DAPI signal identified the nucleus, IMPα (IMPα2 or IMPα4) signal identified the cell body, and nuclear foci were identified using the paraspeckle marker signal (PSPC1 or SFPQ). The results (output in CSV format) were combined and manipulated using Python scripts (Python Software Foundation, version 2.7), then analysed using scripts written for the R Project for Statistical Computing (The R Foundation, version 3.2). Incomplete cells, with no nucleus or with their nucleus on the very edge of an image (X, Y or Z image planes), were excluded from datasets for analysis, and GFP thresholding was applied to datasets as described for the GFP-tagged IMPα transient transfection experiments. Additional R packages used for analysis were “car”68, “epitools”69, “geepack”70 and “ggplot2”71. Graphs presented in Fig. 2 were generated using Prism (GraphPad Software, Version 6), while all others were generated using R and the “ggplot2” package.
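The exclusion of incomplete cells described above might look like the following filter. This is an illustrative sketch only: the record layout, the `nucleus_bbox` field and the image dimensions are hypothetical, and the actual Imaris export columns used in the study are not given in the text.

```python
def touches_image_edge(bbox_min, bbox_max, image_shape):
    """True if a bounding box reaches the first or last voxel on any axis."""
    return any(lo <= 0 or hi >= size - 1
               for lo, hi, size in zip(bbox_min, bbox_max, image_shape))


def keep_complete_cells(cells, image_shape):
    """Drop cells with no nucleus or with a nucleus on an X, Y or Z border."""
    kept = []
    for cell in cells:
        bbox = cell["nucleus_bbox"]  # hypothetical field: (min_xyz, max_xyz)
        if bbox is None:
            continue  # no nucleus was detected for this cell
        bbox_min, bbox_max = bbox
        if touches_image_edge(bbox_min, bbox_max, image_shape):
            continue  # nucleus is cut off by the image boundary
        kept.append(cell)
    return kept


# Invented records for a 100 x 100 x 10 voxel image.
image_shape = (100, 100, 10)
cells = [
    {"id": "a", "nucleus_bbox": ((5, 5, 2), (20, 20, 8))},
    {"id": "b", "nucleus_bbox": None},
    {"id": "c", "nucleus_bbox": ((0, 5, 2), (10, 20, 8))},
    {"id": "d", "nucleus_bbox": ((5, 5, 2), (20, 20, 9))},
]
kept = keep_complete_cells(cells, image_shape)
print([cell["id"] for cell in kept])
```

Filtering on the nucleus rather than the whole cell body follows the wording of the exclusion rule; a stricter rule based on the cell boundary would be an equally simple variant.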

Statistical Analysis

For statistical testing, individual cells were assumed to be independent, but paraspeckles within each cell were assumed to be correlated. When analysing the individual cell or paraspeckle data, three outcome types were generated: 1) binary responses based on whether or not a cell was positive for paraspeckles, 2) count data based on the number of paraspeckles within each cell (including/excluding zeroes) and 3) continuous data based on paraspeckle volume sum and paraspeckle PSPC1 intensity sum.
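The three outcome types can be derived from per-paraspeckle records along the following lines. The record layout and cell identifiers are hypothetical and serve only to make the aggregation concrete.

```python
from collections import defaultdict


def per_cell_outcomes(cell_ids, foci):
    """Aggregate per-focus records into binary, count and continuous outcomes.

    cell_ids: every analysed cell, including those without foci.
    foci: iterable of (cell_id, focus volume, focus PSPC1 intensity).
    """
    by_cell = defaultdict(list)
    for cell_id, volume, intensity in foci:
        by_cell[cell_id].append((volume, intensity))
    outcomes = {}
    for cell_id in cell_ids:
        records = by_cell.get(cell_id, [])
        outcomes[cell_id] = {
            "positive": len(records) > 0,                   # 1) binary
            "n_foci": len(records),                         # 2) count
            "volume_sum": sum(v for v, _ in records),       # 3) continuous
            "intensity_sum": sum(i for _, i in records),    # 3) continuous
        }
    return outcomes


# Invented data: cell "c3" has no detected paraspeckles.
cell_ids = ["c1", "c2", "c3"]
foci = [
    ("c1", 0.5, 100.0),
    ("c1", 1.5, 300.0),
    ("c2", 2.0, 250.0),
]
outcomes = per_cell_outcomes(cell_ids, foci)

# Counts "excluding zeroes" keep only foci-positive cells.
nonzero_counts = [o["n_foci"] for o in outcomes.values() if o["positive"]]
print(outcomes["c1"], sorted(nonzero_counts))
```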

Comparisons between groups were made using generalised linear models (GLM): logistic regression for the binary data, and linear regression for the count and continuous data. As the count and continuous data were both skewed, data were transformed using the natural logarithm to allow valid statistical inference from the linear regression models. The p-values are based on the transformed data; however, the results were then back-transformed to give estimates in the original scale for ease of interpretation. By taking the exponent of the mean of log-transformed data, the geometric mean and confidence intervals (CIs) were obtained on the original linear scale. By taking the exponent of the linear regression coefficients obtained on the log-transformed scale, the ratio of the geometric means and their 95% CIs were obtained on the original scale. Odds ratios are given for logistic regression results. When assessing data on a per paraspeckle basis, continuous outcomes were examined, which again required log transformations. Generalised estimating equations (GEE) were used to account for correlation between paraspeckles originating from the same cell72.
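The back-transformations described above can be made concrete with a small two-group example. This sketch covers point estimates only; confidence intervals, p-values and the GEE correlation structure are omitted, and the values are invented. It relies on the fact that, for a two-group linear regression on log-transformed data, the group coefficient equals the difference between the group means of the logs.

```python
import math


def mean_log(values):
    """Mean of natural logs; values must be strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform requires strictly positive values")
    return sum(math.log(v) for v in values) / float(len(values))


def geometric_mean(values):
    """Exponent of the mean of the log-transformed data."""
    return math.exp(mean_log(values))


def ratio_of_geometric_means(treated, control):
    """Exponent of the two-group regression coefficient on the log scale."""
    return math.exp(mean_log(treated) - mean_log(control))


def odds_ratio(pos_treated, neg_treated, pos_control, neg_control):
    """Odds of being foci-positive in treated cells relative to controls."""
    return (pos_treated / float(neg_treated)) / (pos_control / float(neg_control))


# Invented per-cell foci volume sums for two groups.
control = [1.0, 10.0, 100.0]
treated = [2.0, 20.0, 200.0]
gm_control = geometric_mean(control)
gm_ratio = ratio_of_geometric_means(treated, control)

# Invented counts of foci-positive / foci-negative cells.
example_or = odds_ratio(30, 70, 10, 90)
print("GM = %.2f, ratio = %.2f, OR = %.2f" % (gm_control, gm_ratio, example_or))
```

A ratio of geometric means above 1 indicates a larger typical value in the treated group, which is how the regression coefficients are read once returned to the original scale.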

Additional Information

How to cite this article: Major, A. T. et al. Development of a pipeline for automated, high-throughput analysis of paraspeckle proteins reveals specific roles for importin α proteins. Sci. Rep. 7, 43323; doi: 10.1038/srep43323 (2017).
