Main

Single-cell transcriptomics has emerged in recent years as a powerful tool in medical research to study cell-to-cell variability (Saliba et al., 2014). This technology is very appealing for studying the ecophysiology of microbial eukaryotes. Many organisms of interest are not in culture, so large numbers of cells are not available for transcriptomic analyses. Even for those in culture, it would be interesting to learn their gene expression in situ. Single-cell transcriptomics offers the ability to target organisms of interest from environmental samples, thereby avoiding wasting sequencing capacity on non-target taxa. It also provides a means of obtaining genetic information from several co-occurring organisms in microbial communities without the need to bin sequences or align to reference genomes, as in metatranscriptome studies. Kolisko et al. (2014) described the first successful test of single-cell RNA-seq for microbial eukaryotes. They reported that transcriptome coverage from single cells was comparable to that of culture-based transcriptomes for five different ciliates with sizes ranging from 50 to 500 μm. However, microbial eukaryotes have a wide range of sizes and can be as small as 1 μm (Caron et al., 2009). The feasibility of single-cell RNA-seq in smaller microbial eukaryotes remains unknown. Here we report that the transcript recovery rate of single-cell RNA-seq was significantly limited in two small microbial eukaryotes. We estimated that these smaller organisms contained only thousands to tens of thousands of total mRNA molecules per cell. We discuss the application of single-cell RNA-seq in small microbial eukaryotes in the context of these limitations.

Single-cell and culture-based transcriptomes of two microbial eukaryotes, the dinoflagellate Karlodinium veneficum (cell length of ~15 μm) and the haptophyte Prymnesium parvum (cell length of ~8 μm), were sequenced, assembled and compared. The assembled transcriptomes contained 63 184 and 38 704 transcripts for K. veneficum and P. parvum, respectively. Most of these transcripts were detected in the culture-based transcriptomes. In comparison, only ~15% of the transcripts were detected on average in the transcriptomes of single cells of K. veneficum, while the average transcript recovery rate was ~3% for the smaller P. parvum single cells (Table 1). These rates were much lower than those documented for ciliate species of larger size (80–100%; Kolisko et al., 2014). When single-cell data were combined, transcriptomes summed from 10 K. veneficum cells recovered two-thirds of the transcripts observed in the culture-based transcriptome, while transcripts summed from 18 P. parvum cells recovered less than one-third (Figure 1). A lower gene recovery rate was also reported by Kolisko et al. (2014) for their smallest cell (Tetrahymena thermophila, ~50 μm), mainly because >90% of its reads were from one rRNA contig. No such bias was observed in this study: reads from all rRNA contigs combined, or from the most represented contig, never accounted for >12% of total reads in any single-cell sample.

Table 1 Summary of single-cell and culture-based transcriptomes of K. veneficum and P. parvum, and estimates of mRNA molecules per cell in the two species
Figure 1

Heatmap of expression levels (in the form of Log2 of FPKM values) of K. veneficum (a) and P. parvum (b) transcripts in culture and single cells. Transcripts were grouped by presence/absence in single cells. Transcripts detected in multiple cells were arranged by hierarchical clustering of expression patterns among all samples.

In addition to low transcript recovery rates, we also observed much larger variability among single-cell transcriptomes than is typically observed in mammalian studies. Approximately half of the transcripts detected in single-cell transcriptomes were detected in only one cell. Very few transcripts (220 K. veneficum and 18 P. parvum transcripts), usually those with the highest expression levels in the culture-based transcriptomes, were detected in all single cells. Among transcripts detected in multiple cells, expression levels often varied markedly between cells (Figure 1). Many known housekeeping genes, such as those encoding ribosomal proteins, were not detected in many cells. Despite the large cell-to-cell variability at the gene level, the collective expression levels of major pathways and functions were very similar across cells, except in one cell (cell #9) with an extremely low transcript recovery rate (Supplementary Figure 1). These results suggested that the observed differences among single-cell transcriptomes were unlikely to reflect physiological differences among cells, and more likely reflected elevated stochasticity at the level of individual genes.
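
The detection counts described above (transcripts found in only one cell versus in all cells) can be tallied with a few lines of code. The following is a minimal sketch, not the analysis pipeline used in this study: the function names, the input layout (a mapping from cell label to the set of transcript IDs detected in that cell) and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def detection_frequency(cells):
    """Count, for each transcript, the number of cells in which it was detected.

    `cells` maps a cell label to the set of transcript IDs detected in that
    cell (hypothetical input layout, chosen for this illustration).
    """
    counts = Counter()
    for detected in cells.values():
        # set() guards against duplicate IDs within one cell
        counts.update(set(detected))
    return counts


def summarize(cells):
    """Summarize how many transcripts are cell-specific or found in every cell."""
    counts = detection_frequency(cells)
    n_cells = len(cells)
    return {
        "detected": len(counts),
        "single_cell_only": sum(1 for c in counts.values() if c == 1),
        "all_cells": sum(1 for c in counts.values() if c == n_cells),
    }


# Toy data only; these are not transcripts from K. veneficum or P. parvum.
toy_cells = {
    "cell_1": {"t1", "t2", "t3"},
    "cell_2": {"t1", "t4"},
    "cell_3": {"t1", "t2", "t5"},
}
summary = summarize(toy_cells)
print(summary)
```

In the toy data, three of the five detected transcripts occur in a single cell and only one occurs in all three cells, which mirrors qualitatively the pattern reported for the real single-cell transcriptomes.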

Both low transcript recovery rate and high gene-level variability could result from relatively low RNA content per cell. Single-cell RNA-seq has been applied successfully in human cells, which are estimated to contain 50 000–300 000 mRNA molecules per cell (Marinov et al., 2014). On the other hand, it is considered not suitable for bacteria (Taniguchi et al., 2010), which have only 200–2000 mRNA molecules per cell (Moran et al., 2013). Numbers of mRNA molecules per cell in K. veneficum and P. parvum were estimated using two methods, based on either total RNA extraction amounts or RNA spike-in standards. Results from both methods were similar. K. veneficum and P. parvum contained ~51 000 and ~4 800 mRNA molecules per cell on average, respectively (Table 1). These mRNA copy numbers limited the inventory of transcripts that these two organisms could possibly carry at any particular time. Our in silico simulations showed that mRNA copy numbers fewer than 100 000 could significantly limit transcript recovery rate in these two organisms (Supplementary Figure 2A). In microbial eukaryotes of similar sizes, which probably have similar mRNA copy numbers, low transcript recovery rate per cell can be expected, unless they have very small genomes.
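
To illustrate why a small mRNA pool caps transcript recovery, the sketch below draws a fixed number of mRNA molecules at random, with probability proportional to each transcript's relative abundance, and reports the fraction of transcripts represented at least once. This is an assumed sampling model, not a description of the simulations behind Supplementary Figure 2A; the Zipf-like abundance profile, the transcript count of 5000 and the function names are hypothetical.

```python
import random


def toy_abundances(n_transcripts):
    """Hypothetical skewed abundance profile: the transcript of rank r has weight 1/r."""
    return [1.0 / rank for rank in range(1, n_transcripts + 1)]


def simulate_recovery(abundances, n_molecules, seed=0):
    """Fraction of transcripts present at least once in a cell's mRNA pool.

    The pool is modeled as `n_molecules` independent draws, each transcript
    being chosen with probability proportional to its relative abundance
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    transcript_ids = range(len(abundances))
    drawn = rng.choices(transcript_ids, weights=abundances, k=n_molecules)
    return len(set(drawn)) / len(abundances)


abund = toy_abundances(5000)
for n_molecules in (500, 5000, 50000):
    fraction = simulate_recovery(abund, n_molecules)
    print(f"{n_molecules} molecules: {fraction:.2f} of transcripts present")
```

Under this model the recovered fraction can never exceed the number of molecules divided by the number of transcripts, so a cell carrying a few thousand mRNA molecules cannot represent a transcriptome of tens of thousands of transcripts, regardless of sequencing depth.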

Gene transcription generally occurs in stochastic bursts (Golding et al., 2005; Suter et al., 2011), and single-cell transcriptomes of cells with relatively few mRNA molecules are much more susceptible to biological and technical stochasticity (Marinov et al., 2014). Because of the small mRNA copy numbers in the two species examined in this study, it was doubtful that the single-cell gene expression levels and transcript presence/absence in different cells were reliable. Average expression levels of transcripts in single cells showed no correlation with those in cultures in either organism, except for transcripts with extremely high expression levels (Supplementary Figure 2C and D). In human cells, 30–100 single cells are needed to reliably measure gene expression levels (Marinov et al., 2014). In small microbial eukaryotes, many more cells would be needed to achieve the same goal. Caution is warranted when interpreting single-cell transcriptome comparisons of different samples, especially if the cells are small.

Our data illustrated a simple but important concept: when using single-cell transcriptomics with microbes, size matters. Less efficient gene discovery and higher stochasticity in gene expression levels should be expected when designing experiments using single-cell transcriptomics on smaller microbial eukaryotes. However, such limitations should in no way discourage the application of the technology in studying these organisms. A simple solution exists: combining multiple single-cell transcriptomes of the same organism. Our simulations showed that, in cells with mRNA copy numbers similar to those of K. veneficum, 25 cells combined should recover most transcripts. In smaller protists such as P. parvum, more than 100 cells are likely needed (Supplementary Figure 2B). With this in mind, we tested microfluidic single-cell RNA-seq of P. parvum because of its demonstrated ability to capture a large number of single cells quickly (Wu et al., 2014). However, our test was less successful than anticipated for this species (we obtained nine single-cell transcriptomes out of 96 wells), presumably because P. parvum was smaller than the smallest designed cell size (10 μm) of any chip available at the time. The transcriptomes obtained were similar to those obtained from manually isolated cells (Figure 1b). We believe that with some optimization, high-throughput single-cell transcriptomics of microbial eukaryotes should be achievable in the near future.
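
The effect of combining single-cell transcriptomes can be expressed as a cumulative recovery curve: the fraction of reference transcripts seen after pooling the first one, two, three (and so on) cells. The sketch below shows one way to compute such a curve. It is an illustration under assumed inputs, not the procedure used for Figure 1 or Supplementary Figure 2B; the function name, the transcript IDs and the cell order are hypothetical.

```python
def cumulative_recovery(cell_sets, reference):
    """Fraction of reference transcripts recovered as cells are pooled in order.

    `cell_sets` is a list of sets of transcript IDs, one per cell, and
    `reference` is the set of transcripts in the reference (for example,
    culture-based) transcriptome. Both layouts are assumptions of this sketch.
    """
    reference = set(reference)
    seen = set()
    curve = []
    for detected in cell_sets:
        # Only transcripts that belong to the reference count toward recovery
        seen |= set(detected) & reference
        curve.append(len(seen) / len(reference))
    return curve


# Toy data only: eight reference transcripts and four cells.
reference = {"t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"}
cells = [{"t1", "t2"}, {"t2", "t3"}, {"t1", "t4", "t5"}, {"t9"}]
curve = cumulative_recovery(cells, reference)
print(curve)
```

Because the curve depends on the order in which cells are added, averaging it over several random orderings gives a fairer picture of how many cells are needed to reach a target recovery.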

Single-cell transcriptomics has already been used to advance our knowledge of microbial eukaryotes (for example, Balzano et al., 2015 and Gravelis et al., 2015). Undoubtedly, it will continue to shine as a powerful tool in studying microbial eukaryotes in nature, large and small, especially when gene discovery is still one of the main goals in the field (Keeling et al., 2014).