A microfluidic platform enabling single-cell RNA-seq of multigenerational lineages

We introduce a microfluidic platform that enables off-chip single-cell RNA-seq after multi-generational lineage tracking under controlled culture conditions. We use this platform to generate whole-transcriptome profiles of primary, activated murine CD8+ T-cell and lymphocytic leukemia cell line lineages. Here we report that both cell types have greater intra- than inter-lineage transcriptional similarity. For CD8+ T-cells, genes with functional annotation relating to lymphocyte differentiation and function—including Granzyme B—are enriched among the genes that demonstrate greater intra-lineage expression level similarity. Analysis of gene expression covariance with matched measurements of time since division reveals cell type-specific transcriptional signatures that correspond with cell cycle progression. We believe that the ability to directly measure the effects of lineage and cell cycle-dependent transcriptional profiles of single cells will be broadly useful to fields where heterogeneous populations of cells display distinct clonal trajectories, including immunology, cancer, and developmental biology.


Supplementary Figure 4 | Comparison of Spearman distances between sister cells, cousin cells, and unrelated cells for CD8+ T cells (n = 43
, 73 and 4,544 respectively) and L1210 cells (n = 37, 60 and 3,064 respectively) as in Figure 2 for (a) the entire gene sets for both cell types (9,997 genes and 10,658 genes for CD8+ T cells and L1210, respectively) (b) a subset of genes with cell cycle related gene annotations (688 genes total, Supplementary Table 2) for CD8+ T cells and (c) a subset of genes with gene annotations related to T cell activation and function (142 genes total, Supplementary Table 3) for CD8+ T cells. Because Spearman distance is a rank-based metric, the observation weights determined for each cell/gene pair did not apply. Instead, the analysis was limited to genes with a mean expression level of ln(TPM+1) greater than 3 in order to reduce the noise associated with rank-ordering low-expression-level genes. Groups were compared with a Mann-Whitney U test. After Bonferroni correction: * p<0.05, ** p<0.01, ***p<0.001.  To determine the effect of pre-existing CD8+ T cell subsets on transcriptional similarity between cells, unrelated pairs of CD8+ T cells were ranked by the absolute value of the difference in their subset scores (Δ subset score) and the dataset was split in thirds to produce groups with high (top third, n = 1,514) and low (bottom third, n = 1,514) differences in subset scores. The Euclidean distances between cell pairs in each of these groups was then compared with a Mann-Whitney U test for (b) the entire gene list (9,997 genes) and (c) a subset of genes relating to T cell activation and function (142 genes, Supplementary Table 3). Both of these comparisons revealed that cells with a smaller difference in subset scores were more transcriptionally similar, with the effect appearing to be more pronounced in the subset of genes relating to T cell function.

Supplementary Figure 5
To reduce the effects of pre-existing CD8+ T cell subsets when comparing transcriptional similarity, Euclidean distance measurements of CD8+ sister (n = 43) and cousin (n = 73) cell pairs were compared to unrelated cell pairs with the smallest differences in subset scores (low Δ subset score, n = 1,514) for (d) the complete gene list (9,997 genes) and (e) for a subset of genes relating to T cell activation and function (142 genes, Supplementary Table 3). These results indicate that unrelated cell pairs with similar subset scores still show less transcriptional similarity than related cell pairs which suggests that, although pre-existing CD8+ T cell subsets may partially contribute to inter-lineage transcriptional variability, they are not the main drivers of differences in transcriptional similarity between related and unrelated cell pairs. After Bonferroni correction: * p<0.05, ** p<0.01, ***p<0.001.

Supplementary Figure 7
Supplementary Figure 7 | Comparing gene expression similarity in sister cells versus cousin cells. The difference in Spearman correlation coefficients for expression levels in sister cell pairs and cousin cell pairs (ρ diff = ρ sisters -ρ cousins ) was determined for each gene in (a) CD8+ T cells (9,997 genes total) and (b) L1210 cells (10,658 genes total). To determine genes that were expressed more similarly in sister or cousin cells, we defined an expected null distribution of ρ diff with a mean of zero and a standard deviation approximated as the average standard deviations of ρ diff we observed in these two cell types (0.15). We then defined the values of ρ diff corresponding to the top and bottom 1% of this distribution (± 0.349, vertical dashed lines) as the thresholds for the highest and lowest values of ρ diff to determine the genes that are more similarly expressed in sister cells or cousin cells, respectively. Both cell types demonstrated subsets of genes which were more similarly expressed in sisters as compared to cousins (149 genes and 288 genes for CD8+ and L1210 cells respectively, Supplementary Table 5). To determine the biological function of these genes, these lists were used for functional enrichment analysis for both cell types (Supplementary Table 5). Both cell types showed very few genes (<10) that were more similarly expressed in cousins as compared to sisters and these genes did not reveal any functional enrichment. This result is consistent with the positive skew observed for the ρ diff distributions for both L1210 and CD8+ T cells (γ = 0.294 and 0.318, respectively), which suggests that, for both cell types, there are more genes with a higher correlation between sister cell expression levels than between cousin cell expression levels.

Supplementary Figure 8
Supplementary Figure 8 | Comparison of transcriptional similarity between unrelated cells with differing cell cycle stage proximities. (a) Global gene expression-based Euclidean distances between unrelated pairs of L1210 and CD8+ T cells were ranked by their corresponding difference in times since division (Δt) and the dataset was split in thirds to produce groups with low (bottom third, n = 1,514 and 1,021 for CD8+ and L1210 cells, respectively) and high (top third, n= 1,514 and 1,021 for CD8+ and L1210 cells, respectively) Δt values corresponding to cells with more and less similar cell cycle stages, respectively. These groups were then compared with a Mann-Whitney U test. These results demonstrate that cell pairs with smaller differences in times since division are more transcriptionally similar.  analysis as in (a,b) applied to a subset of genes with gene annotations related to T cell activation and function (142 genes total, Supplementary Table 3) for CD8+ T cells. These comparisons show that the effect of cell cycle stage proximity on transcriptional similarity is less pronounced for genes related to T cell function than for cell cycle related genes in CD8+ T cells.
To reduce the effects of cell cycle stage differences in transcriptional similarity measurements, Euclidean distance measurements of sister (n = 43 and 37 for CD8+ and L1210 respectively) and cousin cell pairs (n = 73 and 60 for CD8+ and L1210 respectively) were compared to unrelated cell pairs that had a difference in times since division (Δt) of less than 2 hours (n = 1,006 and 495 for CD8+ and L1210 respectively) for (d) the entire gene list (9,997 genes and 10,658 genes for CD8+ T cells and L1210 respectively). (e) Same analysis as in (d) for genes relating to cell cycle progression in CD8+ T cells (688 genes total, Supplementary Table 2). (f) Same analysis as in (e) for genes relating to T cell activation and function (142 genes total, Supplementary Table 3). These results indicate that unrelated cell pairs with similar cell cycle stages (small Δt) still show less transcriptional similarity than related cells, which suggests that cell cycle proximity is not the main driver of differences in transcriptional similarity between related and unrelated cell pairs. After Bonferroni correction: * p<0.05, ** p<0.01, ***p<0.001.

Supplementary Figure 9
Supplementary Figure 9 | Plot of the coefficients of determination (R 2 ) between the first latent variable scores and single-cell measurements of time since division as a function of the number of VIP ranked genes included in the model for (a) CD8+ T cells and (b) L1210 cells. These R 2 values indicate the fraction of gene expression variance due to time since division that is explained by these models. The variance explained by the full model (black) and cross-validated model (red) are both included for comparison. The amount of variance explained by the model appears to plateau at around 300 genes for each cell type. For this reason, we used a subset of genes with the top 300 VIP scores for the final model construction presented in Figure 2 e, f.