Elucidation of the identity and diversity of mechanisms that sustain long-term human blood cell production remains an important challenge. Previous studies indicate that, in adult mice, this property is vested in cells identified uniquely by their ability to clonally regenerate detectable, albeit highly variable levels and types, of mature blood cells in serially transplanted recipients. From a multi-parameter analysis of the molecular features of very primitive human cord blood cells that display long-term cell outputs in vitro and in immunodeficient mice, we identified a prospectively separable CD33+CD34+CD38−CD45RA−CD90+CD49f+ phenotype with serially transplantable, but diverse, cell output profiles. Single-cell measurements of the mitogenic response, and the transcriptional, DNA methylation and 40-protein content of this and closely related phenotypes revealed subtle but consistent differences both within and between each subset. These results suggest that multiple regulatory mechanisms combine to maintain different cell output activities of human blood cell precursors with high regenerative potential.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The authors thank M. Hale, G. Edin and the British Columbia Cancer Agency Stem Cell Assay Laboratory staff for technical assistance including the initial processing of CB samples. This work was supported by a Terry Fox Foundation New Frontiers Program Project, a Stem Cell Network of Centres of Excellence grant and grants from the Canadian Institutes of Health Research (CIHR). D.J.H.F.K. held a CIHR Vanier Scholarship. C.A.H. and P.H.M. held a CIHR Frederick Banting and Charles Best Doctoral Scholarship. DVS Sciences provided the Palladium barcoding reagents used in the mass cytometry experiments.
Integrated supplementary information
Supplementary Figure 1 Data normalization for cross-platform comparison of flow and mass cytometric data.
(a) Gating hierarchy for cells in the CD49f CB population using mass cytometry. (b) Raw data (left) from 3 independent index sort FACS experiments are shown in blue, magenta and aquamarine, and from mass cytometry experiments in grey. Rank-normalized data (used for determining nearest-neighbours for sorted cells in the mass cytometry data) are shown in the middle panels. The right panels show mean-scaled data (channel z-score) for the flow cytometry data only (used in surface marker comparisons between experiments).
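The two normalizations named in panel (b) can be illustrated with a minimal sketch. The legend does not say how ranks were scaled, how ties were handled, or which standard deviation was used, so the choices below (average ranks divided by n, population standard deviation) and the function names are assumptions made for illustration only.

```python
from statistics import mean, pstdev

def rank_normalize(values):
    """Replace each value by its rank divided by n; ties share the average rank.

    Assumption: the source does not specify tie handling or rank scaling.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [r / n for r in ranks]

def z_score(values):
    """Mean-scale one channel: subtract the mean, divide by the standard deviation.

    Assumption: population standard deviation; the source does not say which was used.
    """
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]

print(rank_normalize([10, 30, 20, 20]))
print(z_score([1.0, 2.0, 3.0]))
```

Rank normalization discards the measurement scale entirely, which is one plausible reason to use it when matching cells across two instruments, whereas z-scoring keeps relative distances within a single platform.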
Supplementary Figure 2 Additional marker intensities of the nearest neighbours of index-sorted 49f CB cells with different proliferative activities in LTC.
All surface (a) and intracellular (b) markers that were not in the top 5 are shown here (see Figure 2d for the top 5 markers and Figure 2e for exact p values, determined using the Kruskal-Wallis rank sum test; n=1,614 nearest neighbours in total). Differences that did not reach significance are marked with "NS". Lines indicate median values. Values represent asinh(marker intensity/5).
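As a small illustration of the asinh(marker intensity/5) scale used for the plotted values, the sketch below applies that transform with Python's standard library. The function name and the `cofactor` parameter name are hypothetical; only the divisor of 5 comes from the legend.

```python
import math

def arcsinh_transform(intensity, cofactor=5.0):
    """Return asinh(intensity / cofactor); the cofactor of 5 is taken from the legend."""
    return math.asinh(intensity / cofactor)

# Close to linear near zero, close to logarithmic for large intensities.
for raw in (0.0, 0.5, 5.0, 1000.0):
    print(raw, arcsinh_transform(raw))
```

The transform is defined at zero and for negative values, unlike a plain logarithm, which is a common reason for using it with cytometry intensities.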
Supplementary Figure 3 Logistic regression classifier guided gating for 49f CB cells that are highly proliferative in LTC.
(a) A receiver operating characteristic (ROC) curve for a logistic regression classifier trained to detect 49f CB cells with high proliferative potential is shown (created using the R library ‘ROCR’). Classifier variables were selected using a step-wise procedure to minimize information loss while also minimizing the number of markers (using the R function ‘step’). (b) Threshold selection based on the F-measure (harmonic mean of precision and recall). A red point indicates the selected threshold. (c) Plot coordinates of all analyzed 49f CB cells over all combinations of the 3 markers (CD34, CD90, and CD10) included in the final model. Marker values represent the mean-scaled fluorescence values (channel z-score). Colour coding indicates proliferative capacity as in Figure 2 (n=538). Blue circles indicate cells selected based on the logistic regression model. (d) Mean-scaled surface marker intensities for each input cell from the barcode tracking experiments (n=61) are plotted as a function of the total number of cells it produced at the time of sacrifice of the primary mice. Colour coding of clone lineage composition is indicated in the key in Figure 4. (e) PCA performed based on surface marker intensities (mean-scaled) with point sizes proportional to corresponding clone sizes. (f) Mapping of the barcoded clones to the t-SNE mass cytometric distribution for the 49f CB compartment. As in Figure 2c, nearest neighbours for all members of a given initial cell type were pooled and used to generate a probability density, indicated by colour intensity. The lowest level contains 95% of the total probability density with levels at each 10% mark thereafter. Point estimates for each input cell represent the median t-SNE value of their nearest neighbours in each t-SNE dimension. Point size is proportional to the size of the clone it identifies, assessed at the time of sacrifice of the primary mice (30+ weeks post-transplant). A contour showing the boundary containing 50% of the probability density of the highly proliferative input cells detected in the LTC assays (Figure 2c) is shown in black.
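The threshold selection described in panel (b) can be sketched as follows. This is an illustrative Python version, not the authors' R code: the function names, the toy scores and the rule of scanning every observed score as a candidate threshold are all assumptions. Only the definition of the F-measure as the harmonic mean of precision and recall comes from the legend.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def best_threshold(scores, labels):
    """Return (threshold, F) maximizing the F-measure over the observed scores.

    A cell is called positive when its classifier score is >= threshold.
    Assumption: candidate thresholds are the distinct scores themselves.
    """
    best_f, best_t = 0.0, None
    for t in sorted(set(scores)):
        predicted = [s >= t for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y)
        fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
        fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f = f_measure(precision, recall)
        if f > best_f:
            best_f, best_t = f, t
    return best_t, best_f

# Hypothetical classifier scores and true labels (True = highly proliferative).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [False, False, True, True]
print(best_threshold(scores, labels))
```

Maximizing the F-measure balances missed positives against false calls, which matters when the positive class (here, highly proliferative cells) is a small fraction of the population.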
FACS gating hierarchies are shown for experiment 1 (a) and 2 (b), respectively. Gates based on mass cytometric data are shown in (c).
All surface (a) or intracellular (b) markers that were not in the top 5 are shown here. See Figure 5b for exact p values (n=2,713 cells total). The top 5 markers are shown in Figure 4. Differences that did not reach significance (Holm-corrected Kruskal-Wallis rank sum test) are marked with "NS". Lines indicate median values. Values represent asinh(marker intensity/5).
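For readers unfamiliar with the Holm correction mentioned above, the following is a minimal sketch of the standard step-down adjustment applied to a list of p values. The function name and the example p values are hypothetical and are not taken from the study.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical per-marker p values.
print(holm_adjust([0.01, 0.04, 0.03]))
```

A marker would then be labelled "NS" when its adjusted p value exceeds the chosen significance level; the level used in the study is not stated in this legend.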
(a) Percent of reads for each single-cell library mapped to ribosomal (left), mitochondrial (middle), or spike-in control RNA (right). Median values are shown as dashed lines. Library size and number of expressed genes per cell are shown in (b) and (c), respectively (n=432 cells from 2 separate donors). Median values are shown as dashed lines. (d) Distribution of gene expression values (n=58,143). The dashed blue line indicates the minimum threshold for inclusion. (e) Library size by computed size factor for data normalization (n=411 cells from 2 separate donors). (f) Gene dispersion as a function of average expression (n=411 cells from 2 separate donors). Genes included for PCA upstream of t-SNE are labelled.
(a) Distribution of CpG coverage for negative controls (n=6), single cells (n=140), and 10-cell controls (n=12). Alignment rate per cell is indicated by colour. For all box plots, the box shows the interquartile range, and whiskers extend to the furthest data point no more than 1.5x the total box length from the edge of the box. (b) Measured global methylation levels per cell for the CD33+CD90+ subset of 49f CB cells, and all other 49f CB cells (n=136 single cells, 12 ten-cell controls, 2 separate CB donors). Conversion rates for single cells and 10-cell controls are shown for the positive (Lambda DNA) and negative (T7 DNA) controls in (c) and (d), respectively (n=140 single cells, 12 ten-cell controls, 2 separate donors). Box-and-whisker plots are as in (a). (e) CNV in autosomal chromosomes (n=140 single cells, 2 separate donors). The threshold for calling a cell abnormal is shown as a solid line. (f) Number of genes associated with each DMR that was hypomethylated in the CD33+CD90+ subset (left, n=11,498 regions) or any other 49f CB cell (right, n=15,926 regions). (g) Distance to the associated transcriptional start sites (TSS) for DMRs associated with genes determined using rGREAT (n=4,725 and 5,925 regions for CD33+CD90+ and other 49f CB cells, respectively).
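The whisker rule stated in panel (a) can be written out as a short sketch. The quartile convention is an assumption (Python's "inclusive" method); plotting packages differ slightly in how they compute quartiles, and the legend does not say which was used. The function name and the sample data are hypothetical.

```python
from statistics import quantiles

def tukey_whiskers(data, k=1.5):
    """Return (lower whisker, upper whisker, outliers) for a box plot.

    Whiskers reach the furthest data points lying within k box lengths
    (k times the interquartile range) of the box edges.
    """
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    box_length = q3 - q1
    low_fence = q1 - k * box_length
    high_fence = q3 + k * box_length
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return min(inside), max(inside), outliers

# Hypothetical coverage values with one extreme point.
print(tukey_whiskers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

With this rule the extreme value is drawn as an individual point beyond the whisker rather than stretching the whisker to reach it.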
Supplementary Figures 1–7, Supplementary Table legends.
Antibodies used in the mass cytometric analyses.
Details of transplanted index-sorted 49f CB cells.
Antibodies used in index sorts of 49f CB subsets.
Antibodies used to subdivide the 49f CB population.
Human-specific antibodies used to analyse mice injected with lenti-virally barcoded, index-sorted 49f CB cells.
Antibodies used in index sorts of 49f CB subsets for single-cell RNA-Seq analyses.
Primers used for RNA sequencing.
Antibodies used in index sorts of 49f CB subsets for single-cell PBAL analyses.
Statistics source data.