Diffusion pseudotime robustly reconstructs lineage branching

Abstract

The temporal order of differentiating cells is intrinsically encoded in their single-cell expression profiles. We describe an efficient way to robustly estimate this order according to diffusion pseudotime (DPT), which measures transitions between cells using diffusion-like random walks. Our DPT software implementations make it possible to reconstruct the developmental progression of cells and identify transient or metastable states, branching decisions and differentiation endpoints.
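
As an illustration of the idea, the following is a minimal sketch of a diffusion-pseudotime-style computation. It is not the released Matlab or R software. It assumes a dense Gaussian kernel with a single global width `sigma`, a user-chosen `root` cell, and an expression matrix `X` with one row per cell; the function name and its parameters are hypothetical. Random walks of all lengths are accumulated in closed form, and the pseudotime of a cell is the Euclidean distance between its row of the accumulated matrix and the row of the root cell.

```python
import numpy as np

def diffusion_pseudotime(X, root, sigma=1.0):
    """Sketch: pseudotime of every cell (row of X) relative to cell `root`."""
    # Pairwise squared Euclidean distances between cells.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian kernel; the diagonal is zeroed so a walker always moves on.
    K = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(K, 0.0)
    # Density normalisation (assumed here to reduce sampling-density effects).
    z = K.sum(axis=1)
    W = K / np.outer(z, z)
    # Symmetric transition matrix T = D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    T = W / np.sqrt(np.outer(d, d))
    # Eigenvector of T with eigenvalue 1 (the stationary component).
    psi0 = np.sqrt(d) / np.linalg.norm(np.sqrt(d))
    # Accumulate walks of all lengths with the stationary part removed:
    # M = sum_{t>=1} B^t = (I - B)^{-1} - I, where B = T - psi0 psi0^T.
    n = X.shape[0]
    B = T - np.outer(psi0, psi0)
    M = np.linalg.inv(np.eye(n) - B) - np.eye(n)
    # Pseudotime: distance between each cell's row of M and the root's row.
    return np.linalg.norm(M - M[root], axis=1)

# Toy data (hypothetical): two well-separated groups of cells in one dimension.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
pseudotime = diffusion_pseudotime(X, root=0, sigma=1.0)
```

In this toy example the cells in the root's own group receive smaller pseudotime values than the cells in the distant group. The dense matrix inverse limits the sketch to small data sets.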

Figure 1: Diffusion pseudotime reveals temporal ordering and cellular decisions on the single-cell level.
Figure 2: Diffusion pseudotime identifies differentiation dynamics from scRNA-seq data (ref. 9).

Accession codes

Primary accessions

Gene Expression Omnibus

References

  1. Shalek, A.K. et al. Nature 498, 236–240 (2013).
  2. Moignard, V. et al. Nat. Cell Biol. 15, 363–372 (2013).
  3. Treutlein, B. et al. Nature 509, 371–375 (2014).
  4. Magwene, P.M., Lizardi, P. & Kim, J. Bioinformatics 19, 842–850 (2003).
  5. Trapnell, C. et al. Nat. Biotechnol. 32, 381–386 (2014).
  6. Bendall, S.C. et al. Cell 157, 714–725 (2014).
  7. Setty, M. et al. Nat. Biotechnol. 34, 637–645 (2016).
  8. Macosko, E.Z. et al. Cell 161, 1202–1214 (2015).
  9. Klein, A.M. et al. Cell 161, 1187–1201 (2015).
  10. Paul, F. et al. Cell 163, 1663–1677 (2015).
  11. Coifman, R.R. et al. Proc. Natl. Acad. Sci. USA 102, 7426–7431 (2005).
  12. Haghverdi, L., Buettner, F. & Theis, F.J. Bioinformatics 31, 2989–2998 (2015).
  13. Moignard, V. et al. Nat. Biotechnol. 33, 269–276 (2015).
  14. Huber, T.L., Kouskoff, V., Fehling, H.J., Palis, J. & Keller, G. Nature 432, 625–630 (2004).
  15. Costa, G., Kouskoff, V. & Lacaud, G. Trends Immunol. 33, 215–223 (2012).
  16. Finak, G. et al. Genome Biol. 16, 278 (2015).
  17. Gut, G., Tadmor, M.D., Pe'er, D., Pelkmans, L. & Liberali, P. Nat. Methods 12, 951–954 (2015).
  18. Angerer, P. et al. Bioinformatics 32, 1241–1243 (2016).
  19. von Luxburg, U. Stat. Comput. 17, 395–416 (2007).
  20. Buettner, F. et al. Nat. Biotechnol. 33, 155–160 (2015).

Acknowledgements

We would like to acknowledge C. Marr, J. Hasenauer, M. Heinig, J. Krumsiek, T. Blasi and P. Angerer for their helpful advice and comments on the manuscript. M.B. is supported by a DFG Fellowship through the Graduate School of Quantitative Biosciences Munich (QBM). F.A.W. acknowledges support from the Helmholtz Postdoc Programme, Initiative and Networking Fund of the Helmholtz Association. F.B. is supported by the UK Medical Research Council (MRC) via a Career Development Award (MR/M01536X/1). F.J.T. acknowledges financial support by the German Science Foundation (SFB 1243 and Graduate School QBM) as well as by the Bavarian government (BioSysNet).

Author information

Contributions

L.H. developed the method and the computational tools, performed the analysis and wrote the paper and the supplement. M.B. contributed to the analysis and biological interpretation of results and wrote the supplement. F.A.W. helped interpret the results and write the supplement, and he wrote the paper. F.B. helped interpret the results. F.J.T. conceived and supervised the study, contributed to the method development and wrote the paper with help from all coauthors.

Corresponding author

Correspondence to Fabian J Theis.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Metastable states of mouse early blood development qPCR data

a) Diffusion map plot illustrating four metastable states along pseudotemporal ordering. Lower right: Precursor state. Left: Tip branch 1. Upper right: Decision state (light gray) and tip branch 2 (dark gray). b) Histogram plot of the cell density along the branches. Blue bars: branch 1, black bars: branch 2. Both branches share the precursor branch up to the decision state (gray bars).

Supplementary Figure 2 Differential expression analysis using MAST on mESC inDrop data

Log-fold change (lfc) analysis of the DPT-inferred ‘decision’ group vs. all other groups (a,c,e) and of head fold cells vs. primitive streak and 4SG-negative cells (b,d). The displayed genes were filtered for an lfc > 1 and a Bonferroni-adjusted p-value < 0.01. Plots are ordered by absolute lfc between the states. a) Decision area (red) vs. precursor area (blue). b) Head fold (red) vs. primitive streak (blue). c) Decision area (red) vs. branch 2 end point (blue). d) Head fold (red) vs. 4SG-negative cells (blue). e) Decision area (red) vs. branch 1 end point (blue).
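
The filtering and ordering step described in this caption can be sketched as follows. This is only an illustration and not the MAST workflow itself: the gene names and values are invented, the function name is hypothetical, and the lfc threshold is assumed to apply to the magnitude of the log-fold change (the caption states "lfc > 1" but orders plots by absolute lfc).

```python
def filter_genes(results, lfc_min=1.0, alpha=0.01):
    """Keep genes passing both thresholds, ordered by decreasing |lfc|.

    results: list of (gene, lfc, raw_p_value) tuples, one per tested gene.
    """
    n_tests = len(results)
    kept = []
    for gene, lfc, p in results:
        # Bonferroni adjustment: multiply by the number of tests, cap at 1.
        p_adj = min(1.0, p * n_tests)
        # Assumption: the lfc threshold is applied to the absolute value.
        if abs(lfc) > lfc_min and p_adj < alpha:
            kept.append((gene, lfc, p_adj))
    # Order by absolute log-fold change, largest first.
    kept.sort(key=lambda row: abs(row[1]), reverse=True)
    return kept

# Invented example values, for illustration only.
example = [
    ("geneA", 2.5, 1e-6),
    ("geneB", -3.0, 1e-4),
    ("geneC", 0.5, 1e-8),   # fails the lfc threshold
    ("geneD", 1.5, 0.004),  # fails after Bonferroni adjustment (0.016)
]
selected = filter_genes(example)
```

With these invented inputs, `selected` contains geneB followed by geneA.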

Supplementary Figure 3 Influence of cell-cycle correction on data clustering and GO enrichment

a,b) Total count of transcripts from 2047 heterogeneous genes per day: a) log-normalized counts before cell-cycle correction; b) log-normalized counts after cell-cycle correction. c) Fit of the CV²-mean relation according to Brennecke et al. [11] to a pure RNA control. d) The technical genes from (c) superimposed on the endogenous genes. e) Variance decomposition according to the identified latent variables. f) Detailed variance decomposition sorted by technical noise contribution.

Supplementary Figure 4 Expression profiles of highly variable genes before and after cell-cycle correction and pseudotime ordering of mESC inDrop data

Heatmaps displaying the expression profiles of 2047 highly variable genes before (a) and after (b,c) cell-cycle correction and pseudotime ordering; time courses of gene expression along batch (d) and pseudotime (e,f); and GO enrichment analysis (g,h) of the clusters in (a,c). The colored top bar (a–c) indicates the time after LIF withdrawal (dark blue: day 0, light blue: day 2, yellow: day 4, red: day 7). a) Gene expression with strong day-to-day variability. b) Cell-cycle-corrected gene expression with additional quantile normalization. c) Cell-cycle-corrected gene expression with additional Z-score normalization. Pseudotemporal ordering is indicated by mixed colors in the top annotation bar. In the time courses, the respective genes are shown in grey and the black curve is the smoothed mean. d) Log-transformed gene expression counts. e) Cell-cycle-corrected, log-transformed gene expression counts with quantile normalization (cf. Fig. 2d in main text). f) As in (e), with Z-score normalization. All clusters share the same temporal behavior. For each cluster, five representative GO terms are displayed; the GO terms of the green cluster are not shown. g) GO terms before cell-cycle correction. h) GO terms after cell-cycle correction and Z-score normalization. i) Distribution of cells along pseudotime, labeled by time after LIF withdrawal.

Supplementary Figure 5 p-values of Wilcoxon rank sum test applied to the first population that branches off the main branch in mESC inDrop data

Shown are the 20 apoptosis-related genes (GO:0006915) among the 108 genes identified by Wilcoxon rank sum test. The test compares cells from the early state population (see text) with cells from the first population that branches off the main branch.
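
For readers unfamiliar with the test, a minimal sketch of a two-sided Wilcoxon rank sum test for a single gene is given below. It uses the normal approximation without continuity or tie correction, which is an assumption of this sketch and not a statement about how the p-values in the figure were computed. The function name and the expression values are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    x, y: expression values of one gene in the two cell populations.
    Returns (z statistic, two-sided p-value).
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    # Rank sum of the first sample and its moments under the null hypothesis.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Invented expression values for one gene in two populations.
early = [1.0, 2.0, 3.0, 4.0, 5.0]
branch = [6.0, 7.0, 8.0, 9.0, 10.0]
z, p = rank_sum_test(early, branch)
```

Applied gene by gene, such p-values would still need adjustment for multiple testing before a gene list is reported.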

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–5, Supplementary Notes 1–7, and Supplementary Tables 1 and 2 (PDF 6893 kb)

Supplementary Data

Processed inDrop data set (ZIP 228768 kb)

Supplementary Software

DPT Software in Matlab and R (ZIP 3870 kb)

Cite this article

Haghverdi, L., Büttner, M., Wolf, F. et al. Diffusion pseudotime robustly reconstructs lineage branching. Nat Methods 13, 845–848 (2016). https://doi.org/10.1038/nmeth.3971
