Brief Communication

A discriminative learning approach to differential expression analysis for single-cell RNA-seq

Nature Methods volume 16, pages 163–166 (2019)


Single-cell RNA-seq makes it possible to characterize the transcriptomes of cell types across different conditions and to identify their transcriptional signatures via differential analysis. Our method detects changes in transcript dynamics and in overall gene abundance in large numbers of cells to determine differential expression. When applied to transcript compatibility counts obtained via pseudoalignment, our approach provides a quantification-free analysis of 3′ single-cell RNA-seq that can identify previously undetectable marker genes.
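The abstract does not spell out the test itself. As a rough illustration only, a discriminative test of this kind can be sketched as logistic regression: for one gene, the cell labels are regressed on that gene's per-cell features (transcript abundances or transcript compatibility counts), and a likelihood-ratio test compares the fit against an intercept-only model. Everything named below (`chi2_sf`, `fit_logistic`, `lr_test`, the toy counts) is hypothetical and is not taken from the authors' code. The unpenalized Newton fit also assumes the two groups are not perfectly separable.

```python
import math
import numpy as np


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    x = max(stat, 0.0) / 2.0
    odd = df % 2 == 1
    a = 0.5 if odd else 1.0
    q = math.erfc(math.sqrt(x)) if odd else math.exp(-x)
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return min(q, 1.0)


def fit_logistic(X, y, n_iter=100, tol=1e-10):
    """Maximized log-likelihood of an unpenalized logistic regression."""
    n = len(y)
    X1 = np.hstack([np.ones((n, 1)), X])  # prepend an intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):  # Newton's method
        p = 1.0 / (1.0 + np.exp(-np.clip(X1 @ beta, -30, 30)))
        grad = X1.T @ (y - p)
        hess = X1.T @ (X1 * (p * (1.0 - p))[:, None])
        # Tiny diagonal jitter for numerical stability only (assumption).
        step = np.linalg.solve(hess + 1e-10 * np.eye(len(beta)), grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-np.clip(X1 @ beta, -30, 30)))
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))


def lr_test(X, y):
    """Likelihood-ratio test: do a gene's features predict the cell label?"""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    ll_full = fit_logistic(X, y)
    ll_null = fit_logistic(X[:, :0], y)  # intercept-only model
    stat = 2.0 * (ll_full - ll_null)
    return stat, chi2_sf(stat, X.shape[1])


# Toy gene with two isoforms (columns); rows are cells.
group0 = [(5, 1), (4, 2), (6, 0), (3, 3), (5, 2), (2, 4)]
group1 = [(b, a) for a, b in group0]  # isoform usage flipped, same totals
X = np.array((group0 + group1) * 5, dtype=float)
y = np.array(([0] * 6 + [1] * 6) * 5, dtype=float)

stat_tx, p_tx = lr_test(X, y)                   # joint test on both isoforms
stat_gene, p_gene = lr_test(X.sum(axis=1), y)   # test on summed gene counts
print(f"transcript-level: stat={stat_tx:.2f}, p={p_tx:.2e}")
print(f"gene-level:       stat={stat_gene:.2f}, p={p_gene:.2f}")
```

In this toy example the two isoforms switch between groups while their sum is identical in both, so the joint test on both features yields a small p-value, whereas the same test on summed gene counts yields a statistic of essentially zero.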


Code availability

The code required to conduct the simulations and reproduce the analyses is available on GitHub. The repository, zipped at the time of manuscript acceptance, is also provided as Supplementary Software.

Data availability

The myogenesis dataset (Trapnell et al., ref. 10) is available on the conquer database and on GEO as series GSE52529. The embryogenesis dataset (Petropoulos et al., ref. 22) is available on the conquer database. The 10x PBMC dataset is available from the 10x Genomics Support website (ref. 19).

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


  1. Soneson, C. & Robinson, M. D. Nat. Methods 15, 255–261 (2018).
  2. Kharchenko, P. V., Silberstein, L. & Scadden, D. T. Nat. Methods 11, 740–742 (2014).
  3. Finak, G. et al. Genome Biol. 16, 278 (2015).
  4. Yamazaki, T. et al. Genes Dev. 32, 1161–1174 (2018).
  5. Vitting-Seerup, K. & Sandelin, A. Mol. Cancer Res. 15, 1206–1220 (2017).
  6. Arzalluz-Luque, Á. & Conesa, A. Genome Biol. 19, 110 (2018).
  7. Gupta, I. et al. Preprint at bioRxiv (2018).
  8. Xing, E. P., Jordan, M. I. & Karp, R. M. in ICML01 Proceedings of the Eighteenth International Conference on Machine Learning (eds Brodley, C. E. & Pohoreckyj Danyluk, A.) 601–608 (Morgan Kaufmann, San Francisco, 2001).
  9. Shevade, S. K. & Keerthi, S. S. Bioinformatics 19, 2246–2253 (2003).
  10. Trapnell, C. et al. Nat. Biotechnol. 32, 381–386 (2014).
  11. Zheng, G. X. et al. Nat. Commun. 8, 14049 (2017).
  12. Macosko, E. Z. et al. Cell 161, 1202–1214 (2015).
  13. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Nat. Biotechnol. 34, 525–527 (2016).
  14. Nicolae, M., Mangul, S., Măndoiu, I. I. & Zelikovsky, A. Algorithms Mol. Biol. 6, 9 (2011).
  15. Ntranos, V., Kamath, G. M., Zhang, J. M., Pachter, L. & Tse, D. N. Genome Biol. 17, 112 (2016).
  16. Yi, L., Pimentel, H., Bray, N. L. & Pachter, L. Genome Biol. 19, 53 (2018).
  17. Peterson, V. M. et al. Nat. Biotechnol. 35, 936–939 (2017).
  18. Byrne, A. et al. Nat. Commun. 8, 16027 (2017).
  19. 10x Genomics. Single cell gene expression datasets. 10x Genomics Support (2018).
  20. Wolf, F. A., Angerer, P. & Theis, F. J. Genome Biol. 19, 15 (2018).
  21. Bradley, R. K. et al. PLoS Comput. Biol. 5, e1000392 (2009).
  22. Petropoulos, S. et al. Cell 165, 1012–1026 (2016).
  23. Conway, J. R., Lex, A. & Gehlenborg, N. Bioinformatics 33, 2938–2940 (2017).
  24. Love, M. I., Huber, W. & Anders, S. Genome Biol. 15, 550 (2014).
  25. Li, B. & Dewey, C. N. BMC Bioinformatics 12, 323 (2011).
  26. Zappia, L., Phipson, B. & Oshlack, A. Genome Biol. 18, 174 (2017).
  27. Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Nat. Biotechnol. 33, 495–502 (2015).
  28. Soneson, C., Love, M. I. & Robinson, M. D. F1000Res. 4, 1521 (2015).


We thank N. Bray, J. Gehring and V. Svensson for discussion and comments on the manuscript, and H. Pimentel for assisting with the simulations. We thank A. Butler and R. Satija for implementing this method in Seurat. V.N., L.Y. and L.P. are partially funded by NIH R012017-0569.

Author information

Author notes

  1. These authors contributed equally: Vasilis Ntranos, Lynn Yi.


  1. Department of Electrical Engineering & Computer Science, UC Berkeley, Berkeley, CA, USA

    • Vasilis Ntranos
  2. Department of Electrical Engineering, Stanford University, Stanford, CA, USA

    • Vasilis Ntranos
  3. UCLA–Caltech Medical Science Training Program, UCLA, Los Angeles, CA, USA

    • Lynn Yi
  4. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA

    • Lynn Yi
    • Lior Pachter
  5. Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavík, Iceland

    • Páll Melsted
  6. Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA

    • Lior Pachter



V.N. developed the model during discussions with L.Y. and L.P., and analyzed the 10x PBMC dataset. L.Y. performed the simulations and analyzed the embryo SMART-Seq dataset. P.M. developed kallisto genomebam and assisted with analysis. All authors contributed extensively to the interpretation of the results and writing of the manuscript.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Lior Pachter.
