Prioritization of cell types responsive to biological perturbations in single-cell data with Augur

Abstract

Advances in single-cell genomics now enable large-scale comparisons of cell states across two or more experimental conditions. Numerous statistical tools are available to identify individual genes, proteins or chromatin regions that differ between conditions, but many experiments require inferences at the level of cell types, as opposed to individual analytes. We developed Augur to prioritize the cell types within a complex tissue that are most responsive to an experimental perturbation. In this protocol, we outline the application of Augur to single-cell RNA-seq data, proceeding from a genes-by-cells count matrix to a list of cell types ranked on the basis of their separability following a perturbation. We provide detailed instructions to enable investigators with limited experience in computational biology to perform cell-type prioritization within their own datasets and visualize the results. Moreover, we demonstrate the application of Augur in several more specialized workflows, including the use of RNA velocity for acute perturbations, experimental designs with multiple conditions, differential prioritization between two comparisons, and single-cell transcriptome imaging data. For a dataset containing on the order of 20,000 genes and 20 cell types, this protocol typically takes 1–4 h to complete.
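As a rough illustration of what ranking cell types by separability can look like, the sketch below scores each cell type by the area under the ROC curve (AUC) of per-cell classifier scores and sorts the cell types by that value. It is not Augur's implementation, which is an R package. The function names `auc` and `rank_cell_types`, the toy scores and the choice of Python are all assumptions made for this example, and the sketch assumes that held-out classifier scores for perturbed and control cells are already available.

```python
def auc(perturbed, control):
    """AUC as the probability that a randomly chosen perturbed cell
    receives a higher score than a randomly chosen control cell.
    Ties count as one half."""
    wins = sum((p > c) + 0.5 * (p == c) for p in perturbed for c in control)
    return wins / (len(perturbed) * len(control))


def rank_cell_types(scores):
    """Rank cell types from most to least separable.

    scores maps a cell-type name to a pair
    (scores of perturbed cells, scores of control cells).
    """
    aucs = {cell_type: auc(p, c) for cell_type, (p, c) in scores.items()}
    return sorted(aucs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy scores, invented for illustration only.
toy_scores = {
    "astrocyte": ([0.6, 0.4, 0.5], [0.5, 0.45, 0.55]),
    "neuron": ([0.9, 0.8, 0.7], [0.2, 0.3, 0.1]),
}
ranking = rank_cell_types(toy_scores)
```

An AUC near 0.5 means the two conditions cannot be told apart within that cell type, whereas an AUC near 1 means they separate almost perfectly.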

Fig. 1: Overview of the workflow for cell-type prioritization in single-cell data.
Fig. 2: Augur workflow.
Fig. 3: Basic workflow for cell-type prioritization (case study 1).
Fig. 4: Cell-type prioritization in RNA velocity (case study 2) and on clustering trees.
Fig. 5: Cell-type prioritization across multiple experimental conditions and differential prioritization (case studies 3 and 4).
Fig. 6: Cell-type prioritization in datasets confounded by batch effects.

Data availability

The datasets discussed in this protocol are available from Zenodo (https://doi.org/10.5281/zenodo.4473025). Accession codes for these datasets are provided in Table 2.

Code availability

Augur is a freely available software package written in the R programming language and released under the MIT license. Source code can be obtained from https://github.com/neurorestore/Augur. We provide support through the GitHub issues tracker at https://github.com/neurorestore/Augur/issues.

References

1. Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).
2. Svensson, V., Vento-Tormo, R. & Teichmann, S. A. Exponential scaling of single-cell RNA-seq in the past decade. Nat. Protoc. 13, 599–604 (2018).
3. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).
4. Tabula Muris Consortium. et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature 562, 367–372 (2018).
5. Han, X. et al. Mapping the mouse cell atlas by Microwell-Seq. Cell 173, 1307 (2018).
6. Han, X. et al. Construction of a human cell landscape at single-cell level. Nature 581, 303–309 (2020).
7. Plass, M. et al. Cell type atlas and lineage tree of a whole complex animal by single-cell transcriptomics. Science 360, (2018).
8. Cao, J. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667 (2017).
9. Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).
10. Vieira Braga, F. A. et al. A cellular census of human lungs identifies novel cell states in health and in asthma. Nat. Med. 25, 1153–1163 (2019).
11. Mathys, H. et al. Single-cell transcriptomic analysis of Alzheimer’s disease. Nature 570, 332–337 (2019).
12. Grubman, A. et al. A single-cell atlas of entorhinal cortex from individuals with Alzheimer’s disease reveals cell-type-specific gene expression regulation. Nat. Neurosci. 22, 2087–2097 (2019).
13. Smillie, C. S. et al. Intra- and inter-cellular rewiring of the human colon during ulcerative colitis. Cell 178, 714–730.e22 (2019).
14. Kang, H. M. et al. Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat. Biotechnol. 36, 89–94 (2018).
15. Wagner, D. E. et al. Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science 360, 981–987 (2018).
16. Tabula Muris Consortium. A single-cell transcriptomic atlas characterizes ageing tissues in the mouse. Nature 583, 590–595 (2020).
17. Svensson, V., da Veiga Beltrame, E. & Pachter, L. A curated database reveals trends in single-cell transcriptomics. Database https://doi.org/10.1093/database/baaa073 (2020).
18. Soneson, C. & Robinson, M. D. Bias, robustness and scalability in single-cell differential expression analysis. Nat. Methods 15, 255–261 (2018).
19. Crowell, H. L. et al. muscat detects subpopulation-specific state transitions from multi-sample multi-condition single-cell transcriptomics data. Nat. Commun. 11, 6077 (2020).
20. Zimmerman, K. D., Espeland, M. A. & Langefeld, C. D. A practical solution to pseudoreplication bias in single-cell studies. Nat. Commun. 12, 738 (2021).
21. Rossi, M. A. et al. Obesity remodels activity and transcriptional state of a lateral hypothalamic brake on feeding. Science 364, 1271–1274 (2019).
22. Hrvatin, S. et al. Single-cell analysis of experience-dependent transcriptomic states in the mouse visual cortex. Nat. Neurosci. 21, 120–129 (2018).
23. Hashikawa, Y. et al. Transcriptional and spatial resolution of cell types in the mammalian habenula. Neuron 106, 743–758.e5 (2020).
24. Sathyamurthy, A. et al. Massively parallel single nucleus transcriptional profiling defines spinal cord neurons and their activity during behavior. Cell Rep 22, 2216–2225 (2018).
25. Hrvatin, S. et al. Neurons that regulate mouse torpor. Nature 583, 115–121 (2020).
26. Schirmer, L. et al. Neuronal vulnerability and multilineage diversity in multiple sclerosis. Nature 573, 75–82 (2019).
27. Avey, D. et al. Single-cell RNA-seq uncovers a robust transcriptional response to morphine by glia. Cell Rep 24, 3619–3629.e4 (2018).
28. Kotliarov, Y. et al. Broad immune activation underlies shared set point signatures for vaccine responsiveness in healthy individuals and disease activity in patients with lupus. Nat. Med. 26, 618–629 (2020).
29. Skinnider, M. A. et al. Cell type prioritization in single-cell data. Nat. Biotechnol. https://doi.org/10.1038/s41587-020-0605-1 (2020).
30. La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).
31. Wagner, F. B. et al. Targeted neurotechnology restores walking in humans with spinal cord injury. Nature 563, 65–71 (2018).
32. Formento, E. et al. Electrical spinal cord stimulation must preserve proprioception to enable locomotion in humans with spinal cord injury. Nat. Neurosci. 21, 1728–1741 (2018).
33. Hagai, T. et al. Gene expression variability across cells and species shapes innate immunity. Nature 563, 197–202 (2018).
34. Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, eaat5691 (2018).
35. Moffitt, J. R. et al. Molecular, spatial, and functional single-cell profiling of the hypothalamic preoptic region. Science 362, eaau5324 (2018).
36. Lareau, C. A. et al. Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility. Nat. Biotechnol. 37, 916–924 (2019).
37. Lin, L. I. A concordance correlation coefficient to evaluate reproducibility. Biometrics 45, 255–268 (1989).
38. Bentsen, M. A. et al. Transcriptomic analysis links diverse hypothalamic cell types to fibroblast growth factor 1-induced sustained diabetes remission. Nat. Commun. 11, 4458 (2020).
39. Kim, D.-W. et al. Multimodal analysis of cell types in a hypothalamic node controlling social behavior. Cell 179, 713–728.e17 (2019).
40. Wu, Y. E., Pan, L., Zuo, Y., Li, X. & Hong, W. Detecting activated cell populations using single-cell RNA-seq. Neuron 96, 313–329.e6 (2017).
41. Skinnider, M. A., Squair, J. W. & Foster, L. J. Evaluating measures of association for single-cell transcriptomics. Nat. Methods 16, 381–386 (2019).
42. Clevers, H. et al. What is your conceptual definition of “cell type” in the context of a mature organism? Cell Syst. 4, 255–259 (2017).
43. Trapnell, C. Defining cell types and states with single-cell genomics. Genome Res 25, 1491–1498 (2015).
44. Zappia, L. & Oshlack, A. Clustering trees: a visualization for evaluating clusterings at multiple resolutions. Gigascience 7, giy083 (2018).
45. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).
46. Amezquita, R. A. et al. Orchestrating single-cell analysis with Bioconductor. Nat. Methods 17, 137–145 (2020).
47. Luecken, M. D. & Theis, F. J. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol. Syst. Biol. 15, e8746 (2019).
48. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
49. Pliner, H. A., Shendure, J. & Trapnell, C. Supervised classification enables rapid annotation of cell atlases. Nat. Methods 16, 983–986 (2019).
50. Zhang, A. W. et al. Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling. Nat. Methods 16, 1007–1015 (2019).
51. Abdelaal, T. et al. A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biol. 20, 194 (2019).
52. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
53. Petukhov, V. et al. dropEst: pipeline for accurate estimation of molecular counts in droplet-based single-cell RNA-seq experiments. Genome Biol. 19, 78 (2018).
54. Melsted, P. et al. Modular, efficient and constant-memory single-cell RNA-seq preprocessing. Nat. Biotechnol. https://doi.org/10.1038/s41587-021-00870-2 (2021).
55. Srivastava, A., Malik, L., Smith, T., Sudbery, I. & Patro, R. Alevin efficiently estimates accurate gene abundances from dscRNA-seq data. Genome Biol. 20, 65 (2019).
56. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
57. Chen, H. et al. Assessment of computational methods for the analysis of single-cell ATAC-seq data. Genome Biol. 20, 241 (2019).
58. Ilicic, T. et al. Classification of low quality cells from single-cell RNA-seq data. Genome Biol. 17, 29 (2016).
59. Lun, A. T. L. et al. EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biol. 20, 63 (2019).
60. Wolock, S. L., Lopez, R. & Klein, A. M. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281–291.e9 (2019).
61. McGinnis, C. S., Murrow, L. M. & Gartner, Z. J. DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors. Cell Syst. 8, 329–337.e4 (2019).
62. Bhattacherjee, A. et al. Cell type-specific transcriptional programs in mouse prefrontal cortex during adolescence and addiction. Nat. Commun. 10, 4169 (2019).
63. McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2018).
64. Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 37, 38–44 (2018).
65. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
66. Hie, B., Bryson, B. & Berger, B. Efficient integration of heterogeneous single-cell transcriptomes using Scanorama. Nat. Biotechnol. 37, 685–691 (2019).
67. Welch, J. D. et al. Single-cell multi-omic integration compares and contrasts features of brain cell identity. Cell 177, 1873–1887.e17 (2019).
68. Polański, K. et al. BBKNN: fast batch alignment of single cell transcriptomes. Bioinformatics 36, 964–965 (2020).
69. Tran, H. T. N. et al. A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biol. 21, 12 (2020).
70. Haghverdi, L., Lun, A. T. L., Morgan, M. D. & Marioni, J. C. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat. Biotechnol. 36, 421–427 (2018).
71. Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
72. Büttner, M., Miao, Z., Wolf, F. A., Teichmann, S. A. & Theis, F. J. A test metric for assessing single-cell RNA-seq batch correction. Nat. Methods 16, 43–49 (2019).
73. Lewitus, G. M. et al. Microglial TNF-α suppresses cocaine-induced plasticity and behavioral sensitization. Neuron 90, 483–491 (2016).
74. Zappia, L., Phipson, B. & Oshlack, A. Splatter: simulation of single-cell RNA sequencing data. Genome Biol. 18, 174 (2017).
75. McDavid, A. et al. Data exploration, quality control and testing in single-cell qPCR-based gene expression experiments. Bioinformatics 29, 461–467 (2013).
76. Ntranos, V., Yi, L., Melsted, P. & Pachter, L. A discriminative learning approach to differential expression analysis for single-cell RNA-seq. Nat. Methods 16, 163–166 (2019).
77. Finak, G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278 (2015).
78. Erhard, F. et al. scSLAM-seq reveals core features of transcription dynamics in single cells. Nature 571, 419–423 (2019).
79. Phipson, B. & Smyth, G. K. Permutation P-values should never be zero: calculating exact P-values when permutations are randomly drawn. Stat. Appl. Genet. Mol. Biol. 9, 39 (2010).
80. Schwartz, G. W. et al. TooManyCells identifies and visualizes relationships of single-cell clades. Nat. Methods 17, 405–413 (2020).


Acknowledgements

M.A.S. acknowledges support from Wings for Life, the Canadian Institutes of Health Research (CIHR) (Vanier Canada Graduate Scholarship, Michael Smith Foreign Study Supplement), an Izaak Walton Killam Memorial Pre-Doctoral Fellowship, a UBC Four Year Fellowship, a Vancouver Coastal Health–CIHR–UBC MD/PhD Studentship, a Brain Canada Hubert van Tol fellowship and a BCRegMed Collaborative Research Travel Grant. J.W.S. is supported by a CIHR Banting postdoctoral fellowship and a Marie Skłodowska-Curie individual fellowship (No. 842578). Work in L.J.F.’s group is supported by Genome Canada/Genome BC (Project 264PRO). The present work was supported by a Consolidator Grant from the European Research Council (ERC-2015-CoG HOW2WALKAGAIN 682999) and the Swiss National Science Foundation (grant 310030_192558).

Author information

Contributions

J.W.S. and M.A.S. implemented all procedures and developed Augur. M.G. contributed to the procedures. L.J.F. and G.C. supervised the work. J.W.S., M.A.S. and G.C. wrote the manuscript. All authors edited the manuscript.

Corresponding authors

Correspondence to Jordan W. Squair, Michael A. Skinnider or Grégoire Courtine.

Ethics declarations

Competing interests

G.C. is a founder and shareholder of ONWARD Medical, a company with no direct relationships with the present work.

Additional information

Peer review information Nature Protocols thanks Lyla Atta, Jean Fan, Brendan Miller, Joshua Welch and the other, anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key reference using this protocol

Skinnider, M. et al. Nat. Biotechnol. 39, 30–34 (2021): https://doi.org/10.1038/s41587-020-0605-1

Key data used in this protocol

Kang, H. et al. Nat. Biotechnol. 36, 89–94 (2018): https://doi.org/10.1038/nbt.4042

Bhattacherjee, A. et al. Nat. Commun. 10, 4169 (2019): https://doi.org/10.1038/s41467-019-12054-3

Moffitt, J. R. et al. Science 362, eaau5324 (2018): https://doi.org/10.1126/science.aau5324

Extended data

Extended Data Fig. 1 Optimized workflow for differential cell-type prioritization.

Differential cell-type prioritization in the Moffitt et al. (2018) MERFISH dataset (ref. 35) with variable numbers of permutations, subsamples per permutation, and total subsamples. a, Differential prioritization in the full dataset, with a background of 1,000 independent permutations. The top five cell types with a permutation P-value < 0.05 are shown throughout. b–e, Impact of reducing the number of permutations on differential prioritization. Differential prioritization yields stable results with the number of permutations decreased to 100, but becomes noisier below this threshold. However, even with 100 permutations, over 134 core-hours are required. b, Differential prioritization with 30 permutations (left) or 100 permutations (right). c, Correlations between differential prioritization –log10 P-values for each cell type in the reduced datasets with 30 permutations (left) or 100 permutations (right), compared with the full dataset of 1,000 permutations shown in a. d, Correlation of –log10 P-values to the full dataset for between 2 and 999 permutations. e, Total runtime required to perform between 1 and 1,000 permutations. f,g, Impact of reducing the number of subsamples on differential prioritization. A full 50 subsamples are required in each permuted dataset for accurate differential prioritization. f, Differential prioritization with one, five or ten subsamples per permutation. g, Correlations between differential prioritization –log10 P-values for each cell type in the reduced datasets with one, five or ten subsamples per permutation, compared with the complete dataset shown in a. h, Correlation coefficients to the full dataset with one, five or ten subsamples per permutation. Error bars show 95% confidence interval. i, Distribution of mean AUCs with 1, 5, 10 or 50 subsamples per permutation. The variance of the null distribution is inflated with <50 subsamples per permutation, which precludes differential prioritization. j, Distribution of mean AUCs in the complete dataset of 1,000 permutations (‘default’), or an equivalent number of mean AUCs sampled with replacement from a background of 100, 500 or 1,000 total subsamples, with 50 subsamples per permutation. The null distribution using sampling with replacement is indistinguishable from the null distribution in the complete dataset. k–n, Sampling with replacement enables accurate differential prioritization at dramatically reduced computational cost, providing an optimized workflow for differential prioritization. k, Differential prioritization after sampling with replacement from a background of 100, 500 or 1,000 total subsamples. The original results from the complete dataset are approximated with 500 or more subsamples. l, Correlation of –log10 P-values to the full dataset with 100, 500 or 1,000 total permutations. m, Correlation coefficients to the full dataset with between 50 and 1,000 mean AUCs drawn from a background of 100, 500 or 1,000 subsamples. n, Total runtime required to perform the full permutation analysis versus 100, 500 or 1,000 total permutations using augur_mode = "permute".
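The resampling idea in this caption can be sketched in a few lines: a modest pool of AUCs computed on label-permuted subsamples is reused, by sampling with replacement, to build a large null distribution of mean AUCs, against which an observed value is compared. The sketch is illustrative only and is not Augur's code. The function names, the toy pool of AUCs, the two-sided comparison and the use of Python are assumptions; the `+1` correction follows the general recommendation that permutation P-values should never be zero (ref. 79).

```python
import random
from statistics import mean


def null_mean_aucs(pool, n_means=1000, n_subsamples=50, seed=0):
    """Build a null distribution of mean AUCs.

    Each null value is the mean of n_subsamples AUCs drawn with
    replacement from pool, a list of AUCs obtained on subsamples
    with permuted condition labels.
    """
    rng = random.Random(seed)
    return [mean(rng.choices(pool, k=n_subsamples)) for _ in range(n_means)]


def permutation_p_value(observed, null):
    """Two-sided permutation P-value with a +1 correction,
    so that the result can never be exactly zero."""
    center = mean(null)
    extreme = sum(abs(x - center) >= abs(observed - center) for x in null)
    return (extreme + 1) / (len(null) + 1)


# Hypothetical pool of 500 permuted-label AUCs scattered around 0.5.
toy_rng = random.Random(1)
pool = [toy_rng.uniform(0.45, 0.55) for _ in range(500)]

null = null_mean_aucs(pool)          # 1,000 mean AUCs, 50 draws each
p = permutation_p_value(0.9, null)   # smallest attainable value, 1/1001
```

With 1,000 null values the smallest attainable P-value is 1/1001, which is why the size of the null distribution, and not only the size of the pool, limits the resolution of the test.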

Extended Data Fig. 2 Augur outperforms DE-based methods with subsampling.

Cell-type prioritization in simulated scRNA-seq data (ref. 74) from a tissue with eight cell types and increasingly unequal numbers of cells per type, as quantified by the Gini coefficient (ref. 29). The average number of DE genes at 5% false discovery rate in 50 subsamples of 20 cells per condition was tallied using six different statistical tests (t-test, Wilcoxon rank-sum test, likelihood ratio test (ref. 75), logistic regression (ref. 76), MAST (ref. 77) and a negative binomial generalized linear model), implemented through the Seurat ‘FindMarkers’ function. The accuracy of cell-type prioritization was quantified as the Pearson correlation between the cell-type prioritizations (AUC or average number of DE genes, for Augur and single-cell differential expression tests, respectively) and the true proportion of DE genes under the simulation ground truth. The mean of five simulation replicates is shown throughout. Insets show binomial P-values for the sign of the difference in correlations (that is, the frequency with which Augur outperforms single-cell differential expression with subsampling), all with n = 120. a,b, Impact of perturbation intensity (differential expression effect size) on cell-type prioritization for a representative test for single-cell differential gene expression (Wilcoxon rank-sum test). Augur outperforms single-cell differential expression with subsampling in prioritizing cell types in the context of subtler perturbations. c,d, Impact of sequencing depth (% of reads downsampled) on cell-type prioritization for a representative test for single-cell differential gene expression (Wilcoxon rank-sum test), with the location parameter of the differential expression factor log-normal distribution set to 0.5. Augur outperforms single-cell differential expression with subsampling in more sparsely sequenced datasets. e,f, Impact of perturbation intensity on cell-type prioritization for five additional tests for single-cell differential gene expression. g,h, Impact of sequencing depth on cell-type prioritization for five additional tests for single-cell differential gene expression.
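The three summary statistics named in this caption are simple to compute, and a minimal sketch may help readers reproduce the style of evaluation on their own simulations. The code is an illustration only, not the code used for the figure. The function names and the use of Python are assumptions, and because the caption does not state whether the binomial test was one- or two-sided, the two-sided form shown here is also an assumption.

```python
from math import comb, sqrt


def gini(counts):
    """Gini coefficient of cell numbers per type
    (0 = perfectly equal; approaches 1 as one type dominates)."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def pearson(x, y):
    """Pearson correlation, for example between per-cell-type
    prioritization scores and the true proportion of DE genes."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def sign_test_p(wins, n):
    """Exact two-sided binomial sign test against a success rate of 0.5.

    wins is the number of comparisons, out of n, in which one method
    achieved the higher correlation.
    """
    k = max(wins, n - wins)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

For instance, `gini([10] * 8)` is 0 for eight equally sized cell types, and `sign_test_p(60, 120)` is 1 when each method wins exactly half of 120 comparisons.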

Cite this article

Squair, J.W., Skinnider, M.A., Gautier, M. et al. Prioritization of cell types responsive to biological perturbations in single-cell data with Augur. Nat Protoc 16, 3836–3873 (2021). https://doi.org/10.1038/s41596-021-00561-x
