Abstract
Current methods for comparing single-cell RNA sequencing datasets collected in multiple conditions focus on discrete regions of the transcriptional state space, such as clusters of cells. Here we quantify the effects of perturbations at the single-cell level using a continuous measure of the effect of a perturbation across the transcriptomic space. We describe this space as a manifold and develop a relative likelihood estimate of observing each cell in each of the experimental conditions using graph signal processing. This likelihood estimate can be used to identify cell populations specifically affected by a perturbation. We also develop vertex frequency clustering to extract populations of affected cells at the level of granularity that matches the perturbation response. The accuracy of our algorithm at identifying clusters of cells that are enriched or depleted in each condition is, on average, 57% higher than the next-best-performing algorithm tested. Gene signatures derived from these clusters are more accurate than those of six alternative algorithms in ground truth comparisons.
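The relative likelihood idea can be illustrated with a toy sketch. This is not the authors' implementation and does not use the MELD package. It assumes a one-dimensional dataset, a brute-force k-nearest-neighbour graph, and a simple neighbour-averaging diffusion as a stand-in for the graph filter used in the paper. All function names and parameters (`knn_graph`, `smooth`, `relative_likelihood`, `alpha`, `steps`) are hypothetical.

```python
def knn_graph(points, k):
    """Build symmetric k-nearest-neighbour adjacency sets for 1-D points."""
    n = len(points)
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        # Rank all other points by distance; ties are broken by index.
        ranked = sorted((abs(points[i] - points[j]), j) for j in range(n) if j != i)
        for _, j in ranked[:k]:
            neighbours[i].add(j)
            neighbours[j].add(i)  # symmetrise the graph
    return neighbours


def smooth(signal, neighbours, alpha=0.5, steps=3):
    """Low-pass filter a graph signal by repeated neighbour averaging.

    Assumption: this diffusion is a simple stand-in for a graph filter,
    not the filter described in the paper.
    """
    for _ in range(steps):
        signal = [
            (1 - alpha) * signal[i]
            + alpha * sum(signal[j] for j in neighbours[i]) / len(neighbours[i])
            for i in range(len(signal))
        ]
    return signal


def relative_likelihood(points, labels, k=3, alpha=0.5, steps=3):
    """Estimate, per cell, the relative likelihood of each condition."""
    neighbours = knn_graph(points, k)
    conditions = sorted(set(labels))
    densities = {}
    for c in conditions:
        count = labels.count(c)
        # Indicator signal, scaled so every condition carries equal total mass.
        indicator = [1.0 / count if lab == c else 0.0 for lab in labels]
        densities[c] = smooth(indicator, neighbours, alpha, steps)
    result = []
    for i in range(len(points)):
        total = sum(densities[c][i] for c in conditions)
        # Normalise across conditions so each cell's values sum to 1.
        result.append({c: densities[c][i] / total for c in conditions})
    return result


# Toy data: control cells at x = 0..9, treatment cells at x = 5.5..14.5,
# so the two conditions overlap only in the middle of the range.
points = [float(x) for x in range(10)] + [x + 5.5 for x in range(10)]
labels = ["control"] * 10 + ["treatment"] * 10
likelihood = relative_likelihood(points, labels)
```

In this sketch, cells at the far right of the toy range receive a treatment likelihood near 1, cells at the far left receive one near 0, and cells in the overlapping region fall in between. That gives a continuous, per-cell readout rather than a per-cluster one.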
Data availability
Code availability
Code for the MELD and VFC algorithms implemented in Python is available as part of the MELD package on GitHub (https://github.com/KrishnaswamyLab/MELD) and on the Python Package Index. The GitHub repository also contains tutorials, code to reproduce the analysis of the zebrafish dataset and code associated with several of the quantitative comparisons.
Acknowledgements
The authors would like to thank C. Vejnar, R. Coifman, J. Noonan, V. Tornini and C. Kontur for fruitful discussions. We would also like to thank G. Wang of the Yale Center for Genome Analysis for help in preparing the pancreatic islet data. This research was supported, in part, by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health (NIH) (award no. F31HD097958) (to D.B.); the Gruber Foundation (to S.G.); IVADO Professor startup and operational funds, IVADO Fundamental Research Project grant PRF-2019-3583139727 (to G.W.); NIH grants R01GM135929 and R01GM130847 (to G.W. and S.K.); and Chan-Zuckerberg Initiative grants 182702 and CZF2019-002440 (to S.K.). The content provided here is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
Author information
Authors and Affiliations
Contributions
D.B.B., S.K., G.W., D.v.D. and A.J.G. envisioned the project. D.B.B., J.S., A.T., S.K. and G.W. developed the mathematical formulation of the problem and related numerical analysis. D.B.B., J.S. and S.G. implemented the code. D.B.B. and S.K. performed the analysis of biological and simulated data. A.L.P. and K.C.H. generated and assisted with the analysis of the pancreatic islet dataset. A.J.G. assisted with the analysis of the zebrafish data and related writing. D.B.B., J.S., A.T., S.K. and G.W. wrote the paper. S.G. assisted with the writing.
Corresponding authors
Ethics declarations
Competing interests
The authors declare the following competing interest: S.K. is a paid scientific advisor to AI Therapeutics.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Figs. 1–14, Tables 1 and 2 and Notes 1–3
Rights and permissions
About this article
Cite this article
Burkhardt, D.B., Stanley, J.S., Tong, A. et al. Quantifying the effect of experimental perturbations at single-cell resolution. Nat Biotechnol 39, 619–629 (2021). https://doi.org/10.1038/s41587-020-00803-5
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s41587-020-00803-5