Supervised learning of high-confidence phenotypic subpopulations from single-cell data

An Author Correction to this article was published on 06 June 2023

This article has been updated

A preprint version of the article is available at bioRxiv.

Abstract

Accurately identifying phenotype-relevant cell subsets from heterogeneous cell populations is crucial for delineating the underlying mechanisms driving biological or clinical phenotypes. Here, by deploying a Learning with Rejection strategy, we developed a novel supervised learning framework called PENCIL to identify subpopulations associated with categorical or continuous phenotypes from single-cell data. By embedding a feature selection function into this flexible framework, for the first time, we were able to simultaneously select informative features and identify cell subpopulations, enabling accurate identification of phenotypic subpopulations otherwise missed by methods incapable of concurrent gene selection. Furthermore, the regression mode of PENCIL presents a novel ability for supervised phenotypic trajectory learning of subpopulations from single-cell data. We conducted comprehensive simulations to evaluate PENCIL’s versatility in simultaneous gene selection, subpopulation identification and phenotypic trajectory prediction. PENCIL is fast and scalable: it can analyse one million cells within 1 h. Using the classification mode, PENCIL detected T-cell subpopulations associated with melanoma immunotherapy outcomes. Moreover, when applied to single-cell RNA sequencing data from a patient with mantle cell lymphoma who received drug treatment across multiple timepoints, the regression mode of PENCIL revealed a transcriptional treatment response trajectory. Collectively, our work introduces a scalable and flexible infrastructure to accurately identify phenotype-associated subpopulations from single-cell data.
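
The Learning with Rejection strategy named in the abstract can be illustrated with the generic 0-1 loss with a reject option described in the cited literature (refs. 44–46): an accepted cell costs 1 if it is misclassified and 0 otherwise, whereas a rejected cell costs a fixed amount. The sketch below is a minimal illustration under that assumption and is not PENCIL's actual training objective; the function name `rejection_loss`, the cost value 0.3 and the toy scores are hypothetical.

```python
def rejection_loss(scores, confidences, labels, cost=0.3):
    """Mean 0-1 loss with a reject option (generic form, assumed for illustration).

    scores      -- signed prediction scores, one per cell
    confidences -- confidence scores; a cell is rejected when its score is <= 0
    labels      -- true labels encoded as +1 or -1
    cost        -- fixed penalty paid for every rejected cell (hypothetical value)
    """
    total = 0.0
    for h, r, y in zip(scores, confidences, labels):
        if r <= 0:
            total += cost      # rejected: pay the fixed rejection cost
        elif y * h <= 0:
            total += 1.0       # accepted but misclassified
        # accepted and correctly classified cells add nothing
    return total / len(labels)


# Toy example: two confidently classified cells and two rejected cells.
loss = rejection_loss(
    scores=[2.0, -1.5, 0.2, -0.1],
    confidences=[1.0, 0.8, -0.5, -0.9],
    labels=[1, -1, -1, 1],
)
```

Under this loss, rejecting a cell is preferable to guessing only when the expected misclassification cost exceeds the rejection cost, which is why cells that do not separate by phenotype tend to be rejected.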

Fig. 1: The workflow of PENCIL and its main functions.
Fig. 2: Evaluation of PENCIL’s classification mode for simultaneously selecting genes and cells in simulations.
Fig. 3: Evaluation of regression mode of PENCIL on the simulated datasets.
Fig. 4: The running time and memory usages of PENCIL against the number of cells.
Fig. 5: PENCIL analysis of T-cell subpopulations associated with melanoma immunotherapy outcomes.
Fig. 6: Regression mode of PENCIL analysis of scRNA-seq malignant B cells across three timepoints from a patient with MCL.

Data availability

The publicly available scRNA-seq datasets used in this study can be accessed via the following accession numbers or link: GSE120575 (ref. 6), GSE159251 (ref. 32), GSE134388 (ref. 33) and https://zenodo.org/record/7761954 (ref. 34). More detailed descriptions of these datasets can be found in the Supplementary Material.

Code availability

The open-source PENCIL program and its tutorials are freely available at GitHub (https://github.com/cliffren/PENCIL) and Zenodo (https://doi.org/10.5281/zenodo.7762054).

References

  1. Miao, Y. et al. Adaptive immune resistance emerges from tumor-initiating stem cells. Cell 177, 1172–1186 e1114 (2019).

  2. Wagner, J. et al. A single-cell atlas of the tumor and immune ecosystem of human breast cancer. Cell 177, 1330–1345 e1318 (2019).

  3. Trapnell, C. Defining cell types and states with single-cell genomics. Genome Res. 25, 1491–1498 (2015).

  4. Stephenson, E. et al. Single-cell multi-omics analysis of the immune response in COVID-19. Nat. Med. 27, 904–916 (2021).

  5. Ekiz, H. A. et al. MicroRNA-155 coordinates the immunological landscape within murine melanoma and correlates with immunity in human cancers. JCI Insight 4, e126543 (2019).

  6. Sade-Feldman, M. et al. Defining T cell states associated with response to checkpoint immunotherapy in melanoma. Cell 175, 998–1013 e1020 (2018).

  7. Eksi, S. E. et al. Epigenetic loss of heterogeneity from low to high grade localized prostate tumours. Nat. Commun. 12, 7292 (2021).

  8. Lun, A. T. L., Richard, A. C. & Marioni, J. C. Testing for differential abundance in mass cytometry data. Nat. Methods 14, 707–709 (2017).

  9. Zhao, J. et al. Detection of differentially abundant cell subpopulations in scRNA-seq data. Proc. Natl Acad. Sci. USA 118, e2100293118 (2021).

  10. Dann, E., Henderson, N. C., Teichmann, S. A., Morgan, M. D. & Marioni, J. C. Differential abundance testing on single-cell data using k-nearest neighbor graphs. Nat. Biotechnol. 40, 245–253 (2022).

  11. Burkhardt, D. B. et al. Quantifying the effect of experimental perturbations at single-cell resolution. Nat. Biotechnol. 39, 619–629 (2021).

  12. Sheng, J. & Li, W. V. Selecting gene features for unsupervised analysis of single-cell gene expression data. Brief. Bioinform. 22, bbab295 (2021).

  13. Townes, F. W., Hicks, S. C., Aryee, M. J. & Irizarry, R. A. Feature selection and dimension reduction for single-cell RNA-seq based on a multinomial model. Genome Biol. 20, 295 (2019).

  14. Farrell, J. A. et al. Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis. Science 360, eaar3131 (2018).

  15. Zhong, S. et al. A single-cell RNA-seq survey of the developmental landscape of the human prefrontal cortex. Nature 555, 524–528 (2018).

  16. Baran-Gale, J. et al. Ageing compromises mouse thymus function and remodels epithelial cell differentiation. eLife 9, e56221 (2020).

  17. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).

  18. Chen, H. et al. Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM. Nat. Commun. 10, 1903 (2019).

  19. Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477 (2018).

  20. Lange, M. et al. CellRank for directed single-cell fate mapping. Nat. Methods 19, 159–170 (2022).

  21. Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. https://doi.org/10.1038/nbt.4314 (2018).

  22. Cannoodt, R., Saelens, W., Deconinck, L. & Saeys, Y. Spearheading future omics analyses using dyngen, a multi-modal simulator of single cells. Nat. Commun. 12, 3942 (2021).

  23. Chen, W. et al. A multicenter study benchmarking single-cell RNA sequencing technologies using reference samples. Nat. Biotechnol. 39, 1103–1114 (2021).

  24. Zappia, L., Phipson, B. & Oshlack, A. Splatter: simulation of single-cell RNA sequencing data. Genome Biol. 18, 174 (2017).

  25. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 e3529 (2021).

  26. Ruan, X. et al. Progenitor cell diversity in the developing mouse neocortex. Proc. Natl Acad. Sci. USA 118, e2018866118 (2021).

  27. Van den Berge, K. et al. Trajectory-based differential expression analysis for single-cell sequencing data. Nat. Commun. 11, 1201 (2020).

  28. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Census. Nat. Methods 14, 309–315 (2017).

  29. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).

  30. Li, H. et al. Dysfunctional CD8 T cells form a proliferative, dynamically regulated compartment within human melanoma. Cell 176, 775–789 e718 (2019).

  31. Scott, A. C. et al. TOX is a critical regulator of tumour-specific T cell differentiation. Nature 571, 270–274 (2019).

  32. Pauken, K. E. et al. Single-cell analyses identify circulating anti-tumor CD8 T cells and markers for their enrichment. J. Exp. Med. 218, e20200920 (2021).

  33. Li, N. et al. ALKBH5 regulates anti-PD-1 therapy response by modulating lactate and suppressive immune cell accumulation in tumor microenvironment. Proc. Natl Acad. Sci. USA 117, 20159–20170 (2020).

  34. Torka, P. et al. Pevonedistat, a Nedd8-activating enzyme inhibitor, in combination with ibrutinib in patients with relapsed/refractory B-cell non-Hodgkin lymphoma. Blood Cancer J. 13, 9 (2023).

  35. Tickle, T., Tirosh, I., Georgescu, C., Brown, M. & Haas, B. inferCNV of the Trinity CTAT Project. Klarman Cell Observatory, Broad Institute of MIT and Harvard. https://github.com/broadinstitute/inferCNV (2019).

  36. Hartmann, E. M. et al. Pathway discovery in mantle cell lymphoma by integrated analysis of high-resolution gene expression and copy number profiling. Blood 116, 953–961 (2010).

  37. Mathas, S. et al. Aberrantly expressed c-Jun and JunB are a hallmark of Hodgkin lymphoma cells, stimulate proliferation and synergize with NF-kappa B. EMBO J. 21, 4104–4113 (2002).

  38. Papoudou-Bai, A. et al. The expression levels of JunB, JunD and p-c-Jun are positively correlated with tumor cell proliferation in diffuse large B-cell lymphomas. Leuk. Lymphoma 57, 143–150 (2016).

  39. Balaji, S. et al. NF-kappaB signaling and its relevance to the treatment of mantle cell lymphoma. J. Hematol. Oncol. 11, 83 (2018).

  40. Godbersen, J. C. et al. The Nedd8-activating enzyme inhibitor MLN4924 thwarts microenvironment-driven NF-kappaB activation and induces apoptosis in chronic lymphocytic leukemia B cells. Clin. Cancer Res. 20, 1576–1589 (2014).

  41. Mulqueen, R. M. et al. Highly scalable generation of DNA methylation profiles in single cells. Nat. Biotechnol. 36, 428–431 (2018).

  42. Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385 (2018).

  43. Chen, S., Lake, B. B. & Zhang, K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat. Biotechnol. 37, 1452–1457 (2019).

  44. Bartlett, P. L. & Wegkamp, M. H. Classification with a reject option using a hinge loss. J. Mach. Learn. Res. 9, 1823–1840 (2008).

  45. Cortes, C., DeSalvo, G. & Mohri, M. Learning with Rejection. Lect. Notes Artif. Intell. 9925, 67–82 (2016).

  46. Herbei, R. & Wegkamp, M. H. Classification with reject option. Can. J. Stat. 34, 709–721 (2006).

  47. Asif, A. & Minhas, F. U. A. Generalized neural framework for learning with rejection. International Joint Conference on Neural Networks (IJCNN). https://doi.org/10.1109/IJCNN48605.2020.9206612 (IEEE, 2020).

  48. Charoenphakdee, N., Cui, Z. H., Zhang, Y. A. & Sugiyama, M. Classification with rejection based on cost-sensitive classification. Proc. Mach. Learn. Res. 139, 1507–1517 (2021).

  49. Misra, D. Mish: a self regularized non-monotonic activation function. Preprint at arXiv https://doi.org/10.48550/arXiv.1908.08681 (2019).

  50. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).

  51. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 e1821 (2019).

Acknowledgements

This work was supported by the following funding: the National Key Research and Development Program of China 2020YFA0712400 (to T.R. and L.-Y.W.); NIH 1R21HL145426 (to Z.X.); Department of Defense Idea Development Award W81XWH2110539 (to Z.X.); Breast Cancer Research Foundation and NIH U01CA253472 and U01CA217842 (to G.B.M.); NIH 1R01CA244576 (to A.V.D.); NIH R35GM124704 (to A.C.A.); NIH R01CA250917 (to M.H.S.). We thank J. Zeng (University of Macau) and all the members of his bioinformatics team for generously sharing their experience and code. We thank W. Anderson for helping to edit the manuscript.

Author information

Contributions

Z.X. conceived the idea. T.R., L.-Y.W. and Z.X. implemented the method and performed the analyses. T.R., C.C., A.V.D., S.L., X.G., S.D., L.-Y.W. and Z.X. interpreted the results. X.W., M.H.S., A.C.A., P.T.S., L.M.C. and G.B.M. provided scientific insights on the applications. A.C.A. and G.B.M. contributed to the analytic strategies. L.-Y.W. and Z.X. supervised the study. T.R., L.-Y.W. and Z.X. wrote the manuscript with feedback from all other authors. All the authors read and approved the final manuscript.

Corresponding authors

Correspondence to Ling-Yun Wu or Zheng Xia.

Ethics declarations

Competing interests

A.V.D. has received consulting fees from Abbvie, AstraZeneca, Bayer Oncology, BeiGene, Bristol Meyers Squibb, Genentech, Incyte, Lilly Oncology, Morphposys, Nurix, Oncovalent, Pharmacyclics and TG Therapeutics and has ongoing research funding from Abbvie, AstraZeneca, Bayer Oncology, Bristol Meyers Squibb, Cyclacel, MEI Pharma, Nurix and Takeda Oncology. X.G. is a Genentech employee and Roche shareholder. G.B.M. is SAB/Consultant for AstraZeneca, BlueDot, Chrysallis Biotechnology, Ellipses Pharma, ImmunoMET, Infinity, Ionis, Lilly, Medacorp, Nanostring, PDX Pharmaceuticals, Signalchem Lifesciences, Tarveda, Turbine and Zentalis Pharmaceuticals; stock/options/financial: Catena Pharmaceuticals, ImmunoMet, SignalChem, Tarveda and Turbine; licenced technology: HRD assay to Myriad Genetics, and DSP patents with Nanostring. L.M.C. provides consulting services for Cell Signaling Technologies, AbbVie, the Susan G Komen Foundation and Shasqi, received reagent and/or research support from Cell Signaling Technologies, Syndax Pharmaceuticals, ZelBio Inc., Hibercell Inc. and Acerta Pharma, and participates in advisory boards for Pharmacyclics, Syndax, Carisma, Verseau, CytomX, Kineta, Hibercell, Cell Signaling Technologies, Alkermes, Zymeworks, Genenta Sciences, Pio Therapeutics Pty Ltd, PDX Pharmaceuticals, the AstraZeneca Partner of Choice Network, the Lustgarten Foundation and the NIH/NCI-Frederick National Laboratory Advisory Committee. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Yun Li and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 A simple simulation consists of cells from two conditions and three cell types, each containing only two genes (X_1 and X_2).

a, Visualizing cells from two conditions colored by condition labels using the two genes. b, Standard clustering of the cells. Cell number in parentheses. c, Percentage of cell condition labels within each cluster. d, The identified phenotypic subpopulations from the clustering-based method. e, The learned prediction model from PENCIL, with the orange line marking the boundary where the prediction score equals 0, which separates the two conditions. Cells colored by the condition labels as in a. f, The learned rejection model from PENCIL, with the green curve marking the boundary where the confidence score r(x) = 0, which is used to reject cells. Cells colored by the condition labels as in a. g, PENCIL identified phenotypic subpopulations.
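
The two boundaries in panels e and f can be read as a per-cell decision rule: a cell is kept only when its confidence score is positive, and a kept cell is assigned to a condition by the sign of its prediction score. The following is a minimal sketch of that reading, not the PENCIL implementation; the function name `select_cells`, the label strings and the choice of which sign maps to which condition are assumptions made for illustration.

```python
def select_cells(prediction_scores, confidence_scores):
    """Label each cell as 'condition_1', 'condition_2' or 'rejected'.

    Assumed convention (illustrative only): confidence <= 0 rejects the cell;
    otherwise a positive prediction score means condition_1.
    """
    labels = []
    for h, r in zip(prediction_scores, confidence_scores):
        if r <= 0:
            labels.append("rejected")        # outside the rejection boundary
        elif h > 0:
            labels.append("condition_1")     # one side of the prediction boundary
        else:
            labels.append("condition_2")     # the other side
    return labels


# Two confidently classified cells and two cells with non-positive confidence.
labels = select_cells(
    prediction_scores=[1.2, -0.7, 0.4, -2.0],
    confidence_scores=[0.9, 0.3, -0.1, -0.5],
)
# A dataset without enriched subpopulations: every confidence score is negative.
all_rejected = select_cells([0.5, -0.5], [-0.2, -0.8])
```

The second call mirrors the situation in which no subpopulation is associated with the phenotype, so every cell is rejected regardless of its prediction score.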

Extended Data Fig. 2 A simple simulation includes cells from two conditions and three cell types, each containing only two genes (X_1 and X_2), but lacks enriched phenotypic subpopulations.

a, Visualizing cells from two conditions colored by condition labels using the two genes. b, Standard clustering of the cells. Cell number in parentheses. c, The equal percentages of cell condition labels within each cluster. d, The result of the clustering-based method showing no subpopulations associated with the phenotypes. e, The learned prediction model from PENCIL, with the orange line marking the boundary where the prediction score equals 0, which separates the two conditions. Cells colored by the condition labels as in a. f, The rejection module in PENCIL, with all confidence scores r(x) < 0, so that all cells are rejected. g, PENCIL rejected all cells.

Extended Data Fig. 3 The simulation flowchart.

a, The matrix from a real scRNA-seq dataset. b, Selecting a submatrix with a subset of genes, as indicated by the orange rectangle, for the subsequent clustering. c, UMAP visualization and standard clustering based on the submatrix from the previous step. d, Selecting two clusters from panel c as the ground truth subpopulations, each enriched in one of the two phenotypes. e, Assigning condition labels to cells based on the designed conditions in panel d and the given mixing rate. f, The raw matrix, with the condition label of each cell indicated on the top bar. g, The UMAP using the top 2000 most variable genes (MVGs), colored by the condition labels of cells. h, The raw expression matrix and cell condition labels, which serve as the same inputs for all the methods.
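
The label-assignment step in panel e can be sketched as follows. The legend does not define the mixing rate precisely, so this sketch assumes that it is the probability with which a cell in a ground truth cluster receives a different condition label, and that all remaining cells receive a condition label uniformly at random. The function name `assign_condition_labels` and its parameters are hypothetical.

```python
import random


def assign_condition_labels(clusters, enriched, mixing_rate, seed=0):
    """Simulate one condition label per cell.

    clusters    -- cluster ID of every cell
    enriched    -- dict mapping a ground truth cluster ID to its condition
    mixing_rate -- assumed probability that a cell in a ground truth cluster
                   is given one of the other condition labels
    """
    rng = random.Random(seed)
    conditions = sorted(set(enriched.values()))
    labels = []
    for cluster in clusters:
        if cluster in enriched:
            home = enriched[cluster]
            if rng.random() < mixing_rate:
                # Mixed-in cell: draw a label from the other conditions.
                labels.append(rng.choice([c for c in conditions if c != home]))
            else:
                labels.append(home)
        else:
            # Background cell: no association with any condition.
            labels.append(rng.choice(conditions))
    return labels


# 50 cells in each ground truth cluster (0 and 1) plus 100 background cells.
clusters = [0] * 50 + [1] * 50 + [2] * 100
pure = assign_condition_labels(clusters, {0: "A", 1: "B"}, mixing_rate=0.0)
mixed = assign_condition_labels(clusters, {0: "A", 1: "B"}, mixing_rate=0.3)
```

With a mixing rate of 0 the ground truth clusters are perfectly separated by condition, and raising the rate makes the enrichment progressively harder to detect.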

Extended Data Fig. 4 PENCIL classification analysis of simulated datasets with two conditions.

a, The confidence scores output by PENCIL. b, The distribution of the selected and rejected cells over the simulated ground truth of the conditions. c, The Venn diagram showing the overlap between the PENCIL selected cells and the ground truth phenotypic cells for the two conditions, respectively. d, The F1, precision and recall scores comparing the performances of the four methods.
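
The precision, recall and F1 scores in panel d compare a set of selected cells with the ground truth phenotypic cells, as in the Venn diagram of panel c. A minimal sketch of that computation is given below, assuming that cells are identified by unique IDs; the function name `precision_recall_f1` and the toy cell IDs are illustrative only.

```python
def precision_recall_f1(selected, truth):
    """Score a set of selected cell IDs against the ground truth cell IDs."""
    selected, truth = set(selected), set(truth)
    overlap = len(selected & truth)                  # the Venn intersection
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Four selected cells, six ground truth cells, three shared between them.
precision, recall, f1 = precision_recall_f1(
    selected={"c1", "c2", "c3", "c4"},
    truth={"c2", "c3", "c4", "c5", "c6", "c7"},
)
```

In this toy case three of the four selected cells are correct and three of the six ground truth cells are recovered, so precision is 0.75, recall is 0.5 and F1 is 0.6.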

Extended Data Fig. 5 Evaluating PENCIL on the simulated datasets with batch-effect.

a, UMAP based on the manually curated genes showing the cells of two conditions from two batches separated by the dashed line. b, UMAP based on the manually curated genes showing the cells of two conditions after batch corrections. c, UMAP based on the top 3000 MVGs showing all cells. d, PENCIL selected genes. e, UMAP based on the PENCIL selected genes showing the PENCIL selected cells. f, The Venn diagram showing the overlap between the ground truth phenotype-enriched subpopulations and the PENCIL selected cells. g, The box plots comparing the performances of PENCIL, Milo, DAseq and MELD in simulated batch effects datasets with mixing rates 0, 0.1, 0.2 and 0.3 (n = 50 simulations). In the box plots, the center line and the box bounds represent median value and upper and lower quartiles, respectively. Box whiskers indicate the largest and smallest values no more than 1.5 times the interquartile range from the quartiles.
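
The box plot convention stated in panel g can be made concrete with a short calculation. The sketch below assumes the quartiles are computed with the inclusive interpolation method of Python's `statistics.quantiles`; plotting libraries differ in this detail, so the exact values in the figure may follow another convention. The function name `box_stats` and the toy values are hypothetical.

```python
import statistics


def box_stats(values):
    """Return the median, quartiles and whisker ends for a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers end at the most extreme values within 1.5 x IQR of the quartiles.
    lower = min(v for v in values if v >= q1 - 1.5 * iqr)
    upper = max(v for v in values if v <= q3 + 1.5 * iqr)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower,
        "upper_whisker": upper,
    }


# Toy scores with one extreme value that falls beyond the upper whisker.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

Here the value 30 lies more than 1.5 times the interquartile range above the upper quartile, so the upper whisker stops at 9 and 30 would be drawn as an outlier.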

Extended Data Fig. 6 Evaluating PENCIL on the simulated datasets with three conditions.

a, The UMAP based on the pre-selected gene set colored by the cell condition labels generated from ground truth cell subsets with a mixing rate of 0.1. b, The ground truth phenotype-associated subpopulations visualized on the UMAP using the top 2000 MVGs. c, Cells with the same condition labels as the ones in the panel a visualized on the UMAP using the top 2000 MVGs. d, The UMAP based on the pre-selected genes colored by the PENCIL predicted confidence scores. e, The distribution of PENCIL selected cells over the ground truth cell conditions. f, The F1, precision and recall scores comparing the performances of the three methods on this simulated dataset with three conditions. g, The Venn diagrams depicting the overlap between ground truth cell subpopulations and cell subsets selected by the three methods, respectively.

Extended Data Fig. 7 Evaluating the four methods using the PENCIL selected genes as inputs.

a, The results of PENCIL, Milo, DAseq and MELD when inputting the genes selected by PENCIL. b, The Venn diagrams comparing the result of each method with the ground truth phenotypic cell subpopulations. c, The F1, precision, and recall scores comparing the performances of the four methods when inputting the genes selected by PENCIL. d-f, A simulation for the three conditions. d, The UMAP plots showing the results of PENCIL, Milo, and MELD when inputting the genes selected by PENCIL. e, The Venn diagrams comparing the result of each method with the ground truth phenotypic cell subpopulations. f, The F1, precision and recall scores comparing the performances of the three methods when inputting the genes selected by PENCIL in this simulated example with three conditions.

Extended Data Fig. 8 Evaluating the regression model of PENCIL in simulated datasets.

a, UMAP showing the cells of 5 clusters selected as ground truth subpopulations corresponding to main Fig. 3a. b, PENCIL predicted confidence scores corresponding to main Fig. 3c. c, The cells with simulated condition labels visualized on the UMAP using the top 2000 MVGs. d, PENCIL predicted confidence scores corresponding to main Fig. 3k. e-i, A simulated dataset for PENCIL regression analysis from the Feldman T-cell dataset. e, UMAP from a pre-selected gene set (800-1500th MVGs) to show cells with simulated ground truth phenotypic subpopulations of five time points. f, The five subpopulations are assigned to the five samples accordingly, and all remaining cells are evenly assigned to the five samples to simulate the sample labels. The UMAP is the same as the panel e colored by cell condition labels. g, Ground truth of phenotype-associated subpopulations in panel e visualized on the UMAP using the top 2000 MVGs. h, PENCIL predicted continuous time points for the selected cells. i, PENCIL selected genes. Genes within the dashed rectangle region were the gene-set to generate UMAPs in panels e, f and h. j-n, A simulated dataset for PENCIL regression analysis from the Sade-Feldman cohort dataset. j, UMAP from a pre-selected gene set (1500-2000th MVGs) to show cells with simulated ground truth subpopulations of four time points. k, The four subpopulations are assigned to the four samples accordingly and all remaining cells are evenly assigned to the four samples. The UMAP is the same as the panel j colored by simulated condition labels. l, Ground truth of phenotype-associated subpopulations in panel j visualized on the UMAP using top 2000 MVGs. m, PENCIL predicted continuous time points for the selected cells. n, PENCIL selected genes. Genes within the dashed rectangle region were the gene set to generate UMAPs in panels j, k and m.

Extended Data Fig. 9 PENCIL’s runtime and memory usages with varying numbers of genes and conditions.

a-c, For datasets with 10,000 cells and three conditions, the runtime, overall memory usage of CPU and GPU against the number of genes, respectively. d-f, For datasets with 10,000 cells and 2000 genes, the runtime, overall memory usage of CPU and GPU against the number of conditions, respectively. MiB, mebibyte.

Extended Data Fig. 10 A summary of PENCIL’s two modes.

a, The advantages of classification-based PENCIL. b, The regression mode of PENCIL formulates a new application to reveal a continuous dynamic process.

Supplementary information

Supplementary Information

Supplementary notes 1–7, figs. 1–3, and table 1–4 legends.

Reporting Summary

Supplementary Tables

Supplementary Table 1. The list of DEGs between PENCIL-predicted cells associated with immunotherapy outcomes. Supplementary Table 2. The pathways related to CD8+ T-cells are enriched by the significantly downregulated genes in responders compared with non-responders for the cells selected by PENCIL. Supplementary Table 3. The pathways related to CD8+ T-cells are enriched by the significantly upregulated genes in responders compared with non-responders for the cells selected by PENCIL. Supplementary Table 4. The list of the genes whose expression levels significantly depend on the timepoints predicted by the regression-based PENCIL analysis on the MCL scRNA-seq dataset.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ren, T., Chen, C., Danilov, A.V. et al. Supervised learning of high-confidence phenotypic subpopulations from single-cell data. Nat Mach Intell 5, 528–541 (2023). https://doi.org/10.1038/s42256-023-00656-y

  • DOI: https://doi.org/10.1038/s42256-023-00656-y
