  • Review Article

Multi-omics integration in the age of million single-cell data

Abstract

An explosion in single-cell technologies has revealed a previously underappreciated heterogeneity of cell types and novel cell-state associations with sex, disease, development and other processes. Starting with transcriptome analyses, single-cell techniques have extended to multi-omics approaches and now enable the simultaneous measurement of data modalities and spatial cellular context. Data are now available for millions of cells, for whole-genome measurements and for multiple modalities. Although analyses of such multimodal datasets have the potential to provide new insights into biological processes that cannot be inferred with a single mode of assay, the integration of very large, complex, multimodal data into biological models and mechanisms represents a considerable challenge. An understanding of the principles of data integration and visualization methods is required to determine what methods are best applied to a particular single-cell dataset. Each class of method has advantages and pitfalls in terms of its ability to achieve various biological goals, including cell-type classification, regulatory network modelling and biological process inference. In choosing a data integration strategy, consideration must be given to whether the multi-omics data are matched (that is, measured on the same cell) or unmatched (that is, measured on different cells) and, more importantly, the overall modelling and visualization goals of the integrated analysis.

Key points

  • As single-cell multi-omics techniques develop, tools and models for data integration have become critically important.

  • Integration problems in single-cell biology can be divided into those associated with the integration of matched and unmatched data.

  • Strategies for integrating matched data include joint latent space inference, consensus of individual inferences and biological causal modelling.

  • Strategies for integrating unmatched data include annotated group matching, matching with common features and aligning spaces.

  • Visualization methods for integrated multimodal single-cell data are still underdeveloped.

  • Future challenges include accounting for the specific noise of each modality, meeting the need for computational efficiency and developing biologically interpretable integration strategies.

Fig. 1: Frameworks for the integration of single-cell multi-omics data.
Fig. 2: Considerations for choosing an integration method for single-cell multi-omics analysis.
Fig. 3: Desired properties and functionalities of visualization tools for single-cell multi-omics.

Acknowledgements

This work was supported in part by UC2DK126024 grant to J.K., B.D.H. and A.P.M. as well as by a Health Research Formula Fund of the Commonwealth of Pennsylvania, which did not have a direct role in the work.

Author information

Corresponding author

Correspondence to Junhyong Kim.

Ethics declarations

Competing interests

A.P.M. is a scientific adviser to Novartis, eGENESIS, TRESTLE Therapeutics and IVIVA Medical. The other authors declare no competing interests.

Additional information

Peer review information

Nature Reviews Nephrology thanks B. J. Aronow, Q. Nie and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

GenitoUrinary Development Molecular Anatomy Project: https://www.gudmap.org/

HuBMAP portal: https://portal.hubmapconsortium.org/

KidneyCellExplorer: https://cello.shinyapps.io/kidneycellexplorer/

ReBuilding a Kidney: https://www.rebuildingakidney.org/

Glossary

Assay for transposase-accessible chromatin using sequencing

(ATAC-seq). A technique that profiles the accessibility of DNA elements based on the principle that the Tn5 transposase can insert a transposon only at accessible parts of the chromosome. The insertion location is identified through DNA sequencing.

Cis-regulatory elements

DNA elements proximal to a gene that are required for controlling gene expression. Such elements usually include promoters and enhancers, and often contain transcription factor-binding sites.

Molecule recovery efficiency

Single-cell assays capture molecules, such as mRNAs or transposon-interrupted DNA fragments, and amplify them for readout. Different protocols recover a given pool of molecules with different efficiencies; for example, a single podocyte might have 300,000 mRNA molecules and an RNA sequencing protocol with a 10% recovery efficiency would recover ~30,000 of these.
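
The arithmetic in this entry can be checked with a short simulation. The sketch below assumes, purely for illustration, that each molecule is captured independently with the same probability; real protocols need not behave this way, and the function name `simulate_recovery` is hypothetical.

```python
import random

def simulate_recovery(n_molecules: int, efficiency: float, seed: int = 0) -> int:
    """Count captured molecules when each one is recovered independently
    with probability `efficiency` (an illustrative binomial model)."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_molecules) if rng.random() < efficiency)

# Podocyte example from the glossary entry: 300,000 mRNAs, 10% recovery.
expected = 300_000 * 0.10                    # expected count, about 30,000
observed = simulate_recovery(300_000, 0.10)  # one simulated cell
print(expected, observed)
```

Under this model the recovered count varies from cell to cell around the expected value, which is one reason why two cells with identical contents can yield different raw counts.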

Joint snRNA-seq and snATAC-seq

Single-cell RNA sequencing (scRNA-seq) attempts to recover RNA from the whole cell, whereas single-nucleus RNA sequencing (snRNA-seq) only isolates the nuclear fraction of the RNA; the two transcriptomes are related but different. Multi-omics methods involving assay for transposase-accessible chromatin using sequencing (ATAC-seq) and RNA-seq typically isolate the nucleus first, resulting in snRNA-seq and snATAC-seq.

Feature space

In machine learning, measured variables are often called features, and the set of features defines a feature space in which each observation (for example, a cell) is a point.

Sequential fluorescence in situ hybridization

(seqFISH). A technique that measures mRNA quantity through sequential fluorescent probes that have combinatorially encoded information for each targeted mRNA. For example, a sequence signal, probe A then B, might encode gene X, whereas the sequence probe A then C might encode gene Y.
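
The combinatorial encoding described above can be sketched as a lookup from an ordered sequence of probe signals to a gene. The codebook below contains only the two examples given in this entry; the names `CODEBOOK`, `decode_spot` and `codebook_capacity` are hypothetical and do not come from any seqFISH software.

```python
# Illustrative codebook: the ordered probe signals across hybridization
# rounds identify the gene (entries taken from the glossary example).
CODEBOOK = {
    ("A", "B"): "gene X",
    ("A", "C"): "gene Y",
}

def decode_spot(signals):
    """Return the gene encoded by the ordered signals at one spot,
    or None if the sequence is not in the codebook."""
    return CODEBOOK.get(tuple(signals))

def codebook_capacity(n_probes: int, n_rounds: int) -> int:
    """Number of distinct ordered sequences available when any of
    n_probes can appear in each of n_rounds."""
    return n_probes ** n_rounds

print(decode_spot(["A", "B"]))   # gene X
print(decode_spot(["B", "A"]))   # None: order matters
print(codebook_capacity(4, 8))   # 65536
```

Because the number of possible sequences grows exponentially with the number of rounds, a small set of probes can in principle distinguish many genes.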

Read depth

A quantity that measures the number of times that sequencing reads cover a given genomic region. The region of interest may be a base pair or an entire transcribed region.
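
As a minimal sketch of this definition, the code below counts how many reads cover each base of a region. It assumes reads are given as half-open `(start, end)` intervals on a single reference sequence; the function name and the toy coordinates are illustrative only.

```python
def per_base_depth(reads, region_start, region_end):
    """Return the read depth at every base of [region_start, region_end).

    `reads` is a list of half-open (start, end) intervals on the same
    reference sequence as the region.
    """
    depth = [0] * (region_end - region_start)
    for start, end in reads:
        # Clip each read to the region before counting its bases.
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):
            depth[pos - region_start] += 1
    return depth

reads = [(0, 5), (3, 8), (4, 10)]
depth = per_base_depth(reads, 0, 10)
mean_depth = sum(depth) / len(depth)
print(depth)       # [1, 1, 1, 2, 3, 2, 2, 2, 1, 1]
print(mean_depth)  # 1.6
```

The per-base list corresponds to depth at a single base pair, and the mean corresponds to depth summarized over a longer region such as a transcribed region.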

Canonical correlation analysis

A multivariate statistical technique that computes the correlation between two sets of variables, say X and Y. Canonical correlation analysis finds the linear combination of X and the linear combination of Y that maximizes correlation.
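
The optimization in this entry can be written out as follows. The notation is an assumption introduced here, not taken from the article: $X$ and $Y$ are column-centred data matrices measured on the same $n$ observations, $a$ and $b$ are weight vectors, and $\Sigma_{XX}$, $\Sigma_{YY}$ and $\Sigma_{XY}$ are the sample covariance and cross-covariance matrices.

```latex
\rho^{*}
  = \max_{a,\,b}\ \operatorname{corr}(Xa,\ Yb)
  = \max_{a,\,b}\
    \frac{a^{\top}\Sigma_{XY}\,b}
         {\sqrt{a^{\top}\Sigma_{XX}\,a}\ \sqrt{b^{\top}\Sigma_{YY}\,b}},
\qquad
\Sigma_{XY} = \frac{1}{n-1}\,X^{\top}Y .
```

The maximizing pair $(Xa, Yb)$ is the first pair of canonical variates; further pairs are found in the same way under the constraint that they are uncorrelated with the earlier ones.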

Non-negative matrix factorization

A group of algorithms that decompose one matrix into a product of two (or more) matrices, such that the elements in each matrix are non-negative. Typically, each matrix has a model interpretation; for example, a gene-by-cell data matrix can be factorized into one matrix relating genes to latent space features and another relating latent space features to cells.
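
One widely used member of this group of algorithms is the multiplicative update scheme of Lee and Seung, sketched below in plain Python for a tiny matrix. This is an illustrative choice, not necessarily the algorithm used by any method discussed in the article; the function names, the toy matrix and the parameters `n_iter`, `seed` and `eps` are all assumptions.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative v (m x n) as w (m x k) times h (k x n)
    using multiplicative updates, which keep all entries non-negative."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return w, h

# Toy data: 4 features (rows) by 4 cells (columns) with two blocks.
V = [[1, 1, 0, 0], [2, 2, 0, 0], [0, 0, 3, 3], [0, 0, 1, 1]]
W, H = nmf(V, k=2)
WH = matmul(W, H)
error = sum((V[i][j] - WH[i][j]) ** 2 for i in range(4) for j in range(4))
print(error)
```

Here `W` relates features to the two latent factors and `H` relates the latent factors to cells. The result depends on the random start, and the updates are only guaranteed to reach a local optimum.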

Metagenes

A metagene is some (mathematical) function of a group of genes (for example, a linear combination), often summarizing a property that the genes share. For example, methods like non-negative matrix factorization approximate the data matrix as the product of a gene-by-metagene matrix and a metagene-by-cell matrix.

Dimension reduction

A data transformation method that reduces the number of dimensions in the original feature space to a lower-dimensional space (usually much lower than the original one) while certain properties (for example, the distance measures between observations) of the original data are preserved.

Pseudotime

In contrast to real time, pseudotime represents computationally inferred temporal stages of a collection of cells.

Principal component analysis

A common dimension reduction method that projects the original data onto a fixed, smaller number of dimensions while minimizing the squared reconstruction error. Equivalently, this approach can be viewed as maximizing the variance retained in the projected data.
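
A minimal sketch of the idea, assuming only the first component is wanted: the code below centres the data, forms the sample covariance matrix and finds its leading eigenvector by power iteration. Power iteration is one simple way to do this and is an illustrative choice; the function name and toy data are hypothetical.

```python
import math

def first_principal_component(data, n_iter=200):
    """Return (direction, scores) of the first principal component.

    `data` is a list of observations, each a list of d feature values.
    Assumes the data are not constant and that the all-ones start vector
    is not orthogonal to the leading eigenvector.
    """
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(n_iter):
        # Multiply by the covariance matrix and renormalize.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores

# Toy data lying exactly on the line y = 2x.
points = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, scores = first_principal_component(points)
print(direction)
print(scores)
```

For these points the direction is proportional to (1, 2), and the one-dimensional scores retain all of the variance because the data are exactly collinear.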

Embedding

In mathematics, embedding is a map from one set X to another set Y, where some characteristic of X is preserved. In single-cell studies, the term embedding has been used for methods that ‘place’ cells in a new feature space, possibly of a lower dimension, such that notions of cell-to-cell distances are approximately preserved.

Dropouts

In single-cell biology, dropouts are usually the transcripts that were present in the cell but were not captured during sequencing.

Ambient RNA

In droplet-based single-cell RNA sequencing approaches, the measured mRNA molecules can be contaminated by mRNAs from other cells present in the suspension, for example, owing to cell rupture. These contaminating mRNAs are termed ambient RNA.

Multiplets

During high-throughput single-cell (or single-nucleus) isolation in droplets or similar vessels, two or more cells might be captured together creating a mixture of molecules. Computational methods have been developed to detect and remove such unwanted observations from the dataset.

Cite this article

Miao, Z., Humphreys, B.D., McMahon, A.P. et al. Multi-omics integration in the age of million single-cell data. Nat Rev Nephrol 17, 710–724 (2021). https://doi.org/10.1038/s41581-021-00463-x

  • DOI: https://doi.org/10.1038/s41581-021-00463-x
