Abstract
Rigorously comparing gene expression and chromatin accessibility in the same single cells could illuminate the logic of how coupling or decoupling of these mechanisms regulates fate commitment. Here we present MIRA, probabilistic multimodal models for integrated regulatory analysis, a comprehensive methodology that systematically contrasts transcription and accessibility to infer the regulatory circuitry driving cells along cell state trajectories. MIRA leverages topic modeling of cell states and regulatory potential modeling of individual gene loci. MIRA thereby represents cell states in an efficient and interpretable latent space, infers high-fidelity cell state trees, determines key regulators of fate decisions at branch points and exposes the variable influence of local accessibility on transcription at distinct loci. Applied to epidermal differentiation and embryonic brain development from two different multimodal platforms, MIRA revealed that early developmental genes were tightly regulated by local chromatin landscape whereas terminal fate genes were titrated without requiring extensive chromatin remodeling.
Data availability
The authors of the SHARE-seq skin study (ref. 3) provide the RNA-seq count matrix at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM4156608 and the ATAC-seq peak count matrix at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM4156597. 10X Genomics provides the RNA-seq count matrix and ATAC-seq peak count matrix of the brain dataset (ref. 14) at https://www.10xgenomics.com/resources/datasets/fresh-embryonic-e-18-mouse-brain-5-k-1-standard-2-0-0. RNA-seq and ATAC-seq count matrices used for the benchmarking study may be found at https://www.10xgenomics.com/resources/datasets/pbmc-from-a-healthy-donor-granulocytes-removed-through-cell-sorting-10-k-1-standard-2-0-0.
Code availability
MIRA is available as a Python package at https://github.com/cistrome/MIRA. Frankencell, a Python program we developed to generate synthetic differentiation trajectories for benchmarking, is available at https://github.com/AllenWLynch/frankencell-dynverse.
References
Chen, S., Lake, B. B. & Zhang, K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat. Biotechnol. 37, 1452–1457 (2019).
Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385 (2018).
Ma, S. et al. Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell 183, 1103–1116 (2020).
Zhu, C. et al. An ultra high-throughput method for single-cell joint analysis of open chromatin and transcriptome. Nat. Struct. Mol. Biol. 26, 1063–1070 (2019).
Duren, Z., Chen, X., Xin, J., Wang, Y. & Wong, W. H. Time course regulatory analysis based on paired expression and chromatin accessibility data. Genome Res. 30, 622–634 (2020).
Gayoso, A. et al. A Python library for probabilistic analysis of single-cell omics data. Nat. Biotechnol. 40, 163–166 (2022).
Gong, B., Zhou, Y. & Purdom, E. Cobolt: joint analysis of multimodal single-cell sequencing data. Genome Biol. 22, 351 (2021).
Minoura, K., Abe, K., Nam, H., Nishikawa, H. & Shimamura, T. A mixture-of-experts deep generative model for integrated analysis of single-cell multiomics data. Cell Rep. Methods 1, 100071 (2021).
Chen, H., Ryu, J., Vinyard, M., Lerer, A. & Pinello, L. SIMBA: single-cell embedding along with features. Preprint at bioRxiv https://doi.org/10.1101/2021.10.17.464750 (2021).
Lin, Y. et al. scJoint integrates atlas-scale single-cell RNA-seq and ATAC-seq data with transfer learning. Nat. Biotechnol. 40, 703–710 (2022).
Duren, Z. et al. Integrative analysis of single-cell genomics data by coupled nonnegative matrix factorizations. Proc. Natl Acad. Sci. USA 115, 7723–7728 (2018).
Lara-Astiaso, D. et al. Chromatin state dynamics during blood formation. Science 345, 943–949 (2014).
Rada-Iglesias, A. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011).
10X Genomics Datasets (10X Genomics, 2022); https://support.10xgenomics.com/single-cell-multiome-atac-gex/datasets
Blei, D. M. Probabilistic topic models. Commun. ACM 55, 77–84 (2012).
Zhao, Y., Cai, H., Zhang, Z., Tang, J. & Li, Y. Learning interpretable cellular and gene signature embeddings from single-cell transcriptomic data. Nat. Commun. 12, 5261 (2021).
Bravo González-Blas, C. et al. cisTopic: cis-regulatory topic modeling on single-cell ATAC-seq data. Nat. Methods 16, 397–400 (2019).
Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. Preprint at https://arxiv.org/abs/1312.6114 (2013).
Blei, D. M., Ng, A. Y. & Jordan, M. I. Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003).
Wang, S. et al. Target analysis by integration of transcriptome and ChIP-seq data with BETA. Nat. Protoc. 8, 2502–2515 (2013).
Qin, Q. et al. Lisa: inferring transcriptional regulators through integrative modeling of public chromatin accessibility and ChIP-seq data. Genome Biol. 21, 32 (2020).
Schneider, M. R., Schmidt-Ullrich, R. & Paus, R. The hair follicle as a dynamic miniorgan. Curr. Biol. 19, R132–R142 (2009).
Blanpain, C. & Fuchs, E. Epidermal homeostasis: a balancing act of stem cells in the skin. Nat. Rev. Mol. Cell Biol. 10, 207–217 (2009).
Byron, L. & Wattenberg, M. Stacked graphs – geometry & aesthetics. IEEE Trans. Vis. Comput. Graph. 14, 1245–1252 (2008).
Soma, T., Ogo, M., Suzuki, J., Takahashi, T. & Hibino, T. Analysis of apoptotic cell death in human hair follicles in vivo and in vitro. J. Invest. Dermatol. 111, 948–954 (1998).
Cui, C.-Y. et al. Ectodysplasin regulates the lymphotoxin-beta pathway for hair differentiation. Proc. Natl Acad. Sci. USA 103, 9142–9147 (2006).
Pan, Y. et al. gamma-secretase functions through Notch signaling to maintain skin appendages but is not required for their patterning or initial morphogenesis. Dev. Cell 7, 731–743 (2004).
Genander, M. et al. BMP signaling and its pSMAD1/5 target genes differentially regulate hair follicle stem cell lineages. Cell Stem Cell 15, 619–633 (2014).
Joost, S. et al. Single-cell transcriptomics reveals that differentiation and spatial signatures shape epidermal and hair follicle heterogeneity. Cell Syst. 3, 221–237 (2016).
Grose, R., Harris, B. S., Cooper, L., Topilko, P. & Martin, P. Immediate early genes krox-24 and krox-20 are rapidly up-regulated after wounding in the embryonic and adult mouse. Dev. Dyn. 223, 371–378 (2002).
Hildesheim, J. et al. The hSkn-1a POU transcription factor enhances epidermal stratification by promoting keratinocyte proliferation. J. Cell Sci. 114, 1913–1923 (2001).
Zeitvogel, J. et al. GATA3 regulates FLG and FLG2 expression in human primary keratinocytes. Sci. Rep. 7, 11847 (2017).
Hernández-Miranda, L. R., Parnavelas, J. G. & Chiara, F. Molecules and mechanisms involved in the generation and migration of cortical interneurons. ASN Neuro 2, e00031 (2010).
La Manno, G. et al. Molecular architecture of the developing mouse brain. Nature 596, 92–96 (2021).
Di Bella, D. J. et al. Molecular logic of cellular diversification in the mouse cerebral cortex. Nature 595, 554–559 (2021).
Esther, L.-B. et al. in GABA And Glutamate: New Developments In Neurotransmission Research 25 (InTech, 2018).
Yang, N. et al. Generation of pure GABAergic neurons by transcription factor programming. Nat. Methods 14, 621–628 (2017).
Raposo, A. A. S. F. et al. Ascl1 coordinately regulates gene expression and the chromatin landscape during neurogenesis. Cell Rep. 10, 1544–1556 (2015).
de Martin, X., Sodaei, R. & Santpere, G. Mechanisms of binding specificity among bHLH transcription factors. Int. J. Mol. Sci. 22, 9150 (2021).
Porcher, C., Medina, I. & Gaiarsa, J.-L. Mechanism of BDNF modulation in GABAergic synaptic transmission in healthy and disease brains. Front. Cell. Neurosci. 12, 273 (2018).
Mo, J. et al. Early growth response 1 (Egr-1) directly regulates GABAA receptor α2, α4, and θ subunits in the hippocampus. J. Neurochem. 133, 489–500 (2015).
Sheng, Z.-H. & Cai, Q. Mitochondrial transport in neurons: impact on synaptic homeostasis and neurodegeneration. Nat. Rev. Neurosci. 13, 77–93 (2012).
Harrington, A. J. et al. MEF2C regulates cortical inhibitory and excitatory synapses and behaviors relevant to neurodevelopmental disorders. eLife 5, e20059 (2016).
Park, N. I. et al. ASCL1 reorganizes chromatin to direct neuronal fate and suppress tumorigenicity of glioblastoma stem cells. Cell Stem Cell 21, 411 (2017).
Chen, C.-H. et al. Determinants of transcription factor regulatory range. Nat. Commun. 11, 2472 (2020).
Tritschler, S. et al. Concepts and limitations for learning developmental trajectories from single cell genomics. Development 146, dev170506 (2019).
Wagner, D. E. & Klein, A. M. Lineage tracing meets single-cell omics: opportunities and challenges. Nat. Rev. Genet. 21, 410–427 (2020).
Blei, D. M., Ng, A. Y. & Jordan, M. I. Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003).
Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).
Choi, K., Chen, Y., Skelly, D. A. & Churchill, G. A. Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics. Genome Biol. 21, 183 (2020).
Chen, E. Y. et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinform. 14, 128 (2013).
Fisher, R. A. On the Interpretation of χ2 from contingency tables, and the calculation of P. J. R. Stat. Soc. 85, 87 (1922).
Srivastava, A. & Sutton, C. Autoencoding variational inference for topic models. In Proc. 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proc. (Cornell Univ., 2017).
Egozcue, J. J., Pawlowsky-Glahn, V., Mateu-Figueras, G. & Barceló-Vidal, C. Isometric logratio transformations for compositional data analysis. Math. Geol. 35, 279–300 (2003).
Silverman, J. D., Washburne, A. D., Mukherjee, S. & David, L. A. A phylogenetic transform enhances analysis of compositional microbiota data. eLife 6, e21887 (2017).
Traag, V. A., Waltman, L. & van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233 (2019).
McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at arXiv https://doi.org/10.48550/arXiv.1802.03426 (2018).
Setty, M. et al. Characterization of cell fate probabilities in single-cell data with Palantir. Nat. Biotechnol. 37, 451–460 (2019).
Chen, C.-H. et al. Determinants of transcription factor regulatory range. Nat. Commun. 11, 2472 (2020).
Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
Yadav, A., Goldstein, T. & Jacobs, D. Making L-BFGS work with industrial-strength nets. in Proc. 31st The British Machine Vision Conference (BMVC) 7–10 September 2020 (BMVA, 2020).
Neyman, J. & Pearson, E. S. On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika 20, 175–240 (1928).
10X Genomics Datasets (10X Genomics) (accessed February 2022); https://www.10xgenomics.com/resources/datasets/pbmc-from-a-healthy-donor-granulocytes-removed-through-cell-sorting-10-k-1-standard-2-0-0
Saelens, W., Cannoodt, R., Todorov, H. & Saeys, Y. A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 37, 547–554 (2019).
Acknowledgements
We thank the X.S. Liu laboratory members, M. Oser and K. Wucherpfennig for helpful scientific discussions. This work was supported by the National Institutes of Health (NIH) grant no. U24 CA237617 to C.A.M. C.V.T. was supported by the Helen Hay Whitney Foundation Postdoctoral Fellowship and grant no. NIH T32GM007748.
Author information
Contributions
A.W.L. developed MIRA, designed analyses and analyzed the SHARE-seq dataset. C.V.T. codeveloped MIRA, designed analyses and analyzed the 10X Genomics dataset. H.W.L. and M.B. contributed to analysis design. X.S.L. and C.A.M. designed analyses and supervised the work. A.W.L., C.V.T., X.S.L. and C.A.M. wrote the manuscript. A.W.L. and C.A.M. originated the work. All authors edited and approved the manuscript.
Ethics declarations
Competing interests
M.B. is a consultant to and receives sponsored research support from Novartis. M.B. serves on the SAB of H3 Biomedicine, Kronos Bio and GV20 Oncotherapy. X.S.L. conducted the work while on the faculty at the Dana-Farber Cancer Institute and is currently a board member and CEO of GV20 Therapeutics. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Eran Mukamel, Fangming Xie and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Lin Tang, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Overview of MIRA topic model architecture.
a, The MIRA topic model uses a variational autoencoder (VAE) approach to learn stochastic mappings between observations in X-space (gene counts or peak counts in a cell), which are high-dimensional and noisy, and a simpler latent Z-space, or topic space, which lies on the simplex with a Dirichlet prior. (bottom right) The generative model relates the observations X to the estimated composition 𝞺 over features (genes or peaks), sampling from a negative binomial distribution for RNA counts and a multinomial distribution for ATAC peaks. (top right) The composition over features is given by the topic matrix 𝜷 encoding topic-feature associations and the latent topics Z of a cell, which are sampled from the distribution qφ(Z|X), the variational approximation of p𝜗(Z|X). (top left) The distribution of Z is parameterized by 𝞵 and 𝞼², outputs from the encoder neural network given the X-space observations as inputs. (bottom left) The encoder neural network for RNA data performs deviance residual featurization of counts, which are then passed through feed-forward layers. The ATAC data encoder passes binarized peak accessibility features through a deep averaging network. (Illustration adapted from Kingma and Welling, Foundations and Trends in Machine Learning, 2019). b, Ratio of the probability of medulla fate commitment to that of cortex fate commitment for each cell in the hair follicle, arranged by pseudotime. MIRA defines branch points between cell states where the probability of differentiating into one terminal state diverges from that of another. c, MIRA joint representation UMAP colored by ratio of probability of medulla fate commitment within the ORS, matrix, medulla, and cortex populations. Differentiation in the hair follicle proceeds from ORS to progenitor matrix cells, which then specify into the medulla or cortex fate. (IRS cells indicated in black are not included in this trajectory).
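The generative half of panel a can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not MIRA's implementation: the softmax link between topics and the feature composition, the Gaussian topic matrix, the Dirichlet concentration of 0.5 and all sizes are illustrative choices, and the topics are drawn from the Dirichlet prior rather than from a trained encoder. Only the multinomial (ATAC) branch is shown.

```python
import math
import random

def sample_dirichlet(alpha, rng):
    """Draw one point on the simplex by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def feature_composition(z, beta):
    """Map topics z and topic matrix beta to a composition over features.

    Assumption: a softmax over the topic-weighted feature scores.
    """
    n_features = len(beta[0])
    scores = [sum(z[k] * beta[k][j] for k in range(len(z)))
              for j in range(n_features)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_multinomial(n_reads, probs, rng):
    """Distribute n_reads reads across features according to probs."""
    counts = [0] * len(probs)
    for j in rng.choices(range(len(probs)), weights=probs, k=n_reads):
        counts[j] += 1
    return counts

rng = random.Random(0)
n_topics, n_peaks, n_reads = 3, 8, 50  # toy sizes, chosen for illustration

# Hypothetical topic-feature matrix (beta) with Gaussian entries.
beta = [[rng.gauss(0.0, 1.0) for _ in range(n_peaks)] for _ in range(n_topics)]

z = sample_dirichlet([0.5] * n_topics, rng)             # latent topics of one cell
rho = feature_composition(z, beta)                      # composition over peaks
peak_counts = sample_multinomial(n_reads, rho, rng)     # simulated ATAC peak counts
```

In the sketch, z and rho each sum to one and the simulated counts sum to the read depth, which mirrors the compositional reading of topics described in the caption.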
Extended Data Fig. 2 MIRA outperforms standard methodology for resolving cell state trajectories using expression data alone.
Benchmarking results comparing MIRA to standard methodology of Seurat PCA + Slingshot in the indicated metrics of cell state trajectory inference using expression data alone. Top row shows ground truth scaffolds, which are computationally synthesized by mixing reads from distinct populations of single cells from a 10X Genomics dataset (ref. 63) of peripheral blood mononuclear cells (PBMCs). Scaffold difficulty increases from left to right, where more difficult scaffolds contain cell states where mixture components are more similar (increased entropy), making them more difficult to distinguish by the tested lineage inference methodologies. Line plots indicate MIRA (red) versus Seurat PCA + Slingshot (blue) performance in each of the four scaffold difficulties with trials for three different mean read depths (lower read depth further increases the difficulty of solving the topology). For each trial, 5 replicates were tested for each modeling approach. Edge accuracy measures the accuracy of the inferred edges compared to ground truth (dynverse’s edge flip score; ref. 64). Branch F1 score (ref. 64) measures the precision and recall of the inferred branches compared to ground truth. Pseudotime correlation (ref. 64) measures the correlation between inferred versus ground truth pseudotime for each cell. The bottom rows show example UMAPs for MIRA or Seurat PCA + Slingshot for each scaffold difficulty with black edges showing cell state parsing from each algorithm. Cells are colored by ground truth branch assignment, where blue cells are the origin state. In the line plots above, black outlines indicate the points for the models shown in the example UMAPs.
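Two quantities in this caption lend themselves to a short illustration: the entropy of the mixture proportions used to grade scaffold difficulty, and the correlation between inferred and ground truth pseudotime. The sketch below is a simplified stand-in and not the dynverse or frankencell code; the use of Shannon entropy and of Pearson correlation, the function names and the example values are all assumptions made for illustration.

```python
import math

def mixture_entropy(proportions):
    """Shannon entropy (in nats) of mixture proportions; zero terms are skipped."""
    return -sum(p * math.log(p) for p in proportions if p > 0)

def pearson_correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# A cell built mostly from one population is easier to place than an even mix.
easy_state = [0.9, 0.1]   # hypothetical mixture proportions
hard_state = [0.5, 0.5]
entropy_easy = mixture_entropy(easy_state)
entropy_hard = mixture_entropy(hard_state)

# Hypothetical ground truth and inferred pseudotime for five cells.
true_pseudotime = [0.0, 0.2, 0.4, 0.7, 1.0]
inferred_pseudotime = [0.1, 0.15, 0.5, 0.6, 0.9]
pseudotime_score = pearson_correlation(true_pseudotime, inferred_pseudotime)
```

The even mixture has the higher entropy, matching the caption's statement that more similar mixture components make a scaffold harder to resolve.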
Extended Data Fig. 3 MIRA outperforms standard methodology for resolving cell state trajectories using accessibility data alone.
Benchmarking results comparing MIRA to standard methodology of Seurat LSI + Slingshot in the indicated metrics of cell state trajectory inference using accessibility data alone. Top row shows ground truth scaffolds with scaffold difficulty increasing from left to right. No models solved the topology of the most difficult scaffold using accessibility alone, so metric comparisons are shown for the other three scaffolds. See Extended Data Fig. 2 for description of metrics.
Extended Data Fig. 4 MIRA outperforms standard methodology for resolving cell state trajectories using both expression and accessibility data jointly.
Benchmarking results comparing MIRA joint representation to standard methodology of joint representation combining Seurat PCA of expression data and Seurat LSI of accessibility data followed by Slingshot. See Extended Data Fig. 2 for description of metrics. For expression data, mean read depth n = 4000; for accessibility data, mean read depth n = 14000.
Extended Data Fig. 5 MIRA topics describing hair follicle cells were sparse and nonredundant.
a, UMAP based on standard methodology versus MIRA topic modeling for expression or accessibility. Standard PCA-based representation of expression shows matrix population as shifted away from its predecessor ORS and descendant IRS, medulla, and cortex cells. However, MIRA topic modeling of expression appropriately represents matrix cells as an intermediate population between the aforementioned lineages. Standard LSI-based representation of accessibility shows ORS cells interjected between matrix and its descendant IRS and shows medulla situated between two separate cortex populations. Conversely, MIRA topic modeling of accessibility appropriately represents matrix cells as continuous with its descendant IRS and better separates medulla and cortex into two distinct branches. b, MIRA joint topic representation of expression and accessibility. In (a-b), colors demonstrate expression of marker genes of indicated lineages. c, MIRA expression topics e1-6 and d, MIRA accessibility topics a1-7 on joint representation UMAP. In (c-d), colored boxes correspond to topic colors as on stream graphs in Fig. 2c and Extended Data Fig. 7a.
Extended Data Fig. 6 MIRA topics described gene modules activated in each lineage.
a, Stream graph of window-averaged cell-topic compositions starting from ORS cell state, progressing rightward through pseudotime (to facilitate visualization of all lineages concurrently, pseudotime scale is not log-transformed, unlike other presented stream graphs). b, MIRA joint topic representation colored by expression of genes highly activated in each of the indicated topics, which described the activated gene modules in each lineage. c, MIRA joint topic representation colored by indicated motif scores.
Extended Data Fig. 7 Terminal medulla and cortex cells showed significantly higher NITE regulation compared to cells earlier in hair follicle differentiation.
a, MIRA joint topic representation colored by expression of Hoxc genes, indicating that Hoxc motifs activated in both the medulla and cortex accessibility topics (a5 and a6, respectively) were most attributable to Hoxc13 based on its expression in these lineages. b, Correlation matrix between expression and accessibility topics. While some topics had a clear one-to-one correlation between modalities (for example expression topic e1 with accessibility topic a1), others did not strongly correlate with a single topic from the opposing modality (for example branch accessibility topic a4). c, Comparison of motif enrichment in top peaks of preceding matrix versus subsequent branch accessibility topics (a2 and a4, respectively). While most motifs were shared between these topics, accessibility of Wnt signaling-related motifs uniquely arose at the branch. d, Distribution of NITE scores among genes expressed in the hair follicle. Scores of example LITE gene Braf and NITE gene Krt23 are indicated by arrows. e, LITE gene Braf as shown in Fig. 3c but extended to include further downstream region. As described in Fig. 3c, plot shows chromatin accessibility fragments across pseudotime (moving downwards) in trajectories from ORS to matrix to cortex or medulla. Colored bars on the right indicate the identity of cells (colored by clusters in Fig. 2a) within each bin reflected by each row of accessibility fragments. Line plots across pseudotime depict the indicated gene’s observed expression (red) and LITE model prediction of expression (black), which is informed by the local accessibility reflected in the fragment plot. f, Medulla and cortex cells showed significantly more NITE regulation than other cells in the hair follicle (data are presented as mean values +/− standard deviation; rest n = 4565, cortex/medulla n = 1607; *p < 0.05 (1.4e-13), two-sided Wilcoxon rank-sum). 
g, Genes ultimately expressed in medulla or cortex that were primed at the branch were defined as those with a NITE regulation score above the indicated thresholds that had positive chromatin differential at the branch, indicating that expression was overestimated based on local chromatin accessibility. Branch-primed genes must also be upregulated in the downstream lineage relative to matrix cells. h, Driver transcription factor analysis of non-primed medulla versus cortex genes.
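The group comparison reported in panel f is a two-sided Wilcoxon rank-sum test. A minimal sketch of that test is given here, assuming the normal approximation with average ranks for ties and no tie or continuity correction; the NITE scores are made-up numbers for illustration and the function names are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns the z statistic for group x and the two-sided P value.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    var_w = n1 * n2 * (n1 + n2 + 1) / 12
    z = (w - mean_w) / math.sqrt(var_w)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up NITE scores for two groups of cells (illustration only).
earlier_cells = [0.1, 0.2, 0.3, 0.4, 0.5]
terminal_cells = [1.1, 1.2, 1.3, 1.4, 1.5]
z_stat, p_value = rank_sum_test(earlier_cells, terminal_cells)
```

With samples as large as those in the caption (thousands of cells per group) the normal approximation is the usual choice; for very small groups an exact test would be more appropriate.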
Extended Data Fig. 8 MIRA expression topics describing IFE cells captured shared and lineage-specific states.
a, Expression of marker genes of indicated lineages on MIRA expression, accessibility, and joint topic UMAPs. b, MIRA expression topics e1-13 on joint representation UMAP.
Extended Data Fig. 9 MIRA accessibility topics describing IFE cells captured shared and lineage-specific states.
a, MIRA accessibility topics a1-15 on joint representation UMAP. Colored boxes correspond to topics indicated in Fig. 5h, which are shared or lineage-specific within the basal-spinous-granular or intermediate basal-spinous-granular differentiation trajectories as annotated in Fig. 5a,b. b, Thbs1 and c, Egr2 expression distinguished basal cells distant from the hair follicle from those within the intermediate basal-spinous-granular trajectory near the hair follicle (*p < 0.05, two-sided Wilcoxon rank-sum, Benjamini-Hochberg corrected).
Extended Data Fig. 10 Terminal granular cells were enriched for NITE regulation.
a, Stream graph of expression topic compositions of basal-spinous-granular (top) and intermediate basal-spinous-granular (bottom) lineages. b, Terminal IFE granular cells showed significantly more NITE regulation than cells earlier in the differentiation trajectory (basal and spinous cells) (data are presented as mean values +/− standard deviation; basal and spinous n = 10850, granular n = 1596; *p < 0.05 (1.5e-15), two-sided Wilcoxon rank-sum). c, Genes upregulated in granular cells that were differentially expressed between granular populations had significantly higher NITE scores than other genes (data are presented as mean values +/− standard deviation; rest n = 4641, terminal and differentially expressed granular genes n = 241; *p < 0.05 (0.041), two-sided Wilcoxon rank-sum). d, Examples of terminally upregulated, differentially expressed granular genes’ local chromatin accessibility (LITE model prediction) and expression. Despite accessibility increasing in both lineages, expression only increased in one lineage. e, Mef2c was more highly expressed in excitatory neurons, indicating that Mef2 motifs enriched in the terminal excitatory neuron topic were likely attributable to Mef2c. f, Stream graphs of expression topics across the cell state trajectory colored by NITE versus LITE regulation of the top genes in each topic. Topics describing earlier states tended towards LITE regulation with the notable exception of topic e3, which is composed of cell cycle genes that have been previously described to be regulated with minimal influence of local chromatin accessibility state (ref. 3). Topics describing terminal states tended more towards NITE regulation, including the major terminal excitatory and inhibitory neuron topics that are composed of neurotransmitter genes.
Overall, expression topics describing the excitatory and inhibitory progenitor states (labeled mixed progenitor) were significantly enriched for LITE regulation, whereas after commitment to either the excitatory or inhibitory fate, topics were significantly enriched for NITE regulation (*p < 0.05, two-sided Wilcoxon rank-sum, Benjamini-Hochberg corrected). g, Genes predicted by MIRA pISD modeling to be regulated by pioneer transcription factor Ascl1 showed significantly more LITE regulation compared to genes predicted to be regulated by non-pioneer-like Egr1 (data are presented as mean values +/− standard deviation; n = 200; *p < 0.05 (0.0464), two-sided Wilcoxon rank-sum).
Supplementary information
Supplementary Information
Supplementary Figs. 1–5 and Information.
Supplementary Tables
Supplementary Table 1 (T1) Gene set enrichments of each MIRA expression topic in the hair follicle dataset. Indicated P values by one-sided Fisher’s exact test; adjusted P values are Enrichr z-scores corrected for multiple comparisons. Supplementary Table 2 (T2) Motif enrichments of each MIRA accessibility topic in the hair follicle dataset. Supplementary Table 3 (T3) Gene set enrichments of each MIRA expression topic in the IFE dataset. Indicated P values by one-sided Fisher’s exact test; adjusted P values are Enrichr z-scores corrected for multiple comparisons. Supplementary Table 4 (T4) Motif enrichments of each MIRA accessibility topic in the IFE dataset. Supplementary Table 5 (T5) Gene set enrichments of each MIRA expression topic in the embryonic brain dataset. Indicated P values by one-sided Fisher’s exact test; adjusted P values are Enrichr z-scores corrected for multiple comparisons. Supplementary Table 6 (T6) Motif enrichments of each MIRA accessibility topic in the embryonic brain dataset.
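The unadjusted P values in the gene set enrichment tables come from a one-sided Fisher's exact test, which for enrichment is the upper tail of a hypergeometric distribution. The sketch below shows that calculation only; it is not the Enrichr code, and the function name, argument names and example counts are hypothetical.

```python
from math import comb

def enrichment_p_value(n_universe, n_set, n_query, n_overlap):
    """One-sided Fisher's exact test for over-representation.

    Probability of seeing at least n_overlap gene-set members when n_query
    genes are drawn without replacement from n_universe genes, of which
    n_set belong to the gene set.
    """
    upper = min(n_set, n_query)
    total = comb(n_universe, n_query)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 200 top genes of a topic, a 150-gene set,
# 12 genes in common, and a 20,000-gene background.
p_enriched = enrichment_p_value(20000, 150, 200, 12)
```

The expected overlap in this example is 1.5 genes, so an overlap of 12 yields a very small P value; correction for multiple comparisons across gene sets is a separate step that this sketch does not perform.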
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lynch, A.W., Theodoris, C.V., Long, H.W. et al. MIRA: joint regulatory modeling of multimodal expression and chromatin accessibility in single cells. Nat Methods 19, 1097–1108 (2022). https://doi.org/10.1038/s41592-022-01595-z
DOI: https://doi.org/10.1038/s41592-022-01595-z