A deep manifold-regularized learning model for improving phenotype prediction from multi-modal data

A preprint version of the article is available at bioRxiv.

Abstract

The phenotypes of complex biological systems are fundamentally driven by various multi-scale mechanisms. Multi-modal data, such as single-cell multi-omics data, enable a deeper understanding of underlying complex mechanisms across scales for phenotypes. We have developed an interpretable regularized learning model, deepManReg, to predict phenotypes from multi-modal data. First, deepManReg employs deep neural networks to learn cross-modal manifolds and then to align multi-modal features onto a common latent space. Second, deepManReg uses cross-modal manifolds as a feature graph to regularize the classifiers for improving phenotype predictions and also for prioritizing the multi-modal features and cross-modal interactions for the phenotypes. We apply deepManReg to (1) an image dataset of handwritten digits with multi-features and (2) single-cell multi-modal data (Patch-seq data) including transcriptomics and electrophysiology for neuronal cells in the mouse brain. We show that deepManReg improves phenotype prediction in both datasets, and also prioritizes genes and electrophysiological features for the phenotypes of neuronal cells.
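
The second step of the abstract, using aligned features as a feature graph to regularize a classifier, can be illustrated with a graph-Laplacian penalty. The following is a minimal sketch under stated assumptions, not the authors' implementation: the k-nearest-neighbour graph construction, the penalty form tr(WᵀLW), the function names and the random latent coordinates are all hypothetical choices made for illustration. The penalty is small when features that lie close together in the common latent space receive similar classifier weights.

```python
import numpy as np


def knn_feature_graph(Z, k=3):
    """Symmetric k-nearest-neighbour adjacency over the rows of Z (one row per feature)."""
    n = Z.shape[0]
    # Pairwise squared Euclidean distances between feature embeddings.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a feature is not its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)  # symmetrize: keep an edge if either end selected it


def graph_laplacian(A):
    """Unnormalized Laplacian L = D - A."""
    return np.diag(A.sum(axis=1)) - A


def laplacian_penalty(W, L):
    """tr(W^T L W) = 0.5 * sum_ij A_ij * ||W_i - W_j||^2 for symmetric A."""
    return float(np.trace(W.T @ L @ W))


rng = np.random.default_rng(0)
# Hypothetical latent coordinates after alignment (assumed, not real data):
Z_genes = rng.normal(size=(6, 2))  # 6 gene features
Z_ephys = rng.normal(size=(4, 2))  # 4 electrophysiological features
Z = np.vstack([Z_genes, Z_ephys])  # both modalities share one latent space

A = knn_feature_graph(Z, k=3)
L = graph_laplacian(A)

# Assumed classifier input-layer weights: one row per feature, one column per hidden unit.
W = rng.normal(size=(Z.shape[0], 3))
penalty = laplacian_penalty(W, L)
```

In training, a term such as `lam * penalty` (with `lam` a hypothetical tuning parameter) would be added to the classification loss. Because the graph spans both modalities, edges between a gene and an electrophysiological feature also contribute to the penalty.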

Fig. 1: DeepManReg: a deep manifold-regularized learning model for improving phenotype prediction from multi-modal data.
Fig. 2: Multi-modal feature alignment of handwritten digits.
Fig. 3: Regularized classification results for the mfeat digits dataset.
Fig. 4: The network showing the relationships across two modalities (genes and electrophysiology).
Fig. 5: Regularized classification results for single-cell multi-modal data in the mouse visual cortex.

Data availability

The multiple-features (mfeat) dataset is available from ref. 11. The Patch-seq transcriptomics data and electrophysiological data are available from ref. 12. The simulated multi-omics data and gene regulatory network (that is, the example model data of dyngen for five genes) are available from ref. 16. Source data are provided with this paper.

Code availability

Code for the deepManReg implementation and data analysis is available at https://github.com/daifengwanglab/deepManReg. An interactive version of the code base is provided in ref. 38.

References

  1. Larranaga, P. et al. Machine learning in bioinformatics. Brief. Bioinform. 7, 86–112 (2006).

  2. Subramanian, I., Verma, S., Kumar, S., Jere, A. & Anamika, K. Multi-omics data integration, interpretation and its application. Bioinform. Biol. Insights 14, 1177932219899051 (2020).

  3. Sima, C. et al. Impact of error estimation on feature selection. Pattern Recogn. 38, 2472–2482 (2005).

  4. Wang, C. & Mahadevan, S. A general framework for manifold alignment. In AAAI Fall Symposium: Manifold Learning and Its Applications 79–86 (AAAI, 2009).

  5. Nguyen, N. D., Blaby, I. K. & Wang, D. ManiNetCluster: a novel manifold learning approach to reveal the functional links between gene networks. BMC Genomics 20, 1003 (2019).

  6. Nguyen, N. D. & Wang, D. Multiview learning for understanding functional multiomics. PLoS Comput. Biol. 16, e1007677 (2020).

  7. Brorson, I. S. et al. No differential gene expression for CD4+ T cells of MS patients and healthy controls. Mult. Scler. J. Exp. Transl. Clin. 5, 2055217319856903 (2019).

  8. Ng, A. Y. Feature selection, L1 vs. L2 regularization and rotational invariance. In Proc. 21st International Conference on Machine Learning (eds Greiner, R. & Schuurmans, D.) 78 (ACM Press, 2004).

  9. Li, C. & Li, H. Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics 24, 1175–1182 (2008).

  10. Sandler, T., Blitzer, J., Talukdar, P. & Ungar, L. Regularized learning with networks of features. Adv. Neural Inf. Process. Syst. 21, 1401–1408 (2008).

  11. van Breukelen, M., Duin, R. P. W., Tax, D. M. J. & Den Hartog, J. E. Handwritten digit recognition by combined classifiers. Kybernetika 34, 381–386 (1998).

  12. Gouwens, N. W. et al. Integrated morphoelectric and transcriptomic classification of cortical GABAergic cells. Cell 183, 935–953 (2020).

  13. Wang, C. & Mahadevan, S. Manifold alignment without correspondence. In Proc. 21st International Joint Conference on Artificial Intelligence (ed. Boutilier, C.) 1273–1278 (ACM, 2009).

  14. Hotelling, H. in Breakthroughs in Statistics 162–190 (Springer, 1992).

  15. Welch, J. D., Hartemink, A. J. & Prins, J. F. MATCHER: manifold alignment reveals correspondence between single cell transcriptome and epigenome dynamics. Genome Biol. 18, 138 (2017).

  16. Cannoodt, R., Saelens, W., Deconinck, L. & Saeys, Y. Spearheading future omics analyses using dyngen, a multi-modal simulator of single cells. Nat. Commun. 12, 3942 (2021).

  17. Cadwell, C. R. et al. Multimodal profiling of single-cell morphology, electrophysiology and gene expression using Patch-seq. Nat. Protoc. 12, 2531–2553 (2017).

  18. Intrinsic Physiology Feature Extractor (IPFX) Python package (Allen Institute, 2021); https://ipfx.readthedocs.io/

  19. Santos, M. S., Soares, J. P., Abreu, P. H., Araujo, H. & Santos, J. Cross-validation for imbalanced datasets: avoiding overoptimistic and overfitting approaches [research frontier]. IEEE Comput. Intell. Mag. 13, 59–76 (2018).

  20. Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 3319–3328 (PMLR, 2017).

  21. Nguyen, N. D., Jin, T. & Wang, D. Varmole: a biologically drop-connect deep neural network model for prioritizing disease risk variants and genes. Bioinformatics 37, 1772–1775 (2021).

  22. Kokhlikyan, N. et al. Captum: a unified and generic model interpretability library for PyTorch. CoRR abs/2009.07896 (2020).

  23. Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M. & Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. 20, 61–80 (2008).

  24. Cunningham, J. P. & Ghahramani, Z. Linear dimensionality reduction: survey, insights and generalizations. J. Mach. Learn. Res. 16, 2859–2900 (2015).

  25. Boumal, N., Mishra, B., Absil, P.-A. & Sepulchre, R. Manopt, a Matlab toolbox for optimization on manifolds. J. Mach. Learn. Res. 15, 1455–1459 (2014).

  26. Sato, H. & Aihara, K. Cholesky QR-based retraction on the generalized Stiefel manifold. Comput. Opt. Appl. 72, 293–308 (2019).

  27. Fowlkes, C., Belongie, S., Chung, F. & Malik, J. Spectral grouping using the Nyström method. IEEE Trans. Pattern Anal. Mach. Intell. 26, 214–225 (2004).

  28. Belkin, M., Niyogi, P. & Sindhwani, V. On manifold regularization. In Proc. Tenth International Workshop on Artificial Intelligence and Statistics (eds Cowell, R. G. & Ghahramani, Z.) R5, 17–24 (PMLR, 2005).

  29. Ando, R. K. & Zhang, T. Learning on graph with Laplacian regularization. Adv. Neural Inf. Process. Syst. 19, 25–32 (2007).

  30. Singh Tomar, V. & Rose, R. C. Manifold regularized deep neural networks. In Proc. 15th Annual Conference of the International Speech Communication Association (eds Li, H. et al.) 348–352 (ISCA, 2014).

  31. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (ICLR, 2017).

  32. Liu, J., Huang, Y., Singh, R., Vert, J.-P. & Noble, W. S. Jointly embedding multiple single-cell omics measurements. In 19th International Workshop on Algorithms in Bioinformatics (eds Huber, K. T. & Gusfield, D.) 10:1–10:13 (WABI, 2019).

  33. Vu, H., Carey, C. & Mahadevan, S. Manifold warping: manifold alignment over time. In Proc. AAAI Conference on Artificial Intelligence Vol. 26 (eds Hoffmann, J. & Selman, B.) 1155–1161 (AAAI, 2012).

  34. Wang, C., Krafft, P., Mahadevan, S., Ma, Y. & Fu, Y. Manifold alignment. In Manifold Learning: Theory and Applications 95–120 (CRC, 2011).

  35. Belkin, M. & Niyogi, P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15, 1373–1396 (2003).

  36. Stiefel, E. Richtungsfelder und Fernparallelismus in n-dimensionalen Mannigfaltigkeiten. Commentarii Math. Helvetici 8, 305–353 (1935).

  37. Paszke, A. et al. Automatic differentiation in PyTorch. In 31st Conference on Neural Information Processing Systems (NIPS) (Workshop on Autodiff, 2017).

  38. Nguyen, N. D., Huang, J. & Wang, D. deepManReg: a deep manifold-regularized learning model for improving phenotype prediction from multi-modal data [source code] (CodeOcean, 2021); https://doi.org/10.24433/co.1706111.v1

Acknowledgements

We thank K. Huynh (Stony Brook University) for useful discussions. This work was supported by National Institutes of Health grants nos. R01AG067025, R21CA237955 and U01MH116492 to D.W. and U54HD090256 to the Waisman Center. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

D.W. and N.D.N. conceptualized the study. D.W. and N.D.N. designed the algorithm and methodology. N.D.N. and J.H. implemented software. D.W., N.D.N. and J.H. performed analysis. D.W., N.D.N. and J.H. wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Daifeng Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks James J. Cai, Bamdev Mishra and Daniel Osorio for their contribution to the peer review of this work. Handling editor: Ananya Rastogi, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental Materials

Supplementary Figs. 1–6, Algorithm and Table 1.

Peer review information

Supplementary Data 1

Prioritized top 20 genes and top 5 electrophysiological features of cell layers in the mouse visual cortex by feature importance scores of deepManReg.

Source data

Source Data Fig. 2

Numerical data for scatter plots in Fig. 2

Source Data Fig. 3

Numerical data for the plots in Fig. 3; source data for panels a, b and c are in separate folders.

Source Data Fig. 5

Numerical data for the plots in Fig. 5; source data for panels a, b and c are in separate folders.

About this article

Cite this article

Nguyen, N.D., Huang, J. & Wang, D. A deep manifold-regularized learning model for improving phenotype prediction from multi-modal data. Nat Comput Sci 2, 38–46 (2022). https://doi.org/10.1038/s43588-021-00185-x

  • DOI: https://doi.org/10.1038/s43588-021-00185-x
