
  • Perspective
  • Published:

Multimodal learning with graphs

Abstract

Artificial intelligence for graphs has achieved remarkable success in modelling complex systems, ranging from dynamic networks in biology to interacting particle systems in physics. However, increasingly heterogeneous graph datasets call for multimodal methods that can combine different inductive biases — the assumptions that algorithms use to make predictions for inputs they have not encountered during training. Learning on multimodal datasets is challenging because the inductive biases can vary by data modality and graphs might not be explicitly given in the input. To address these challenges, graph artificial intelligence methods combine different modalities while leveraging cross-modal dependencies through geometric relationships. Diverse datasets are combined using graphs and fed into sophisticated multimodal architectures, categorized as image-intensive, knowledge-grounded and language-intensive models. Using this categorization, we introduce a blueprint for multimodal graph learning, use it to study existing methods and provide guidelines to design new models.
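The core idea sketched in the abstract — projecting node features from different modalities into a shared space and then propagating information along cross-modal edges — can be illustrated with a minimal, self-contained example. This is not the authors' implementation; the feature sizes, weight matrices and edge set below are hypothetical, and the aggregation is a single mean message-passing step chosen for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multimodal graph: two "image" nodes and two "text" nodes.
# Feature dimensions are arbitrary choices for illustration.
img_feats = rng.normal(size=(2, 8))   # image-modality node features
txt_feats = rng.normal(size=(2, 5))   # text-modality node features

d = 4                                  # shared embedding dimension
W_img = rng.normal(size=(8, d))        # modality-specific projections
W_txt = rng.normal(size=(5, d))

# Project every node into a common space, then stack into one node matrix.
H = np.vstack([img_feats @ W_img, txt_feats @ W_txt])   # shape (4, d)

# Edges connect nodes within and across modalities; a small fixed set here.
edges = [(0, 2), (2, 0), (1, 3), (3, 1), (0, 1), (1, 0)]

A = np.zeros((4, 4))
for src, dst in edges:
    A[dst, src] = 1.0                  # dst aggregates messages from src

deg = A.sum(axis=1, keepdims=True)
deg[deg == 0] = 1.0                    # guard against isolated nodes

# One mean-aggregation message-passing step with a residual connection:
# each node mixes its own embedding with the mean of its neighbours'.
H_next = H + (A @ H) / deg

print(H_next.shape)  # (4, 4)
```

Real multimodal graph learning models replace the random projections with learned, modality-specific encoders and stack many such propagation steps, but the shape of the computation — unify representations, then exchange messages along the graph — is the same.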


Fig. 1: Graph-centric multimodal learning.
Fig. 2: Overview of MGL blueprint.
Fig. 3: Applications of MGL blueprint to images.
Fig. 4: Applications of MGL blueprint to language datasets.
Fig. 5: Applications of MGL to natural sciences.


Data availability

We summarize MGL methods in a continually updated table at https://yashaektefaie.github.io/mgl, where new methods are added to provide an evolving resource for the community.


Acknowledgements

Y.E., G.D. and M.Z. gratefully acknowledge the support of US Air Force Contract No. FA8702-15-D-0001, and awards from the Harvard Data Science Initiative, Amazon Research, Bayer Early Excellence in Science, AstraZeneca Research and the Roche Alliance with Distinguished Scientists. Y.E. is supported by grant T32 HG002295 from the National Human Genome Research Institute and the NDSEG fellowship. G.D. is supported by the Harvard Data Science Initiative Postdoctoral Fellowship. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funders.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Marinka Zitnik.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Jianzhu Ma, Ying Ding and Shuiwang Ji for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Notes 1 and 2.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ektefaie, Y., Dasoulas, G., Noori, A. et al. Multimodal learning with graphs. Nat Mach Intell 5, 340–350 (2023). https://doi.org/10.1038/s42256-023-00624-6

