
  • Perspective
  • Published:

Multimodal learning with graphs

Abstract

Artificial intelligence for graphs has achieved remarkable success in modelling complex systems, ranging from dynamic networks in biology to interacting particle systems in physics. However, increasingly heterogeneous graph datasets call for multimodal methods that can combine different inductive biases — the assumptions that algorithms use to make predictions for inputs they have not encountered during training. Learning on multimodal datasets is challenging because the inductive biases can vary by data modality and graphs might not be explicitly given in the input. To address these challenges, graph artificial intelligence methods combine different modalities while leveraging cross-modal dependencies through geometric relationships. Diverse datasets are combined using graphs and fed into sophisticated multimodal architectures, categorized as image-intensive, knowledge-grounded and language-intensive models. Using this categorization, we introduce a blueprint for multimodal graph learning, use it to study existing methods and provide guidelines to design new models.
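The core idea sketched in the abstract — projecting node features from different modalities into a shared space and then propagating information along cross-modal edges — can be illustrated with a minimal, self-contained example. This is not the authors' implementation; the feature sizes, weight matrices and edge set below are hypothetical, and the aggregation is a single mean message-passing step chosen for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multimodal graph: two "image" nodes and two "text" nodes.
# Feature dimensions are arbitrary choices for illustration.
img_feats = rng.normal(size=(2, 8))   # image-modality node features
txt_feats = rng.normal(size=(2, 5))   # text-modality node features

d = 4                                  # shared embedding dimension
W_img = rng.normal(size=(8, d))        # modality-specific projections
W_txt = rng.normal(size=(5, d))

# Project every node into a common space, then stack into one node matrix.
H = np.vstack([img_feats @ W_img, txt_feats @ W_txt])   # shape (4, d)

# Edges connect nodes within and across modalities; a small fixed set here.
edges = [(0, 2), (2, 0), (1, 3), (3, 1), (0, 1), (1, 0)]

A = np.zeros((4, 4))
for src, dst in edges:
    A[dst, src] = 1.0                  # dst aggregates messages from src

deg = A.sum(axis=1, keepdims=True)
deg[deg == 0] = 1.0                    # guard against isolated nodes

# One mean-aggregation message-passing step with a residual connection:
# each node mixes its own embedding with the mean of its neighbours'.
H_next = H + (A @ H) / deg

print(H_next.shape)  # (4, 4)
```

Real multimodal graph learning models replace the random projections with learned, modality-specific encoders and stack many such propagation steps, but the shape of the computation — unify representations, then exchange messages along the graph — is the same.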


Fig. 1: Graph-centric multimodal learning.
Fig. 2: Overview of MGL blueprint.
Fig. 3: Applications of MGL blueprint to images.
Fig. 4: Applications of MGL blueprint to language datasets.
Fig. 5: Applications of MGL to natural sciences.


Data availability

We summarize MGL methods in a continually updated table at https://yashaektefaie.github.io/mgl, where new methods are added to provide an evolving resource for the community.


Acknowledgements

Y.E., G.D. and M.Z. gratefully acknowledge the support of US Air Force Contract No. FA8702-15-D-0001, and awards from the Harvard Data Science Initiative, Amazon Research, Bayer Early Excellence in Science, AstraZeneca Research and the Roche Alliance with Distinguished Scientists. Y.E. is supported by grant T32 HG002295 from the National Human Genome Research Institute and the NDSEG fellowship. G.D. is supported by the Harvard Data Science Initiative Postdoctoral Fellowship. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funders.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Marinka Zitnik.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Jianzhu Ma, Ying Ding and Shuiwang Ji for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Notes 1 and 2.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ektefaie, Y., Dasoulas, G., Noori, A. et al. Multimodal learning with graphs. Nat Mach Intell 5, 340–350 (2023). https://doi.org/10.1038/s42256-023-00624-6

