Abstract
Artificial intelligence for graphs has achieved remarkable success in modelling complex systems, ranging from dynamic networks in biology to interacting particle systems in physics. However, increasingly heterogeneous graph datasets call for multimodal methods that can combine different inductive biases, that is, the assumptions that algorithms use to make predictions for inputs they have not encountered during training. Learning on multimodal datasets is challenging because the inductive biases can vary by data modality and graphs might not be explicitly given in the input. To address these challenges, graph artificial intelligence methods combine different modalities while leveraging cross-modal dependencies through geometric relationships. Diverse datasets are combined using graphs and fed into sophisticated multimodal architectures, which we categorize as image-intensive, knowledge-grounded and language-intensive models. Using this categorization, we introduce a blueprint for multimodal graph learning, use it to study existing methods and provide guidelines to design new models.
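As a rough illustration of the idea that diverse datasets can be combined using graphs, the sketch below projects features from two modalities into a shared space, links them with a cross-modal edge and runs one round of mean-aggregation message passing. It is a minimal toy under stated assumptions, not the blueprint introduced in the article: the node names, the projection matrices `W_image` and `W_text`, the helper functions `project` and `message_pass`, and the choice of mean aggregation are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def project(x: Vector, weights: List[Vector]) -> Vector:
    """Apply a linear map; each row of `weights` yields one output value."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def message_pass(
    features: Dict[str, Vector], edges: List[Tuple[str, str]]
) -> Dict[str, Vector]:
    """One round of mean aggregation over each node and its neighbours."""
    # Every node starts with a self-loop so it keeps part of its own signal.
    neighbours = {node: [node] for node in features}
    for u, v in edges:  # edges are treated as undirected
        neighbours[u].append(v)
        neighbours[v].append(u)
    updated = {}
    for node, nbrs in neighbours.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)
        ]
    return updated


# Hypothetical toy inputs: one image-region node (3-d) and one text-token node (2-d).
image_nodes = {"img_region": [1.0, 0.0, 2.0]}
text_nodes = {"txt_token": [4.0, 2.0]}

# Assumed modality-specific projections into a shared 2-d space.
W_image = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
W_text = [[1.0, 0.0], [0.0, 1.0]]

shared = {name: project(x, W_image) for name, x in image_nodes.items()}
shared.update({name: project(x, W_text) for name, x in text_nodes.items()})

# A single cross-modal edge lets information flow between the two modalities.
edges = [("img_region", "txt_token")]
fused = message_pass(shared, edges)
print(fused)
```

After one round, each node's representation mixes its own projected features with those of its cross-modal neighbour. In practice the projections would be learned and the aggregation would typically be a trained graph neural network layer rather than a fixed mean.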
Data availability
We summarize multimodal graph learning (MGL) methods in a continually updated live table at https://yashaektefaie.github.io/mgl, where new MGL methods are added to provide an evolving resource for the community.
Acknowledgements
Y.E., G.D. and M.Z. gratefully acknowledge the support of US Air Force Contract No. FA8702-15-D-0001, and awards from Harvard Data Science Initiative, Amazon Research, Bayer Early Excellence in Science, AstraZeneca Research and Roche Alliance with Distinguished Scientists. Y.E. is supported by grant T32 HG002295 from the National Human Genome Research Institute and the NSDEG fellowship. G.D. is supported by the Harvard Data Science Initiative Postdoctoral Fellowship. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funders.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Jianzhu Ma, Ying Ding and Shuiwang Ji for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Notes 1 and 2.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ektefaie, Y., Dasoulas, G., Noori, A. et al. Multimodal learning with graphs. Nat Mach Intell 5, 340–350 (2023). https://doi.org/10.1038/s42256-023-00624-6
DOI: https://doi.org/10.1038/s42256-023-00624-6