Multimodal large language models have been recognized as a historic milestone in artificial intelligence and have demonstrated revolutionary potential not only in commercial applications but also in many scientific fields. Here we give a brief overview of multimodal large language models through the lens of bioimage analysis and discuss how we, as a community, could build these models to facilitate biology research.
Acknowledgements
S.Z. is supported by the National Science and Technology Major Project of China (No. 2022ZD0117801). J.C. is funded by the Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF) in Germany under the funding reference 161L0272, and also supported by the Ministry of Culture and Science of the State of North Rhine-Westphalia (Ministerium für Kultur und Wissenschaft des Landes Nordrhein-Westfalen, MKW NRW).
Author information
Contributions
S.Z. proposed the idea and paper framework, and G.D., T.H. & J.C. joined the discussion. All the authors wrote, edited and gave final approval to the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Zhang, S., Dai, G., Huang, T. et al. Multimodal large language models for bioimage analysis. Nat Methods 21, 1390–1393 (2024). https://doi.org/10.1038/s41592-024-02334-2
DOI: https://doi.org/10.1038/s41592-024-02334-2