Abstract
ChatGPT, an artificial intelligence (AI) chatbot built on large language models (LLMs), has rapidly gained popularity. The benefits and limitations of this transformative technology have been discussed across various fields, including medicine. The widespread availability of ChatGPT has enabled clinicians to study how these tools could be used for a variety of tasks, such as generating differential diagnosis lists, organizing patient notes, and synthesizing literature for scientific research. LLMs have shown promising capabilities in ophthalmology: they have performed well on the Ophthalmic Knowledge Assessment Program, provided fairly accurate responses to questions about retinal diseases, and generated differential diagnosis lists. The technology currently has limitations, including the propensity of LLMs to “hallucinate”, or confidently generate false information; their potential role in perpetuating biases in medicine; and the challenge of incorporating LLMs into research without allowing “AI-plagiarism” or the publication of false information. In this paper, we provide a balanced overview of what LLMs are and introduce some of the LLMs that have been developed in the past few years. We discuss recent literature evaluating the role of these language models in medicine, with a focus on ChatGPT. The field of AI is fast-paced, and new applications based on LLMs are being developed rapidly; it is therefore important for ophthalmologists to be aware of how this technology works and how it may impact patient care. Here, we discuss the benefits, limitations, and future advancements of LLMs in patient care and research.
Acknowledgements
The authors would like to thank their colleagues who helped test SightBot.
Funding
This research has received no external funding from any agency.
Author information
Contributions
Writing: NK, SS, JO, JC; Review and Editing: NK, SS, JO, JC; Final Approval of Manuscript: NK, SS, JO, JC.
Ethics declarations
Competing interests
The custom-generated chatbot mentioned in this article, SightBot, was developed by Suvansh Sanjeev and Nikita Kedia under advisement from Jay Chhablani, and was used as a demo for Suvansh’s company, Brilliantly AI (https://brilliantly.ai). The chatbot is not revenue generating.
About this article
Cite this article
Kedia, N., Sanjeev, S., Ong, J. et al. ChatGPT and Beyond: An overview of the growing field of large language models and their use in ophthalmology. Eye 38, 1252–1261 (2024). https://doi.org/10.1038/s41433-023-02915-z
DOI: https://doi.org/10.1038/s41433-023-02915-z