Comparison of GPT-3.5, GPT-4, and human user performance on a practice ophthalmology written examination

The data that support the findings of this study are available from the corresponding author upon reasonable request.



The authors thank Rohaid Ali, MD, of Brown University and Ian Connolly, MD, MS, of Massachusetts General Hospital for their contributions to this study’s design.


John Lin was awarded departmental funding from Brown University for expenses related to this study.

All authors were responsible for conceptualization and research design; JCL, DNY, and SSK were involved in data acquisition and research execution; JCL, DNY, and OYT conducted the data analysis; all authors worked on data interpretation and manuscript preparation.

Correspondence to Ingrid U. Scott.

The authors declare no competing interests.

Lin, J.C., Younessi, D.N., Kurapati, S.S. et al. Comparison of GPT-3.5, GPT-4, and human user performance on a practice ophthalmology written examination. Eye 37, 3694–3695 (2023).

