Tehran University of Medical Sciences

Science Communicator Platform

Performance of ChatGPT in Diagnosis of Corneal Eye Diseases
Publisher: PubMed

Delsoz M1 ; Madadi Y1 ; Raja H1 ; Munir WM2 ; Tamm B2 ; Mehravaran S3 ; Soleimani M4, 5 ; Djalilian A4 ; Yousefi S1, 6
Authors

Author Affiliations
  1. Department of Ophthalmology, Hamilton Eye Institute, University of Tennessee Health Science Center, Memphis, TN, United States
  2. Department of Ophthalmology and Visual Sciences, University of Maryland School of Medicine, Baltimore, MD, United States
  3. Department of Biology, School of Computer, Mathematical, and Natural Sciences, Morgan State University, Baltimore, MD, United States
  4. Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago, IL, United States
  5. Eye Research Center, Farabi Eye Hospital, Tehran University of Medical Sciences, Tehran, Iran
  6. Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States

Source: Cornea, Published: 2024


Abstract

Purpose: The aim of this study was to assess the capabilities of ChatGPT-4.0 and ChatGPT-3.5 in diagnosing corneal eye diseases from case reports and to compare their performance with that of human experts.

Methods: We randomly selected 20 cases of corneal disease, including corneal infections, dystrophies, and degenerations, from a publicly accessible online database maintained by the University of Iowa. We input the text of each case description into ChatGPT-4.0 and ChatGPT-3.5 and asked for a provisional diagnosis. We then evaluated the responses against the correct diagnoses, compared them with the diagnoses made by 3 corneal specialists (human experts), and evaluated interobserver agreement.

Results: The provisional diagnosis accuracy of ChatGPT-4.0 was 85% (17 of 20 cases correct), whereas the accuracy of ChatGPT-3.5 was 60% (12 of 20 cases correct). The accuracy of the 3 corneal specialists, compared with ChatGPT-4.0 and ChatGPT-3.5, was 100% (20 cases; P = 0.23, P = 0.0033), 90% (18 cases; P = 0.99, P = 0.6), and 90% (18 cases; P = 0.99, P = 0.6), respectively. The interobserver agreement between ChatGPT-4.0 and ChatGPT-3.5 was 65% (13 cases), whereas the interobserver agreement between ChatGPT-4.0 and the 3 corneal specialists was 85% (17 cases), 80% (16 cases), and 75% (15 cases), respectively. The interobserver agreement between ChatGPT-3.5 and each of the 3 corneal specialists was 60% (12 cases).

Conclusions: The diagnostic accuracy of ChatGPT-4.0 across various corneal conditions was markedly better than that of ChatGPT-3.5 and is promising for potential clinical integration. A balanced approach that combines artificial intelligence-generated insights with clinical expertise will be key to unveiling its full potential in eye care. © 2024 Lippincott Williams and Wilkins. All rights reserved.
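The two metrics reported above, diagnostic accuracy against the correct diagnosis and pairwise interobserver (percent) agreement, can be sketched as follows. This is a minimal illustration, not the study's actual analysis code; the diagnosis labels and the four toy cases are hypothetical (the study used 20 case reports).

```python
# Hypothetical sketch of the evaluation: each grader (a ChatGPT model or a
# corneal specialist) gives one provisional diagnosis per case; accuracy is
# measured against the correct diagnoses, and agreement is measured pairwise.

def accuracy(diagnoses, truth):
    """Fraction of cases where a grader's diagnosis matches the correct one."""
    correct = sum(d == t for d, t in zip(diagnoses, truth))
    return correct / len(truth)

def percent_agreement(grader_a, grader_b):
    """Fraction of cases where two graders give the same diagnosis."""
    same = sum(a == b for a, b in zip(grader_a, grader_b))
    return same / len(grader_a)

# Toy example with 4 cases; all labels are illustrative.
truth   = ["HSV keratitis", "Fuchs dystrophy", "Keratoconus", "Acanthamoeba"]
gpt4    = ["HSV keratitis", "Fuchs dystrophy", "Keratoconus", "Fungal keratitis"]
expert1 = ["HSV keratitis", "Fuchs dystrophy", "Keratoconus", "Acanthamoeba"]

print(accuracy(gpt4, truth))             # 0.75
print(percent_agreement(gpt4, expert1))  # 0.75
```

In the study, comparisons of accuracy between graders were additionally tested for statistical significance (the reported P values); a chance-corrected statistic such as Cohen's kappa is a common alternative to raw percent agreement.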