Tehran University of Medical Sciences

Science Communicator Platform

Comparative Analysis of Deep Learning Architectures for Thyroid Eye Disease Detection Using Facial Photographs



Aghajani A (1); Rajabi MT (1); Rafizadeh SM (1); Zand A (1); Rezaei M (2); Shojaeinia M (3); Rahmanikhah E (1)
Author Affiliations
  1. Department of Oculo-Facial Plastic and Reconstructive Surgery, Farabi Eye Hospital, Tehran University of Medical Sciences, Qazvin Square, Tehran, Iran
  2. Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
  3. Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Source: BMC Ophthalmology. Published: 2025


Abstract

Purpose: To compare two artificial intelligence (AI) models, residual neural networks ResNet-50 and ResNet-101, for screening thyroid eye disease (TED) using frontal face photographs, and to test these models under clinical conditions.

Methods: A total of 1601 face photographs were obtained. These photographs were preprocessed by cropping to a region centered around the eyes. For the deep learning process, photographs from 643 TED patients and 643 healthy individuals were used for training the ResNet models. Additionally, 81 photographs of TED patients and 74 of normal subjects were used as the validation dataset. Finally, 80 TED cases and 80 healthy subjects comprised the test dataset. For application tests under clinical conditions, data from 25 TED patients and 25 healthy individuals were utilized to evaluate the non-inferiority of the AI models, with general ophthalmologists and fellowships as the control group.

Results: In the test set verification of the ResNet-50 AI model, the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, and specificity were 0.94, 0.88, 0.64, and 0.92, respectively. For the ResNet-101 AI model, these metrics were 0.93, 0.84, 0.76, and 0.92, respectively. In the application tests under clinical conditions, to evaluate the non-inferiority of the ResNet-50 AI model, the AUC, accuracy, sensitivity, and specificity were 0.82, 0.82, 0.88, and 0.76, respectively. For the ResNet-101 AI model, these metrics were 0.91, 0.84, 0.92, and 0.76, respectively, with no statistically significant differences between the two models for any of the metrics (all p-values > 0.05).

Conclusions: Face image-based TED screening using ResNet-50 and ResNet-101 AI models shows acceptable accuracy, sensitivity, and specificity for distinguishing TED from healthy subjects.

© The Author(s) 2025.
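For readers less familiar with the four metrics reported in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC are conventionally computed for a binary screening task. It is not the authors' evaluation code: the function names `screening_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are all illustrative assumptions, and the sketch assumes both classes are present in the data.

```python
def screening_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels.

    Assumed encoding (hypothetical): 1 = TED, 0 = healthy.
    Assumes at least one positive and one negative case.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # TED correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # TED missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration (not from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = screening_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason a model can show similar AUC but different sensitivity.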