Isfahan University of Medical Sciences

Science Communicator Platform

Biased Deep Learning Methods in Detection of COVID-19 Using CT Images: A Challenge Mounted by Subject-Wise-Split ISFCT Dataset



Authors
  1. Parsarad S1, 2
  2. Saeedizadeh N1, 3
  3. Soufi GJ4
  4. Shafieyoon S4
  5. Hekmatnia F5
  6. Zarei AP5
  7. Soleimany S4
  8. Yousefi A4
  9. Nazari H4
  10. Torabi P4
  11. S Milani A6
  12. Madani Tonekaboni SA7
  13. Rabbani H1
  14. Hekmatnia A4
  15. Kafieh R1, 8
Author Affiliations
  1. Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, JM76+5M3, Iran
  2. Law, Economics, and Data Science Group, Department of Humanities, Social and Political Science, ETH Zurich, Zurich, 8092, Switzerland
  3. Institute for Intelligent Systems Research and Innovation, Deakin University, Melbourne, 3125, VIC, Australia
  4. Department of Radiology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, JM76+5M3, Iran
  5. St. George’s Hospital, London, SW17 0RE, United Kingdom
  6. School of Engineering, University of British Columbia, Kelowna, V1V 1V7, BC, Canada
  7. Cyclica Inc, Toronto, M5J 1A7, ON, Canada
  8. Department of Engineering, Durham University, Durham, DH1 3LE, United Kingdom

Source: Journal of Imaging. Published: 2023


Abstract

Accurate detection of respiratory system damage, including COVID-19, is considered one of the crucial applications of deep learning (DL) models using CT images. However, the main shortcomings of published works have been unreliable reported accuracy and a lack of repeatability on new datasets. Both stem mainly from slice-wise data splits, in which data shared across the training and test sets makes the two sets dependent. We introduce a new dataset of CT images (the ISFCT Dataset) with labels indicating the subject-wise split, so that our DL algorithms can be trained and tested in an unbiased manner. We also use this dataset to validate the real performance of published works under a subject-wise data split. Another key feature of the dataset is that it provides more specific labels (eight characteristic lung features) rather than being limited to COVID-19 and healthy labels. We show that the high accuracy reported for existing models on current slice-wise splits is not repeatable for subject-wise splits, and we demonstrate distribution differences between data splits using t-distributed stochastic neighbor embedding (t-SNE). We also show that, under subject-wise data splitting, less complicated models achieve results competitive with existing complicated models, demonstrating that complex models do not necessarily generate accurate and repeatable results. © 2023 by the authors.
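The difference between the two splitting strategies described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy subject identifiers, and the 20% test fraction are all hypothetical, chosen only to show how a slice-wise split lets the same subject appear in both sets while a subject-wise split does not.

```python
import random


def subject_wise_split(subject_ids, test_fraction=0.2, seed=0):
    """Split slice indices so that no subject contributes to both sets."""
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def slice_wise_split(n_slices, test_fraction=0.2, seed=0):
    """Split slice indices at random, ignoring which subject each slice came from."""
    idx = list(range(n_slices))
    rng = random.Random(seed)
    rng.shuffle(idx)
    n_test = max(1, round(n_slices * test_fraction))
    return idx[n_test:], idx[:n_test]


def shared_subjects(subject_ids, train_idx, test_idx):
    """Return the subjects that appear in both the training and test sets."""
    train = {subject_ids[i] for i in train_idx}
    test = {subject_ids[i] for i in test_idx}
    return train & test


# Hypothetical toy data: 10 subjects with 30 CT slices each.
subject_ids = [f"subject_{s:02d}" for s in range(10) for _ in range(30)]

sw_train, sw_test = subject_wise_split(subject_ids)
sl_train, sl_test = slice_wise_split(len(subject_ids))

# Subject-wise: no overlap by construction.
print("subject-wise overlap:", len(shared_subjects(subject_ids, sw_train, sw_test)))
# Slice-wise: subjects leak across the split.
print("slice-wise overlap:", len(shared_subjects(subject_ids, sl_train, sl_test)))
```

Any subject reported in the slice-wise overlap has near-identical neighbouring slices on both sides of the split, which is the kind of dependency the abstract identifies as a source of inflated accuracy.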