Tehran University of Medical Sciences

Science Communicator Platform

Classification of Potential Breast/Colorectal Cancer Cases Using Machine Learning Methods



Jafarpour M (1); Moeini A (1); Maryami N (2); Nahvijou A (3); Mohammadian A (4)
Author Affiliations
  1. Department of Algorithms and Computation, School of Engineering Sciences, College of Engineering, University of Tehran, Tehran, Iran
  2. Student Research Committee, Zanjan University of Medical Sciences, Zanjan, Iran
  3. Cancer Research Center, Cancer Institute of Iran, Tehran University of Medical Sciences, Tehran, Iran
  4. Department of Information Technology Management, Faculty of Management, University of Tehran, Iran

Source: International Journal of Cancer Management. Published: 2023


Abstract

Background: The algorithmic classification of affected and healthy individuals by gene expression has been a topic of interest to researchers in numerous domains, including cancer. Previous studies have proposed solutions such as neural networks and support vector machines (SVMs) to classify a diverse range of cancer cases. The accuracy of such classifiers depends heavily on the optimization approach and the choice of kernel.

Objectives: This study aimed to propose a machine learning method that classifies cancer-prone and healthy cases for breast cancer and colorectal cancer (CRC) efficiently and with improved accuracy.

Methods: This study presented an algorithm to identify individuals prone to breast cancer and CRC. The novelty of the algorithm lies in its kernel and its feature extraction approach. The algorithm first identified the genes closely associated with these cancers and then used an SVM to find individuals susceptible to them. The study highlighted genes whose expression is indirectly associated with these cancers, which may indicate complications in patients' health status. To this end, the algorithm combines SVMs with k-fold cross-validation.

Results: The approach outperformed commonly used neural networks. The algorithm's identification accuracy was 98.077% for breast cancer and 99.806% for CRC. A graphic representation of the cause-effect relationships was also provided to help researchers better understand the progression of cancer and other diseases.

Conclusions: The feature extraction method strongly affects classification accuracy. In addition, relying on the expression of genes that indirectly trigger disease highlights a cause-effect relationship between genes and diseases. Such relationships can form Markov models in the clinical domain, leading to treatment paths and prediction of patient outcomes.

© 2023, Author(s).
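
For readers unfamiliar with the workflow the abstract describes (feature selection, an SVM classifier, and k-fold validation), the following is a minimal sketch using scikit-learn. It is not the authors' algorithm: the abstract does not specify their kernel or feature extraction method, so the RBF kernel, the univariate `SelectKBest` filter, the synthetic data, and the function name `cross_validated_svm_accuracy` are all assumptions made for illustration.

```python
# Illustrative sketch only. Synthetic data stands in for gene-expression
# profiles; the kernel and feature selector are assumed, not the paper's.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def cross_validated_svm_accuracy(X, y, n_genes=20, n_folds=5, seed=0):
    """Return the per-fold accuracy of a feature-selection + SVM pipeline."""
    # Putting scaling and selection inside the pipeline means they are
    # refit on each training fold, so held-out samples never influence
    # which features are kept.
    model = make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=n_genes),        # assumed stand-in selector
        SVC(kernel="rbf", C=1.0, gamma="scale"),  # assumed kernel choice
    )
    folds = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=folds, scoring="accuracy")


# Hypothetical data: 200 samples, 500 "genes", 15 of them informative.
X, y = make_classification(
    n_samples=200,
    n_features=500,
    n_informative=15,
    n_redundant=0,
    random_state=0,
)
scores = cross_validated_svm_accuracy(X, y)
print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

Selecting features inside each fold, rather than once on the full dataset, avoids an optimistic bias in the reported accuracy. This matters when the number of genes far exceeds the number of samples.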