Tehran University of Medical Sciences

Science Communicator Platform

Modeling Epidemiology Data With Machine Learning Technique to Detect Risk Factors for Gastric Cancer



Authors: Mohammadnezhad K (1); Sahebi MR (1); Alatab S (2); Sadjadi A (3)

Affiliations
  1. Department of Photogrammetry and Remote Sensing, Faculty of Geodesy and Geomatics Engineering, K. N. Toosi University of Technology, Tehran, 19967-15433, Iran
  2. Digestive Disease Research Center, Digestive Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran
  3. Digestive Oncology Research Center, Digestive Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran

Source: Journal of Gastrointestinal Cancer. Published: 2024


Abstract

Purpose: Gastric cancer (GC) ranks as the 7th most common cancer worldwide and is a leading cause of cancer mortality. In Iran, stomach malignancies are the most common fatal cancers, with an incidence higher than the world average. In recent years, methods such as machine learning, which combine health data with computational power and learning capacity, have attracted considerable attention for the prediction and diagnosis of diseases. In this study, we aimed to model GC data to find risk factors and identify GC cases in the Golestan Cohort Study (GCS), using gradient boosting as a machine learning technique.

Methods: Since the GC class (280) was much smaller than the non-GC class (49,467), the Synthetic Minority Oversampling Technique (SMOTE) was used to balance the dataset. Seventy percent of the data was used to train the gradient boosting algorithm and identify the factors affecting gastric cancer, and the remaining 30% was used for accuracy assessment.

Results: Our results indicated that, out of 19 factors, age, socioeconomic status, tea temperature, body mass index, gender, and education were the top six effective factors, with impact rates of 0.24, 0.16, 0.13, 0.13, and 0.07, respectively. The trained model correctly classified 70 out of 72 GC patients in the test set.

Conclusion: The results indicate that this model can effectively detect gastric cancer by utilizing important risk factors, thus avoiding the need for invasive procedures. The model's performance is reliable when provided with an adequate amount of input data, and as the dataset expands, its accuracy and generalization improve significantly. Overall, the trained system's success stems from its ability to identify risk factors and detect cancer patients.

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.
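The oversampling step described in the Methods can be illustrated with a minimal, standard-library-only sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function names (`smote`, `sensitivity`), the neighbour count `k=5`, and the two-feature toy data are assumptions made for illustration; the study itself used 19 factors and a gradient boosting classifier, which is not reproduced here. The final line only restates the reported test-set result (70 of 72 GC cases detected) as a sensitivity.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority rows.

    Hypothetical helper: a simplified SMOTE, not the authors' code.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (Euclidean), excluding itself
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def sensitivity(true_positives, actual_positives):
    """Fraction of actual positive cases that the classifier detected."""
    return true_positives / actual_positives


# Toy data with two made-up numeric features (assumed, not study data)
data_rng = random.Random(42)
minority = [[data_rng.gauss(60, 5), data_rng.gauss(22, 2)] for _ in range(10)]
majority = [[data_rng.gauss(50, 8), data_rng.gauss(25, 3)] for _ in range(100)]

# Generate enough synthetic minority rows to match the majority class size
synthetic = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + synthetic

print(len(balanced_minority), len(majority))  # 100 100
print(round(sensitivity(70, 72), 3))          # 0.972
```

The abstract does not state whether oversampling was applied before or after the 70/30 split. A common precaution is to oversample only the training portion so that the test set contains real cases only.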