Tehran University of Medical Sciences

Science Communicator Platform

A Novel Dynamic Bayesian Network Approach for Data Mining and Survival Data Analysis



Sheidaei A (1); Foroushani AR (1); Gohari K (2); Zeraati H (1)
Author Affiliations
  1. Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Pour Sina St., Keshavarz Blvd., Tehran, 14176-13151, Iran
  2. Department of Biostatistics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran

Source: BMC Medical Informatics and Decision Making. Published: 2022


Abstract

Background: Censoring is the primary challenge in survival modeling, especially in human health studies. Classical methods are limited either in their range of application, like the Kaplan–Meier estimator, or by restrictive assumptions, like the Cox regression model. Machine learning algorithms, on the other hand, commonly rely on high-dimensional data and ignore censoring. They are also more difficult to understand and use. We propose a novel approach based on the Bayesian network to address these issues.
Methods: We proposed a two-slice temporal Bayesian network model for survival data, introducing the survival and censoring status at each observed time as the dynamic states. The structure of the directed acyclic graph was learned with a score-based algorithm, and the parameters were learned with a likelihood-based approach. We conducted a simulation study to assess the performance of our model in comparison with the Kaplan–Meier estimator and Cox proportional hazards regression. We defined various scenarios according to the sample size, the censoring rate, and the shapes of the survival and censoring distributions across time. Finally, we fit the model to a real-world dataset of 760 patients who underwent gastrectomy for gastric cancer. The model was validated with the hold-out technique, based on the posterior classification error. We compared the performance of our survival model with that of the Kaplan–Meier and Cox proportional hazards models.
Results: The simulation study shows that the dynamic Bayesian network (DBN) reduces bias in many scenarios compared with Cox regression and Kaplan–Meier, especially at late survival times. In the real-world data, the structure of the dynamic Bayesian network model was consistent with the findings from the classical Kaplan–Meier and Cox regression approaches. The posterior classification error obtained in validation did not exceed 0.04, indicating that our network predicted the state variables with more than 96% accuracy.
Conclusions: Our proposed dynamic Bayesian network model could be used as a data mining technique in the context of survival data analysis. The advantages of this approach are feature selection ability, straightforward interpretation, handling of high-dimensional data, and few assumptions. © 2022, The Author(s).
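
As a rough illustration of two ideas mentioned in the Methods, the sketch below (Python, standard library only) computes the Kaplan–Meier product-limit estimate that serves as a classical baseline, and shows one possible way to label a subject as alive, dead, or censored at each point of a time grid. The function names, the time grid, and the three-state labels are assumptions made for illustration. The abstract does not specify how the authors encoded the dynamic states, so this should not be read as their implementation.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times  -- observed follow-up time for each subject
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every subject leaves the risk set at its own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Survival drops only at times where an event was observed.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Censored subjects shrink the risk set without lowering S(t).
        at_risk -= exits[t]
    return curve


def to_state_sequence(time, event, grid):
    """Label one subject at each grid time (hypothetical encoding).

    A subject is 'alive' before its observed time; from that time on
    it is 'dead' if the event was observed, otherwise 'censored'.
    """
    states = []
    for g in grid:
        if g < time:
            states.append("alive")
        elif event == 1:
            states.append("dead")
        else:
            states.append("censored")
    return states


if __name__ == "__main__":
    # Toy data, invented for this example only.
    times = [1, 2, 2, 3, 4]
    events = [1, 1, 0, 1, 0]
    for t, s in kaplan_meier(times, events):
        print(f"t={t}  S(t)={s:.3f}")
    print(to_state_sequence(2, 0, grid=[1, 2, 3]))
```

A per-time-slice state sequence of this kind is the sort of input a two-slice temporal network could be trained on, since each pair of consecutive grid points supplies one transition between states. Whether the authors used a fixed grid or the observed event times is not stated in the abstract.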