Tehran University of Medical Sciences

Science Communicator Platform

Machine Learning-Based Predictive Modeling of Angina Pectoris in an Elderly Community-Dwelling Population: Results From the PoCOsteo Study



Authors: Shahrokh Mousavi; Zahrasadat Jalalian; Sima Afrashteh; Akram Farhadi; Iraj Nabipour; Bagher A. Larijani

Source: PLOS ONE Published: 2025


Abstract

Background Angina pectoris, a comparatively common complaint among older adults, is a critical warning sign of underlying coronary heart disease. We aimed to develop machine learning-based models using multiple algorithms to predict angina pectoris and identify its predictors in an elderly community-dwelling population.
Methods Medical records of 2000 participants in the PoCOsteo study between 2018 and 2021 were analyzed. The Rose Angina Questionnaire (RAQ) was used to indicate angina pectoris. Preprocessing was performed using imputation and scaling methods. We developed the following models: logistic regression (LR), multilayer perceptron (MLP), support vector machine (SVM), k-nearest neighbors (KNN), linear and quadratic discriminant analysis (LDA, QDA), decision tree (DT), and two ensemble models: random forest (RF) and adaptive boosting (AdaBoost). To address model complexity and parameter uncertainty, we performed hyperparameter tuning, applied tenfold cross-validation, and compared the trade-offs between model performance and interpretability. To measure each feature's contribution to the models’ performance, we applied permutation feature importance.
Results Participants had a mean age of 62.15 years (± 8.07), and 57.1% were female. Of the participants, 88.4% did not have angina, 3.6% had probable angina, and 8% had definite angina. The bivariate analysis revealed significant correlations between RAQ results and several other variables. LDA, RF, and LR had the highest mean values of the area under the receiver operating characteristic curve (AUC): 0.772, 0.770, and 0.764, respectively. These three models outperformed QDA (AUC 0.752), SVM (0.733), AdaBoost (0.726), KNN (0.697), MLP (0.697), and DT (0.644). Permutation feature importance identified a handful of features implicating thrombotic vascular diseases, congestive heart failure, renal failure, and anemia.
Discussion Our study demonstrated that LDA, RF, and LR not only provided strong predictive performance but also balanced model complexity with interpretability. The superior performance of these models could be largely attributed to their ability to capture the relevant linear, nonlinear, and interaction effects inherent in the clinical data, as well as the clinical relevance of key predictors like thrombotic vascular diseases, congestive heart failure, renal failure, and anemia. Future studies could incorporate more direct diagnostic methods to test our findings further and enhance the robustness of the predictive models developed. © 2025 Elsevier B.V., All rights reserved.
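
The abstract does not include code, so the following is only a minimal sketch of the general idea behind permutation feature importance scored by AUC. It is not the authors' implementation. The data are synthetic, and the names `roc_auc`, `permutation_importance`, and `predict` are hypothetical helpers introduced for illustration. The `predict` function stands in for any fitted classifier that returns a risk score.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Return the baseline AUC and, per feature, the mean AUC drop after shuffling that column."""
    rng = random.Random(seed)
    baseline = roc_auc(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - roc_auc(y, [predict(row) for row in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic, hypothetical data: feature 0 drives the label, feature 1 is pure noise.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.7 else 0 for row in X]


# Hypothetical stand-in for a fitted classifier: it scores a row by feature 0 only.
def predict(row):
    return row[0]


baseline, importances = permutation_importance(predict, X, y)
print(f"baseline AUC: {baseline:.3f}")
for j, imp in enumerate(importances):
    print(f"feature {j}: mean AUC drop {imp:.3f}")
```

A feature the model relies on shows a large drop in AUC when shuffled, while a feature the model ignores shows none. In practice the scores would normally be computed on held-out data, for example within each cross-validation fold, and library routines such as scikit-learn's `sklearn.inspection.permutation_importance` provide the same idea for fitted estimators.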