Tehran University of Medical Sciences

Science Communicator Platform

Two-Level Breast Tumor Classification With Multi-View Imaging and Pathology-Aware Large Language Models: A Fuzzy k-NN and Ensemble Learning Framework



Authors: Sharafi Y; Teshnehlab M; Sadighi N

Source: Applied Soft Computing. Published: 2026


Abstract

Breast cancer is one of the most prevalent cancers among women, and its accurate and timely diagnosis requires intelligent decision-making algorithms. This study proposes a two-level framework that combines mammography images with pathological textual data to improve the accuracy of classifying benign and malignant tumors. At the first level, the framework generates three independent representations of each original mammography image (an HSV color-space view, a heatmap, and a thermal color map) to provide a multi-view representation of the target tissue. For each view independently, it extracts classical texture features such as local binary patterns (LBP), gray-level co-occurrence matrix (GLCM) features, and histograms of oriented gradients (HOG), along with deep features derived from pre-trained neural networks. Principal component analysis (PCA) is then applied separately to each image view to reduce dimensionality and suppress statistical noise. The reduced outputs are fed into a fuzzy k-nearest neighbors (Fuzzy k-NN) classifier, which labels each sample as healthy or suspicious based on its imaging features. At the second level, the framework uses textual data to analyze only the samples flagged as suspicious at the first level. The pathology report corresponding to each sample is linguistically preprocessed (cleaning and normalization) and then passed to an aggregation-based approach employing large language models for feature extraction. The resulting semantic vectors undergo PCA-based dimensionality reduction before being fed into a Fuzzy k-NN classifier for final tumor classification (benign/malignant). Experimental results demonstrate that the proposed method achieves high accuracy, shows robustness against noisy data, and effectively manages borderline cases. These findings highlight its potential as a reliable clinical decision-support system for breast cancer diagnosis. © 2026 Elsevier B.V.
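The two-level decision flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which Fuzzy k-NN variant is used, so the sketch assumes inverse-distance-weighted memberships with crisp training labels and a fuzzifier `m`. The function names `fuzzy_knn` and `two_level_predict` are hypothetical, and the toy 2-D vectors stand in for the PCA-reduced image and text features.

```python
import math


def fuzzy_knn(train_X, train_y, x, k=3, m=2.0, eps=1e-9):
    """Return {label: membership} for x (assumed inverse-distance weighting).

    m is the fuzzifier and must be greater than 1; eps avoids division by zero.
    """
    # Keep the k training samples closest to x (Euclidean distance).
    neighbors = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    exponent = 2.0 / (m - 1.0)
    weights = [(1.0 / (d ** exponent + eps), label) for d, label in neighbors]
    total = sum(w for w, _ in weights)
    # Each class membership is its share of the total neighbor weight.
    memberships = {label: 0.0 for label in set(train_y)}
    for w, label in weights:
        memberships[label] += w / total
    return memberships


def two_level_predict(image_vec, text_vec, level1, level2, k=3):
    """Level 1 screens on image features; level 2 runs only if suspicious."""
    m1 = fuzzy_knn(level1[0], level1[1], image_vec, k=k)
    if max(m1, key=m1.get) == "healthy":
        return "healthy", m1
    m2 = fuzzy_knn(level2[0], level2[1], text_vec, k=k)
    return max(m2, key=m2.get), m2


# Toy data (hypothetical): stand-ins for PCA-reduced feature vectors.
image_train = (
    [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]],
    ["healthy", "healthy", "suspicious", "suspicious"],
)
text_train = (
    [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]],
    ["benign", "benign", "malignant", "malignant"],
)

label_a, memb_a = two_level_predict([0.15, 0.15], [0.95, 0.05], image_train, text_train)
label_b, memb_b = two_level_predict([0.85, 0.85], [0.95, 0.05], image_train, text_train)
print(label_a, memb_a)
print(label_b, memb_b)
```

Because the classifier returns graded memberships rather than a hard vote, a sample whose memberships are close to each other can be identified as borderline, which is one plausible reading of how such a framework handles the borderline cases mentioned in the abstract.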