Isfahan University of Medical Sciences

Science Communicator Platform

Machine Learning-Based Optimization for N-Hexane Removal Prediction From Air Streams in Biofilter: A Focus on Interpretability and Feature Interactions



Authors: M Baziar MANSOUR ; Ak Behnami Ali KARIMZADEH ; N Jafari NEGAR ; Mtg Mokhtari Mehdi Taghi GHANEIAN ; Y Hajizadeh YAGHOUB ; A Abdolahnejad ALI

Source: Environmental Technology and Innovation
Published: 2025


Abstract

N-Hexane, a volatile organic compound (VOC), is commonly used as a solvent in industries such as cleaning, printing, and food processing. Occupational exposure, however, poses significant health risks, including neuronal damage and impaired motor coordination. This study uses the Spider Monkey Optimization (SMO) algorithm to improve machine learning (ML) models for predicting n-hexane removal from air streams in a biofilter. SMO improves model performance through hyperparameter tuning. Four hybrid models were evaluated: LSSVM (Least Squares Support Vector Machine)-SMO, CatBoost (Categorical Boosting)-SMO, RF (Random Forest)-SMO, and XGB (eXtreme Gradient Boosting)-SMO. The results demonstrate SMO's capability to enhance accuracy and generalization in predicting n-hexane removal. Among the four models, XGB-SMO performed best. With n_spiders = 20, n_iterations = 50, alpha = 0.5, and beta = 0.5, it reached an R² of 1.0000, an NSE of 1.0000, and an MSE of 0.0007 on the training set, and an R² of 0.9947, an NSE of 0.9947, and an MSE of 2.9603 on the testing set, indicating exceptional accuracy and generalizability. Feature importance analysis using SHAP (SHapley Additive exPlanations) identified EBRT (s) as the most influential factor, followed by biosurfactant concentration (BC) (mg L−1), inlet loading rate (IL) (g m−3 h−1), temperature (°C), and pH, with pH showing minimal impact. The SHAP summary plot emphasized EBRT (s) and BC (mg L−1) as critical factors, illustrating the ability of XGB-SMO to capture complex data relationships. These findings highlight the robustness of XGB-SMO in accurately predicting n-hexane removal and the effectiveness of SMO in optimizing ML models. © 2025 Elsevier B.V., All rights reserved.
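
The abstract reports three goodness-of-fit metrics (R², NSE, and MSE) without defining them. The sketch below shows one common way to compute them in plain Python. It is an illustration under stated assumptions, not the authors' code: the function names and the sample removal-efficiency values are hypothetical, and the abstract does not say which definition of R² was used. The identical R² and NSE values it reports are consistent with R² computed as the coefficient of determination, which has the same formula as the Nash-Sutcliffe efficiency (NSE); a squared Pearson correlation would generally give a different number.

```python
from typing import Sequence


def mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error: average of the squared residuals."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def nse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SS_res / SS_tot.

    This is the same formula as the coefficient of determination (R^2)
    when R^2 is computed from residuals rather than from correlation.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def r2_pearson(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation, an alternative definition of R^2."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov ** 2 / (var_obs * var_pred)


# Hypothetical removal efficiencies (%), made up for illustration only.
observed = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 69.0, 79.0, 92.0]

print(f"MSE          = {mse(observed, predicted):.4f}")
print(f"NSE          = {nse(observed, predicted):.4f}")
print(f"R² (Pearson) = {r2_pearson(observed, predicted):.4f}")
```

Unlike R² and NSE, MSE carries the squared units of the target variable, so its magnitude has to be read against the scale of the values being predicted.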