DSpace Repository

Web-based Stroke Prediction Using Explainable Machine Learning

dc.contributor.author Ordoñez, Marie Ashley C.
dc.date.accessioned 2025-08-15T01:50:51Z
dc.date.available 2025-08-15T01:50:51Z
dc.date.issued 2025-07
dc.identifier.uri http://dspace.cas.upm.edu.ph:8080/xmlui/handle/123456789/3135
dc.description.abstract Stroke is a cerebrovascular disease caused by an infarction or hemorrhage in the brain that can leave irreversible tissue damage, loss of neurons, and physiological damage. It was the third leading cause of death in the Philippines as of 2023, with a higher incidence in younger adults. Because of gaps in stroke care, including shortages of neurologists, diagnostic machines, and stroke protocols, there is a need for a transparent decision support tool for diagnosing stroke that is easily accessible and feasible for community-based programs. Only a few local tools use explainable AI (XAI) to predict stroke incidence based on modifiable and nonmodifiable factors, and most stroke prediction models do not incorporate XAI. This study used several machine learning techniques to develop a stroke incidence classifier for integration into a web application. The models used were Random Forest (RF), Support Vector Machine (SVM), XGBoost (XGB), 1D Convolutional Neural Network (CNN), and EasyEnsemble Classifier (EEC). Missing values were imputed using KNN imputation and mode imputation; numerical variables were Z-scaled; categorical variables were one-hot encoded or ordinal encoded; and three imbalance handling methods were explored and compared: Random Undersampling (RUS), SMOTE-NC, and SMOTE-RUS. Hyperparameter tuning with 10-fold stratified cross-validation was then used in an attempt to improve model performance. Results showed that EEC with RUS and hybrid imputation was the best model, with 91.72% recall and 0.4797 AUCPR. Shapley Additive Explanations (SHAP) results showed that age was the most important feature in the model, with old age increasing stroke incidence by at most 12%, followed by higher average glucose level, which increased it by at most 2%.
Finally, the model was integrated into a web application with a Local Interpretable Model-agnostic Explanations (LIME) explainer, which shows local feature importance for every prediction and so establishes transparency between the model and its users. en_US
dc.subject Stroke Prediction en_US
dc.subject Machine Learning en_US
dc.subject Explainable AI (XAI) en_US
dc.subject Shapley Additive Explanations (SHAP) en_US
dc.subject Local Interpretable Model-Agnostic Explanations (LIME) en_US
dc.subject Convolutional Neural Network (CNN) en_US
dc.title Web-based Stroke Prediction Using Explainable Machine Learning en_US
dc.type Thesis en_US
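
The abstract names Random Undersampling (RUS) as an imbalance handling method and reports recall and AUCPR as evaluation metrics. The sketch below illustrates these three ideas on toy data using only the Python standard library. It is not the thesis's code: the function names (`random_undersample`, `recall`, `average_precision`) and the toy labels are hypothetical, the thesis presumably relied on library implementations, and the AUCPR here is the simple step-wise average precision, assuming no tied scores.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class has as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample without replacement down to the minority-class size.
        for row in rng.sample(members, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def average_precision(y_true, scores, positive=1):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(1 for t in y_true if t == positive)
    if n_pos == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, (_, t) in enumerate(ranked, start=1):
        if t == positive:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


# Toy, made-up data: 2 positive cases among 20 records.
toy_rows = list(range(20))
toy_labels = [1] * 2 + [0] * 18
balanced_rows, balanced_labels = random_undersample(toy_rows, toy_labels)
print(Counter(balanced_labels))
print(recall([1, 1, 0, 0], [1, 0, 0, 1]))
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

On heavily imbalanced data a classifier can score high accuracy while missing most positive cases, which is one reason to report recall and AUCPR instead, as the abstract does.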