Brain Stroke Prediction using Explainable Machine Learning and Time Series Feature Engineering

Published: March 24, 2026

Overview

IEEE Xplore 21 January 2025 Publisher: IEEE

Detailed Description

Abstract


A brain stroke occurs when blood flow to part of the brain is interrupted or reduced. It is a critical medical condition that demands timely detection to prevent severe outcomes, including permanent paralysis and death. This research focuses on predicting brain stroke using machine learning (ML) and Explainable Artificial Intelligence (XAI). We employ a comprehensive dataset covering parameters associated with brain stroke, such as age, Body Mass Index (BMI), average glucose level, and smoking status, along with related conditions such as hypertension and heart disease. Our objective is to develop an automated system that can accurately predict the likelihood of a brain stroke. To achieve this, we use several ML classification models: Extreme Gradient Boosting (XGB), Random Forest (RF), K-Nearest Neighbors (KNN), Gradient Boosting Decision Trees (GBDT), Adaptive Boosting (AdaB), and Categorical Boosting (CatB). We also incorporate time series feature engineering (TSFE) techniques, such as Lag Features (LF) and Rolling Statistics (RS), to enhance prediction accuracy. Using a cross-validation (CV) approach, we find that the XGB model with Rolling Statistics (XGB-RS) is the top performer, with average metrics of 98.06% accuracy, 90.57% precision, 88.46% recall, 99.12% specificity, and an F1 score of 89.4%. We further evaluate the model with a Confusion Matrix (CM) and a Receiver Operating Characteristic (ROC) curve. To interpret these outcomes, we employ Local Interpretable Model-agnostic Explanations (LIME) and Shapley Additive Explanations (SHAP) to identify the factors that contribute to the predictions. This research provides valuable insights into improving long-term health outcomes, reducing mortality rates, and supporting healthcare professionals in managing brain stroke.
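
For readers unfamiliar with the terms, the following is a minimal Python sketch of what lag features, rolling statistics, and the five reported metrics generally mean. It is not the authors' implementation: the abstract does not state how records were ordered or which window sizes were used, so the row ordering, the window of 2, the function names, and all numbers below are illustrative assumptions rather than values from the paper.

```python
def lag_feature(values, lag=1):
    """Shift a sequence by `lag` steps; positions without history become None."""
    if lag < 1:
        raise ValueError("lag must be >= 1")
    return [None if i < lag else values[i - lag] for i in range(len(values))]


def rolling_mean(values, window=2):
    """Mean over the current value and the previous `window - 1` values."""
    if window < 1:
        raise ValueError("window must be >= 1")
    out = []
    for i in range(len(values)):
        if i < window - 1:
            out.append(None)  # not enough history for a full window yet
        else:
            chunk = values[i - window + 1 : i + 1]
            out.append(sum(chunk) / window)
    return out


def classification_metrics(tp, fp, tn, fn):
    """Derive the five metrics named in the abstract from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical glucose readings, assumed to be in a meaningful order.
glucose = [90.0, 110.0, 100.0, 130.0]
glucose_lag1 = lag_feature(glucose, lag=1)
glucose_roll2 = rolling_mean(glucose, window=2)

# Made-up confusion-matrix counts, NOT the paper's results.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Specificity, tn / (tn + fp), is reported alongside recall because stroke datasets are typically imbalanced, and accuracy alone can look high even when positive cases are missed.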