Authors: Esha Chowdhury (Dhaka University of Engineering & Technology Gazipur, Bangladesh)
Abstract: Accurate prediction of cardiovascular disease (CVD) risk is crucial for healthcare institutions. This study addresses the growing prevalence of diabetes and its strong link to heart disease by proposing an efficient CVD risk prediction model for diabetic patients using machine learning (ML) and hybrid deep learning (DL) approaches. The BRFSS dataset was preprocessed by removing duplicates, handling missing values, identifying categorical and numerical features, and applying Principal Component Analysis (PCA) for feature extraction. Several ML models, including Decision Trees (DT), Random Forest (RF), k-Nearest Neighbors (KNN), Support Vector Machine (SVM), AdaBoost, and XGBoost, were implemented, with XGBoost achieving the highest accuracy among the ML models (0.9050). Various DL models, such as Artificial Neural Networks (ANN), Deep Neural Networks (DNN), Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), and Gated Recurrent Unit (GRU), as well as hybrid models combining CNN with LSTM, BiLSTM, and GRU, were also explored. Some of these models achieved perfect recall (1.00), and the LSTM model achieved the highest accuracy among the DL models, also 0.9050. Our research highlights the effectiveness of ML and DL models in predicting CVD risk among diabetic patients and their potential to automate and enhance clinical decision-making. High accuracy and F1 scores demonstrate these models’ potential to improve personalized risk management and preventive strategies.
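The abstract reports accuracy, recall, and F1 side by side. As a reminder of how those three metrics relate, here is a minimal sketch in plain Python. The labels and predictions are invented for illustration and are not taken from the paper or the BRFSS data; the helper names (`Confusion`, `confusion`, `accuracy`, `precision`, `recall`, `f1`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts (1 = at risk, 0 = not at risk)."""
    tp: int
    fp: int
    fn: int
    tn: int


def confusion(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return Confusion(tp, fp, fn, tn)


def accuracy(c):
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c):
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c):
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


def f1(c):
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical toy data: 20 patients, 4 of them truly at risk.
y_true = [1, 1, 1, 1] + [0] * 16
# A made-up classifier that catches every positive but raises 2 false alarms.
y_pred = [1, 1, 1, 1] + [1, 1] + [0] * 14

c = confusion(y_true, y_pred)
print(f"accuracy={accuracy(c):.4f} precision={precision(c):.4f} "
      f"recall={recall(c):.4f} f1={f1(c):.4f}")
```

In this toy case every true positive is caught, so recall is 1.00, while the two false alarms pull accuracy and precision below 1. This is why recall is usually read together with precision or F1 rather than on its own.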
| Comments: | 24 pages, 6 tables, and 8 figures |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2511.04971 [cs.LG] |
| (or arXiv:2511.04971v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2511.04971 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Esha Chowdhury
[v1] Fri, 7 Nov 2025 04:14:30 UTC (505 KB)