Enhancing Cardiovascular Disease Prediction Accuracy through an Ensemble Machine Learning Approach
Abstract
This study evaluates an ensemble machine learning approach for predicting cardiovascular diseases (CVDs): a Voting Classifier that combines Decision Tree, k-Nearest Neighbors, and Gaussian Naive Bayes classifiers. The model was tested with 5-fold cross-validation on a dataset of 70,000 clinical records, and its average accuracy, precision, recall, and F1-score all exceeded 99%. These findings support the hypothesis that ensemble models, because they combine multiple learning algorithms, provide higher prediction accuracy and reliability than single-predictor models. The research confirms the effectiveness of ensemble methods in medical diagnostics and highlights their potential to support decision-making in clinical settings. Because the model identified various stages of cardiovascular conditions with high accuracy, it has significant implications for early intervention and personalized patient management. Future research should validate these results across more diverse populations and explore additional predictive factors that could broaden the model's applicability. The study contributes to computational health research by demonstrating how machine learning techniques can be applied effectively to predict health outcomes.
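The voting and scoring steps named in the abstract can be illustrated with a minimal sketch. It assumes hard (majority) voting, which the abstract does not specify, and it uses ten hypothetical labels and predictions rather than any study data. The helper names `hard_vote`, `binary_metrics`, and `k_fold_indices` are illustrative only; model training is omitted, so the per-fold scores here reuse fixed predictions, whereas in real cross-validation each fold is predicted by models trained on the remaining folds.

```python
from collections import Counter


def hard_vote(*per_model_predictions):
    """Return the majority label for each sample across all models."""
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*per_model_predictions)
    ]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Hypothetical labels and per-model predictions (assumed, not study data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
tree_pred = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]
knn_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
gnb_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]

ensemble_pred = hard_vote(tree_pred, knn_pred, gnb_pred)

# Compare each single model with the voted ensemble on all ten records.
for name, pred in [("tree", tree_pred), ("knn", knn_pred),
                   ("gnb", gnb_pred), ("ensemble", ensemble_pred)]:
    print(name, binary_metrics(y_true, pred))

# Average the ensemble's accuracy over five folds, as in 5-fold evaluation.
fold_accuracies = [
    binary_metrics([y_true[i] for i in fold],
                   [ensemble_pred[i] for i in fold])["accuracy"]
    for fold in k_fold_indices(len(y_true), k=5)
]
print("mean fold accuracy:", sum(fold_accuracies) / len(fold_accuracies))
```

In this toy case the three models err on different records, so the majority vote corrects each individual mistake; that is the mechanism the ensemble hypothesis relies on, not evidence for the reported results. In practice, a library implementation such as scikit-learn's `VotingClassifier` would typically replace the hand-written voting function.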