Advancing Healthcare Diagnostics: A Study on Gaussian Naive Bayes Classification of Blood Samples

Ihwana As'ad, Universitas Muslim Indonesia

Keywords: Gaussian Naive Bayes, Machine Learning, Health Prediction, Blood Samples, Medical Diagnostics, Biomedical Informatics


This paper analyzes the Gaussian Naive Bayes (GNB) classifier as a tool for predicting health conditions from blood samples, using a handcrafted dataset built to represent typical physiological ranges. The model is evaluated with 5-fold cross-validation and scored on accuracy, precision, recall, and F1-score. It performs well on all four metrics, and its predictive performance improves consistently across successive folds. A confusion matrix gives a more detailed view of how the model classifies each class. The results support the research hypotheses and indicate that the GNB classifier is a reliable and effective diagnostic tool on this dataset. Given the growing need for rapid and accurate medical diagnostics, the findings suggest that even simple machine learning models can augment traditional blood test analysis, which is a useful contribution to biomedical informatics. The study lays groundwork for integrating machine learning into clinical settings, and it calls for verifying these results on real-world clinical data and for comparing GNB against other machine learning models. Automated, precise diagnostics of this kind could improve patient care and make better use of healthcare resources.
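
The evaluation protocol described in the abstract (a Gaussian Naive Bayes classifier scored with 5-fold cross-validation on accuracy, precision, recall, F1-score, and a confusion matrix) can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the two features (labeled hemoglobin and glucose in the comments), their means and spreads, the two-class setup, and all function names are assumptions made up for this example, and the synthetic data is not the paper's dataset.

```python
import math
import random


def fit_gnb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The 1e-9 floor guards against zero variance (an assumption of this sketch).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(y)), means, variances)
    return model


def predict_gnb(model, x):
    """Pick the class with the highest log posterior, treating features as independent."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=log_posterior)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1, and a [[TN, FP], [FN, TP]] confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1, "confusion": [[tn, fp], [fn, tp]]}


def cross_validate(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and score each held-out fold in turn."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    results = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        model = fit_gnb([X[i] for i in train_idx], [y[i] for i in train_idx])
        predictions = [predict_gnb(model, X[i]) for i in test_idx]
        results.append(binary_metrics([y[i] for i in test_idx], predictions))
    return results


def make_synthetic_samples(n_per_class=100, seed=42):
    """Hypothetical data: class 0 and class 1 differ in two made-up blood features."""
    rng = random.Random(seed)
    X, y = [], []
    # (hemoglobin mean, glucose mean) per class; illustrative values only.
    for label, (hb_mean, glucose_mean) in enumerate([(14.0, 90.0), (10.0, 160.0)]):
        for _ in range(n_per_class):
            X.append([rng.gauss(hb_mean, 1.0), rng.gauss(glucose_mean, 15.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_samples()
    for fold, scores in enumerate(cross_validate(X, y, k=5), start=1):
        print(fold, scores)
```

Each fold is trained and scored independently of the others, so per-fold scores in a sketch like this reflect how the shuffled data happened to be split rather than any learning carried over from one fold to the next. In practice a library implementation such as scikit-learn's `GaussianNB` would normally replace the hand-written classifier.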

