Comparison of Classification Algorithm Performance for Diabetes Prediction Using Orange Data Mining

  • Hafiz Aryan Siregar, Universitas Islam Negeri Sultan Syarif Kasim Riau
  • Muhammad Zacky Raditya, Universitas Islam Negeri Sultan Syarif Kasim Riau
  • Aditya Nugraha Yesa, Universitas Islam Negeri Sultan Syarif Kasim Riau
  • Inggih Permana, Universitas Islam Negeri Sultan Syarif Kasim Riau

Keywords: Data mining, Diabetes, KNN, Naive Bayes, Random Forest, Classification

Abstract

Diabetes contributes to a relatively high mortality rate and is a widespread health problem globally. This research aims to predict whether individuals have diabetes, using the publicly available Diabetes Disease dataset from the UCI Repository. To identify the best classification algorithm, three algorithms commonly used for diabetes prediction are compared: KNN, Naive Bayes, and Random Forest. The comparison indicates that Random Forest is the most accurate of the three for predicting diabetes, with an accuracy of 97%.
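
The comparison described above can be sketched in code. The study itself was carried out in Orange Data Mining; the example below is only an illustrative stand-in that uses scikit-learn instead, with synthetic data in place of the UCI diabetes dataset. The function name `compare_classifiers`, the 10-fold cross-validation, the feature scaling before KNN, and all hyperparameters are assumptions made for illustration and are not taken from the paper, so the printed accuracies will not match the reported 97%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_classifiers(X, y, folds=10, seed=0):
    """Return the mean cross-validated accuracy of KNN, Naive Bayes, and Random Forest.

    Hypothetical helper: the fold count and hyperparameters are assumed,
    not the settings used in the study.
    """
    models = {
        # KNN is distance based, so features are standardized first (an assumption).
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "Naive Bayes": GaussianNB(),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic binary-classification data standing in for the real dataset.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)

scores = compare_classifiers(X, y)

# Print the classifiers from most to least accurate.
for name, accuracy in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {accuracy:.3f}")
```

To run the same comparison on real data, replace `X` and `y` with the feature matrix and the diabetic/non-diabetic labels loaded from the dataset file.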

Published
2023-12-31
How to Cite
Hafiz Aryan Siregar, Muhammad Zacky Raditya, Aditya Nugraha Yesa, & Inggih Permana. (2023). Comparison of Classification Algorithm Performance for Diabetes Prediction Using Orange Data Mining. Indonesian Journal of Data and Science, 4(3), 176-182. https://doi.org/10.56705/ijodas.v4i3.103