Application of the K-Nearest Neighbor Classification Method to a Dataset of Diabetes Patients
Abstract
Diabetes is a long-lasting, or chronic, disease characterized by blood sugar (glucose) levels that are high or above the normal range. If diabetes is not well controlled, ... Testing the performance of different methods on a dataset is one way to choose a suitable classification method. The problem addressed in this study is how to measure the performance of a classification method when handling a dataset of diabetes patients. The method used is the K-Nearest Neighbor (KNN) algorithm, which classifies an object according to the training data that lie closest to it. The final results show a highest accuracy of 39% at K=3, a highest precision of 65% at K=3 and K=5, a highest recall of 36% at K=3, and a highest F-measure of 46% at K=3.
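The classification rule and the four reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the unweighted majority vote, the two features (glucose and BMI), the toy data, and the function names knn_predict and binary_metrics are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query (assumed metric).
    neighbors = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], query))
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure


# Hypothetical toy data: (glucose, BMI) pairs; 1 = diabetic, 0 = not diabetic.
train_X = [(85, 22.0), (90, 24.5), (95, 23.0),
           (150, 31.0), (160, 33.5), (170, 35.0)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(88, 23.0), (155, 32.0), (120, 28.0), (100, 30.0)]
test_y = [0, 1, 1, 0]

# Evaluate two of the K values mentioned in the abstract.
for k in (3, 5):
    predictions = [knn_predict(train_X, train_y, x, k) for x in test_X]
    print(k, predictions, binary_metrics(test_y, predictions))
```

An odd K avoids tied votes when there are only two classes. Because the sketch uses raw feature values, the feature with the larger numeric range dominates the distance. Scaling the features first is a common remedy, though the abstract does not say whether the study did so.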
Copyright (c) 2020 Indonesian Journal of Data and Science
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.