Performance Analysis of the Decision Tree Classification Algorithm on the Water Quality and Potability Dataset
Abstract
Ensuring water potability is paramount for public health and safety. This research assessed the efficacy of the Decision Tree classification algorithm in predicting water potability using the Water Quality and Potability dataset. Under 5-fold cross-validation, the model achieved only modest performance, with an average accuracy of approximately 54.33%. While the Decision Tree provides an interpretable baseline for classification, the results emphasize the need for further exploration using more complex models or ensemble methods. This study contributes to the broader effort of applying machine learning techniques to water quality assessment and provides insight into the potential and limitations of such models in predicting water safety.
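The evaluation procedure described in the abstract (a decision tree scored by 5-fold cross-validation, reporting the mean accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are synthetic stand-ins for the Water Quality and Potability dataset, the tree is a small Gini-based learner written in plain Python, and every name here (`build_tree`, `cross_validate`, `max_depth=3`, the random seeds) is an assumption made for illustration only.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    n = len(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring feature values
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue  # skip degenerate splits
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Grow a tree; leaves are class labels, inner nodes are (feature, threshold, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority
    split = best_split(X, y)
    if split is None:
        return majority
    _, f, t = split
    left_idx = [i for i in range(len(y)) if X[i][f] <= t]
    right_idx = [i for i in range(len(y)) if X[i][f] > t]
    return (
        f,
        t,
        build_tree([X[i] for i in left_idx], [y[i] for i in left_idx], depth + 1, max_depth),
        build_tree([X[i] for i in right_idx], [y[i] for i in right_idx], depth + 1, max_depth),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, k=5, max_depth=3):
    """Train on k-1 folds, test on the held-out fold, and return the k accuracies."""
    scores = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        tree = build_tree(
            [X[i] for i in train_idx], [y[i] for i in train_idx], max_depth=max_depth
        )
        correct = sum(predict(tree, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


# Synthetic binary data (assumption): three numeric features, noisy label.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [1 if row[0] + 0.5 * rng.gauss(0, 1) > 0 else 0 for row in X]

scores = cross_validate(X, y, k=5)
mean_accuracy = sum(scores) / len(scores)
print("fold accuracies:", [round(s, 3) for s in scores])
print(f"mean accuracy: {mean_accuracy:.4f}")
```

The reported figure in a study of this kind corresponds to `mean_accuracy`, the average of the five held-out fold scores. Because the data here are synthetic, the printed numbers say nothing about the 54.33% result above.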

Copyright (c) 2024 Indonesian Journal of Data and Science

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.