Performance Analysis of the Decision Tree Classification Algorithm on the Water Quality and Potability Dataset
DOI: https://doi.org/10.56705/ijodas.v4i3.113
Keywords: Decision Tree, Water Quality, Potability, Machine Learning, Cross-validation, Environmental Science
Abstract
Ensuring that water is potable is essential to public health and safety. This study assessed how well the Decision Tree classification algorithm predicts water potability on the Water Quality and Potability dataset. Under 5-fold cross-validation, the model showed moderate performance, with an average accuracy of approximately 54.33%. Although the Decision Tree offers an interpretable baseline for classification, the results point to the need for further work with more complex models or ensemble methods. The study contributes to the broader effort to apply machine learning to water quality assessment and offers insight into the potential and limitations of such models for predicting water safety.
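The evaluation protocol named in the abstract, a decision tree scored by 5-fold cross-validation and summarized as an average accuracy, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the two features and the labeling rule are placeholders, and the small Gini-based tree is written with the standard library only so that the fold mechanics are visible. It is not the authors' code, and the accuracy it prints is unrelated to the 54.33% reported above. In practice, a library such as scikit-learn (`DecisionTreeClassifier` with `cross_val_score`) would normally be used instead.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score, n = None, gini(y), len(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, depth=0, max_depth=3):
    """Grow a depth-limited tree; a leaf is the majority class label."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    f, t = split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def cross_val_accuracy(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and return one accuracy per fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        tree = build_tree([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(tree, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores

# Synthetic stand-in data (NOT the Water Quality and Potability dataset):
# two made-up numeric features and a made-up labeling rule on the first one.
rng = random.Random(42)
X = [[rng.uniform(0, 14), rng.uniform(0, 500)] for _ in range(200)]
y = [1 if 6.5 <= row[0] <= 8.5 else 0 for row in X]

scores = cross_val_accuracy(X, y, k=5)
mean_accuracy = sum(scores) / len(scores)
print("per-fold accuracy:", [round(s, 3) for s in scores])
print("average accuracy:", round(mean_accuracy, 3))
```

Each sample is held out exactly once, so the average over the five fold scores is the figure a study of this kind reports as its cross-validated accuracy.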
License
Authors retain copyright and full publishing rights to their articles. Upon acceptance, authors grant Indonesian Journal of Data and Science a non-exclusive license to publish the work and to identify itself as the original publisher.
Self-archiving. Authors may deposit the submitted version, accepted manuscript, and version of record in institutional or subject repositories, with citation to the published article and a link to the version of record on the journal website.
Commercial permissions. Uses intended for commercial advantage or monetary compensation are not permitted under CC BY-NC 4.0. For permissions, contact the editorial office at ijodas.journal@gmail.com.
Legacy notice. Some earlier PDFs may display “Copyright © [Journal Name]” or only a CC BY-NC logo without the full license text. To ensure clarity, the authors maintain copyright, and all articles are distributed under CC BY-NC 4.0. Where any discrepancy exists, this policy and the article landing-page license statement prevail.