Performance Analysis of the Decision Tree Classification Algorithm on the Pneumonia Dataset

  • Ahmad Naswin, Universitas Megarezky
  • Adityo Permana Wibowo, Universitas Teknologi Yogyakarta

Keywords: Decision Tree, Pneumonia, X-ray classification, Canny segmentation, Hu moments, Cross-validation


Advances in machine learning have opened new approaches to medical imaging diagnostics. This study evaluates the Decision Tree classification algorithm for distinguishing normal X-ray images from those diagnosed with pneumonia. The dataset consists of pediatric X-rays obtained from the Guangzhou Women and Children’s Medical Center. Before classification, each image was pre-processed with the Canny segmentation technique, and features were then extracted using Hu moments. Evaluated with 5-fold cross-validation, the classifier achieved an average accuracy of 82.72%. These findings indicate that Decision Trees are useful for this specialized diagnostic task and that systematic pre-processing plays an important role in the result. As medical diagnostics move steadily towards automation, this research offers insights and a benchmark for future work applying machine learning in healthcare.
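
The feature-extraction step named in the abstract can be illustrated with a minimal sketch. It computes the seven Hu invariant moments of a small binary edge map in pure Python, using the standard definitions. The function name `hu_moments` and the toy `edge_map` are illustrative assumptions, not the study's implementation, and the edge map stands in for the output of a Canny step that is not reproduced here.

```python
from typing import List, Sequence


def hu_moments(image: Sequence[Sequence[float]]) -> List[float]:
    """Return the seven Hu invariant moments of a 2-D intensity grid."""
    # Raw moments m00, m10, m01 give the total mass and the centroid.
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    if m00 == 0:
        return [0.0] * 7  # empty image: no shape information
    cx, cy = m10 / m00, m01 / m00

    def eta(p: int, q: int) -> float:
        # Central moment mu_pq, normalised by m00^(1 + (p + q) / 2)
        # so that the result is invariant to translation and scale.
        mu = sum(((x - cx) ** p) * ((y - cy) ** q) * v
                 for y, row in enumerate(image)
                 for x, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]


# Toy binary edge map (1 = edge pixel); a hypothetical stand-in for a
# Canny-processed X-ray, which would be far larger in practice.
edge_map = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
features = hu_moments(edge_map)  # seven-element feature vector
```

Each image then yields one seven-element feature vector for the classifier. Libraries such as OpenCV (`cv2.Canny`, `cv2.HuMoments`) and scikit-learn (`DecisionTreeClassifier`, `cross_val_score` with `cv=5`) provide equivalent building blocks; the abstract does not state which tools the study used.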


