Comparison Analysis of Classification Model Performance in Lung Cancer Prediction Using Decision Tree, Naive Bayes, and Support Vector Machine

  • Dewi Widyawati Universitas Muslim Indonesia
  • Amaliah Faradibah Universitas Muslim Indonesia
  • Putri Lestari Lokapitasari Belluano Universitas Muslim Indonesia

Keywords: Decision Tree Classifier, Naïve Bayes Classifier, Support Vector Machine, Classification, Performance Comparison, Prediction

Abstract

This study compares the performance of three classification models (Decision Tree Classifier, Support Vector Machine, and Naive Bayes Classifier) in predicting lung cancer on the "Lung Cancer Prediction" dataset. Performance was evaluated with accuracy, weighted precision, weighted recall, and weighted F1. As a preliminary step, exploratory data analysis (EDA) and dataset preprocessing were carried out, including feature selection, data cleaning, and data transformation. On the test data, the Decision Tree Classifier and the Naive Bayes Classifier performed similarly, with high accuracy, precision, recall, and F1 values. The Support Vector Machine was also competitive, although its weighted precision was slightly lower. An outlier analysis using box plots showed 2 outlier values for the Decision Tree Classifier, 4 for the Support Vector Machine, and none for Naive Bayes. All three models therefore show good potential for lung cancer prediction. Selecting the best model, however, requires weighing the evaluation metrics relevant to the application against the limitations of each model. Further evaluation and in-depth analysis are needed to confirm that the models predict lung cancer cases accurately and consistently.
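
As an illustration of the metrics named in the abstract, the sketch below computes accuracy and the support-weighted precision, recall, and F1 from a list of true and predicted labels, and flags box-plot outliers with the usual 1.5 × IQR whisker rule. It is a minimal standard-library sketch, not the authors' code: the function names, the class labels, and the score values are hypothetical, and the abstract does not state which values the box plots were drawn over, so the outlier helper simply accepts any list of numbers.

```python
from collections import Counter
import statistics


def weighted_scores(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall, weighted F1).

    Each per-class score is weighted by that class's share of the true labels.
    """
    n = len(y_true)
    support = Counter(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n  # classes absent from y_true get weight 0
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return accuracy, precision, recall, f1


def boxplot_outliers(values, whisker=1.5):
    """Return the values lying beyond the box-plot whiskers (1.5 x IQR rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical labels and predictions, for illustration only (not study data).
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "medium", "medium", "medium", "high", "high"]
acc, prec, rec, f1 = weighted_scores(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")

# Hypothetical scores, for illustration only (not study data).
scores = [0.90, 0.91, 0.92, 0.93, 0.94, 0.60]
print(boxplot_outliers(scores))
```

Note that weighted recall always equals accuracy, because weighting each class's recall by its support just counts the correctly classified samples. This is one reason the abstract's comparison between models turns mainly on weighted precision and F1.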

Published
2023-07-31
How to Cite
Dewi Widyawati, Amaliah Faradibah, & Lestari Lokapitasari Belluano, P. (2023). Comparison Analysis of Classification Model Performance in Lung Cancer Prediction Using Decision Tree, Naive Bayes, and Support Vector Machine. Indonesian Journal of Data and Science, 4(2), 78-86. https://doi.org/10.56705/ijodas.v4i2.76