Automated Classification of COVID-19 Chest X-ray Images Using Ensemble Machine Learning Methods

  • A. Sinra, UPTP. Wil. Makassar 1 Badan Pendapatan Daerah Prov. Sulsel
  • Husni Angriani, STMIK KHARISMA Makassar

Keywords: COVID-19, Chest X-ray, Random Forest Classifier, Ensemble Methods, Image Classification, Diagnostic Accuracy

Abstract

This study evaluates ensemble machine learning for classifying chest X-ray images into three categories: Normal, COVID-19, and Lung Opacity. Using a Random Forest Classifier evaluated with 5-fold cross-validation, we aimed to improve diagnostic accuracy for rapid and reliable COVID-19 detection, one of the most urgent medical challenges today. The classifier reached an average accuracy of 51%, with precision and recall varying across folds. The F1-score stayed at around 35% in every fold, indicating that precision and recall need to be better balanced. Visualizations, including per-fold performance metric trends and a confusion matrix, gave further insight into the classifier's behavior and showed a notable degree of misclassification. Although the automated classification achieved only moderate success, the results illustrate how difficult it is to apply machine learning to medical imaging, especially when differentiating between diseases with overlapping radiographic features. The findings point to the potential of machine learning models to support diagnostic processes and suggest that advanced pre-processing techniques and larger datasets are needed for model training. The research contributes to the growing body of knowledge in computational diagnostics and underscores the importance of developing robust, accurate machine learning tools to aid in the global healthcare crisis precipitated by the pandemic.
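
The gap between accuracy and F1-score reported above can be illustrated with a small worked example. The sketch below derives accuracy and macro-averaged precision, recall, and F1 from a three-class confusion matrix. The counts are hypothetical and chosen only for illustration; they are not the study's results. Macro averaging is also an assumption, since the abstract does not state which averaging scheme was used.

```python
from typing import Dict, List

LABELS = ["Normal", "COVID-19", "Lung Opacity"]

# HYPOTHETICAL counts for illustration only (not taken from the study).
# Rows are the true class, columns are the predicted class.
CONFUSION: List[List[int]] = [
    [80, 10, 10],  # true Normal
    [30, 5, 5],    # true COVID-19
    [40, 10, 10],  # true Lung Opacity
]


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """Return precision, recall, and F1 for each class of a square matrix."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, true otherwise
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


def summarize(cm: List[List[int]]) -> Dict[str, float]:
    """Return overall accuracy and macro-averaged (unweighted mean) metrics."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    per_class = per_class_metrics(cm)

    def macro(key: str) -> float:
        return sum(m[key] for m in per_class) / len(per_class)

    return {
        "accuracy": correct / total,
        "macro_precision": macro("precision"),
        "macro_recall": macro("recall"),
        "macro_f1": macro("f1"),
    }


if __name__ == "__main__":
    for label, m in zip(LABELS, per_class_metrics(CONFUSION)):
        print(f"{label:13s} P={m['precision']:.3f} "
              f"R={m['recall']:.3f} F1={m['f1']:.3f}")
    for name, value in summarize(CONFUSION).items():
        print(f"{name}: {value:.3f}")
```

In this hypothetical matrix most predictions fall into the majority class, so accuracy stays moderate while the two minority classes have low recall and pull the macro F1 well below it. This is one plausible reading of an accuracy near 51% alongside an F1-score near 35%, not a confirmed explanation of the study's numbers.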

Published
2024-03-31
How to Cite
Sinra, A., & Husni Angriani. (2024). Automated Classification of COVID-19 Chest X-ray Images Using Ensemble Machine Learning Methods. Indonesian Journal of Data and Science, 5(1), 45-53. https://doi.org/10.56705/ijodas.v5i1.127