Performance Metrics of AdaBoost and Random Forest in Multi-Class Eye Disease Identification: An Imbalanced Dataset Approach

  • Thomas Edyson Tarigan, Universitas Teknologi Digital Indonesia
  • Erma Susanti, Institut Sains & Teknologi AKPRIND
  • M. Ikbal Siami, Institut Teknologi dan Bisnis STIKOM Ambon
  • Ika Arfiani, Universitas Ahmad Dahlan
  • Agus Aan Jiwa Permana, Universitas Pendidikan Ganesha
  • I Made Sunia Raharja, Universitas Udayana

Keywords: AdaBoost, Random Forest Classifier, Eye Disease Classification, Machine Learning, Imbalanced Dataset, Medical Image Analysis, Canny Edge Detection, Hu Moments, Diagnostic Accuracy

Abstract

This study evaluates the AdaBoost and Random Forest Classifier algorithms for multi-class eye disease classification on an imbalanced dataset. The dataset covers four classes: Cataract, Diabetic Retinopathy, Glaucoma, and Normal. Distinguishing among these conditions poses significant diagnostic challenges, and machine learning offers a promising way to improve diagnostic accuracy. The images are segmented with Canny edge detection, and Hu Moments are extracted as features, which together provide the basis for the comparative analysis. Performance is assessed with 5-fold cross-validation, using accuracy, precision, recall, and F1-score as the key metrics. The results indicate that the Random Forest Classifier outperforms AdaBoost across these metrics, although its overall performance is only moderate. This finding highlights both the potential and the limitations of these machine learning techniques for medical image analysis, particularly on imbalanced datasets. The study contributes insight into how well different machine learning algorithms handle the complexities of medical image classification. Future research should explore a wider range of image processing techniques, evaluate other machine learning models, and extend the study to more eye diseases. The findings can guide the selection of machine learning tools for medical diagnostics and highlight the need for continued improvement of automated systems to enhance patient care.
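To illustrate why the abstract reports precision, recall, and F1-score alongside accuracy, here is a minimal Python sketch of those metrics on an imbalanced four-class problem. It is not the study's code. The labels are made up, the function names are hypothetical, and macro averaging is an assumption, since the abstract does not state how per-class scores were combined.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)}, computed one-vs-rest."""
    metrics = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Define a score as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def macro_average(metrics):
    """Unweighted mean of (precision, recall, f1) over all classes."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (an assumption, not the study's data): "normal" dominates.
labels = ["cataract", "diabetic_retinopathy", "glaucoma", "normal"]
y_true = ["normal"] * 7 + ["cataract", "glaucoma", "diabetic_retinopathy"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["normal"] * 10

acc = accuracy(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(
    per_class_metrics(y_true, y_pred, labels)
)
print(f"accuracy={acc:.3f} macro_precision={macro_p:.3f} "
      f"macro_recall={macro_r:.3f} macro_f1={macro_f1:.3f}")
```

In this toy case the majority-class predictor scores well on accuracy while its macro-averaged recall and F1 stay low, because the three minority classes are never predicted. Averaging per class makes that failure visible in a way accuracy alone does not.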

Published: 2023-11-30