Classification of Mushroom Edibility Using K-Nearest Neighbors: A Machine Learning Approach
Abstract
This study applies the K-Nearest Neighbors (KNN) algorithm to the binary classification of mushroom edibility, using a cleaned version of the UCI Mushroom Dataset. Pre-processing included mode imputation of missing values, one-hot encoding, z-score normalization, and feature selection. The model was trained on 80% of the dataset and evaluated on the remaining 20%, achieving an overall accuracy of 99%. Precision, recall, and F1-score confirmed that the model separates edible from poisonous mushrooms with few misclassifications. Despite this performance, the study identifies scalability as a limitation: KNN's prediction cost grows with the size of the training set, so future research should explore more efficient alternative algorithms. The results underscore the importance of pre-processing and hyperparameter optimization in building reliable classification models for food safety applications.
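The workflow summarized in the abstract can be illustrated with a minimal sketch in standard-library Python. Everything specific here is an assumption made for illustration and is not taken from the study: the toy two-feature data produced by the hypothetical `make_toy_data` (it is not the UCI Mushroom Dataset), the choice of `k=3`, Euclidean distance, and fitting the imputation, encoding, and scaling statistics on the training split only. Feature selection is omitted, and the numbers it prints say nothing about the reported 99% accuracy.

```python
import math
import random
from collections import Counter

def make_toy_data(n=90):
    """Hypothetical stand-in for the mushroom table: [cap_color, odor] rows plus labels."""
    colors, odors = ["brown", "white", "red"], ["none", "almond", "foul"]
    rows, labels = [], []
    for i in range(n):
        odor = odors[(i // 3) % 3]
        # Blank out every tenth cap color so the imputation step has work to do.
        rows.append([None if i % 10 == 0 else colors[i % 3], odor])
        labels.append("poisonous" if odor == "foul" else "edible")
    return rows, labels

def fit_modes(rows):
    """Most frequent non-missing value of each column."""
    return [Counter(v for v in col if v is not None).most_common(1)[0][0] for col in zip(*rows)]

def impute(rows, modes):
    return [[m if v is None else v for v, m in zip(row, modes)] for row in rows]

def fit_categories(rows):
    return [sorted(set(col)) for col in zip(*rows)]

def one_hot(rows, categories):
    """One 0/1 column per (feature, category) pair; unseen categories encode as all zeros."""
    return [[1.0 if v == c else 0.0 for v, cats in zip(row, categories) for c in cats]
            for row in rows]

def fit_scaler(X):
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(zip(*X), means)]
    return means, stds

def z_score(X, means, stds):
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def knn_predict(X_train, y_train, x, k=3):
    """Majority label among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip((math.dist(x, t) for t in X_train), y_train))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def precision_recall_f1(y_true, y_pred, positive="poisonous"):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def run_pipeline(k=3, seed=0):
    rows, labels = make_toy_data()
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = len(idx) * 4 // 5  # 80% train, 20% test
    train, test = idx[:cut], idx[cut:]
    y_train, y_test = [labels[i] for i in train], [labels[i] for i in test]

    # Assumption: fit every pre-processing statistic on the training split only,
    # then reuse it unchanged on the test split.
    modes = fit_modes([rows[i] for i in train])
    raw_train = impute([rows[i] for i in train], modes)
    raw_test = impute([rows[i] for i in test], modes)
    cats = fit_categories(raw_train)
    means, stds = fit_scaler(one_hot(raw_train, cats))
    X_train = z_score(one_hot(raw_train, cats), means, stds)
    X_test = z_score(one_hot(raw_test, cats), means, stds)

    y_pred = [knn_predict(X_train, y_train, x, k) for x in X_test]
    accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
    precision, recall, f1 = precision_recall_f1(y_test, y_pred)
    return {"n_train": len(train), "n_test": len(test), "accuracy": accuracy,
            "precision": precision, "recall": recall, "f1": f1}

if __name__ == "__main__":
    print(run_pipeline())
```

Note that `knn_predict` scans every training row for each query, which is the scalability cost the abstract points to: prediction time grows with the size of the stored training set.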

Copyright (c) 2024 Indonesian Journal of Data and Science

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.