Evaluating Machine Learning Approaches: A Comparative Study of Random Forest and Neural Networks in Grade Classification

Subitha Sivakumar; Sivakumar Venkataraman

doi:10.56705/ijodas.v6i1.240

Authors

Subitha Sivakumar, New Era College of Arts, Science & Technology
Sivakumar Venkataraman, Botho University

Keywords:

Data Pre-processing, Educational Data Mining, Grade Classification, Hyperparameter Tuning, Model Evaluation, Neural Networks, Random Forest

Abstract

Introduction: Accurate grade classification in education is essential for early intervention and performance assessment. This study presents a comparative analysis of Random Forest and Neural Networks in classifying student grades using a dataset of 2,392 high school students. The aim is to evaluate both models’ predictive performance and interpretability in an educational data mining context.

Methods: The dataset, containing academic and demographic features, was pre-processed by handling missing values, encoding categorical variables, and scaling numerical features. Grades were categorized into five classes: A, B, C, D, and F. Both models were implemented using Python and evaluated with metrics including accuracy, precision, recall, and F1-score. Hyperparameter tuning was performed via Grid Search with cross-validation to optimize performance.

Results: The Random Forest model achieved a baseline accuracy of 70.2%, outperforming Neural Networks at 69.1%. After tuning, Random Forest improved to 71.45% accuracy, while Neural Networks reached 70.49%. Both models demonstrated strong precision and recall in identifying failing students (class F), with F1-scores of 0.90 and 0.89, respectively. However, classification of mid-range grades (A to D) remained challenging due to class overlap. Feature importance analysis highlighted interpretability advantages in the Random Forest model.

Conclusions: Both models are effective for grade classification, with Random Forest offering slightly better accuracy and interpretability. Neural Networks, while slightly less accurate, capture nonlinear relationships effectively post-tuning. The results suggest that model selection should be guided by context-specific needs, balancing performance with transparency. Future work may include ensemble techniques and expanded feature sets to improve classification robustness.
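The evaluation metrics named in the abstract (accuracy, and per-class precision, recall, and F1-score) can be illustrated with a minimal sketch. It assumes one-vs-rest counting over the five grade classes and uses only the Python standard library. The function names `per_class_metrics` and `accuracy` and the toy labels are hypothetical, introduced here for illustration. They are not the authors' code or data, and the abstract does not state which library the study used.

```python
from collections import Counter

GRADES = ["A", "B", "C", "D", "F"]  # the five classes named in the abstract


def per_class_metrics(y_true, y_pred, labels=GRADES):
    """Return {label: (precision, recall, f1)} using one-vs-rest counts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    metrics = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels made up for illustration only (NOT data from the study).
y_true = ["F", "F", "F", "F", "D", "D", "C", "C", "B", "A"]
y_pred = ["F", "F", "F", "D", "D", "C", "C", "B", "B", "B"]

scores = per_class_metrics(y_true, y_pred)
for grade in GRADES:
    p, r, f1 = scores[grade]
    print(f"{grade}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy(y_true, y_pred):.2f}")
```

In this toy example, errors fall mostly between adjacent grades, so class F keeps a high F1-score while the A to D classes score lower. This is the same qualitative pattern the abstract attributes to class overlap, although the numbers here are invented and carry no evidential weight.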
