Predicting Plant Growth Stages Using Random Forest Classifier: A Machine Learning Approach
Abstract
Predictive modelling of plant growth is an important part of modern agricultural practice. This study applies a Random Forest Classifier to predict plant growth stages from environmental and management factors. The dataset, sourced from Kaggle, includes soil type, sunlight hours, water frequency, fertilizer type, temperature, and humidity. Pre-processing consists of encoding the categorical variables, scaling the data, and splitting it into training (80%) and testing (20%) sets. The classifier is evaluated with 5-fold cross-validation using accuracy, precision, recall, and F1-score. It achieves an average accuracy of 84.27%, precision of 85.59%, recall of 84.27%, and F1-score of 83.98%. Correlation heatmaps, PCA plots, t-SNE plots, and violin plots are used to examine the data structure and the relationships between features. The results support the hypothesis that machine learning can effectively predict plant growth stages, which has practical implications for precision agriculture: with accurate growth-stage identification, farmers and greenhouse managers can better allocate resources and adjust management practices, which may improve crop yields and sustainability. The study is limited by its reliance on a single, specific dataset and on the Random Forest Classifier alone. Future research should evaluate additional machine learning models and more diverse datasets to improve generalizability. The findings contribute to the growing body of knowledge on machine learning in agriculture and point to practical ways of improving agricultural productivity.
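The reported accuracy and recall are identical (84.27%), which would be consistent with support-weighted averaging of the per-class scores, since weighted recall always equals accuracy. That averaging scheme is an assumption here; the abstract does not state how the metrics were averaged. The following minimal Python sketch computes the four metrics under that assumption. The function name `weighted_metrics`, the stage labels, and the example predictions are all hypothetical and are not taken from the study's data.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        # Per-class scores; a class that is never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        # Weight each class by its share of the true labels.
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Made-up labels for illustration only (assumed stage names, not the dataset's).
y_true = ["early", "early", "mid", "mid", "mid",
          "late", "late", "late", "late", "late"]
y_pred = ["early", "mid", "mid", "mid", "late",
          "late", "late", "late", "late", "mid"]

acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

On these toy labels the weighted recall matches the accuracy while the weighted precision differs, the same pattern as in the figures reported above.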
Copyright (c) 2024 Indonesian Journal of Data and Science
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.