Predictive Modeling of Air Quality Levels Using Decision Tree Classification: Insights from Environmental and Demographic Factors

  • I Gede Iwan Sudipa, Institut Bisnis dan Teknologi Indonesia
  • Muhammad Habibi, Universitas Jenderal Achmad Yani Yogyakarta
  • Ery Setiyawan Jullev Atmadji, Politeknik Negeri Jember (ORCID: https://orcid.org/0000-0003-0989-4883)
  • Ika Arfiani, Universitas Ahmad Dahlan

Keywords: Air Quality, Decision Tree, Environmental Factors, Machine Learning, Public Policy

Abstract

Air pollution poses a significant global challenge, adversely impacting public health and environmental sustainability. Understanding the factors influencing air quality is essential for developing effective mitigation strategies. This study aims to analyse key environmental and demographic factors, such as PM2.5 concentration, population density, and proximity to industrial areas, to predict air quality levels using a Decision Tree model. The dataset, comprising 5000 samples, was pre-processed by encoding the target variable and applying Z-score normalization to numerical features. The model was trained on 80% of the data and evaluated on the remaining 20%, achieving an accuracy of 93%. Evaluation metrics, including a classification report and confusion matrix, demonstrated the model's effectiveness in distinguishing between four air quality categories: Good, Moderate, Poor, and Hazardous. PM2.5 emerged as the most critical predictor, followed by demographic and industrial factors. These findings underscore the potential of machine learning models in providing actionable insights for air quality management. The results contribute to public policy by highlighting the need for targeted interventions in high-risk areas and the importance of incorporating environmental data into urban planning. Future work should focus on expanding the feature set and exploring ensemble techniques to further enhance predictive accuracy and robustness.
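
The workflow summarized in the abstract (target encoding, Z-score normalization, an 80/20 split, a decision tree, and evaluation by accuracy and confusion matrix) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the synthetic data generator, the feature ranges, the labeling rule, the Gini split criterion, the depth limit of 5, and the choice to fit the normalization statistics on the training split only are all assumptions made for illustration, and the numbers it prints say nothing about the reported 93% accuracy.

```python
import random
from collections import Counter
from statistics import mean, pstdev

LABELS = ["Good", "Moderate", "Poor", "Hazardous"]  # categories named in the abstract

def make_synthetic(n, seed=0):
    """Hypothetical stand-in data (NOT the study's dataset)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        pm25 = rng.uniform(0, 200)        # assumed range
        density = rng.uniform(100, 1000)  # assumed range
        proximity = rng.uniform(1, 25)    # assumed range, km to industry
        score = pm25 + 0.05 * density - 2 * proximity + rng.gauss(0, 5)
        labels.append(LABELS[min(3, max(0, int(score // 50)))])  # assumed rule
        rows.append([pm25, density, proximity])
    return rows, labels

def fit_zscore(rows):
    """Per-column mean and population standard deviation."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [pstdev(c) or 1.0 for c in cols]

def apply_zscore(rows, mus, sds):
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def gini(ys):
    n = len(ys)
    return 1.0 - sum((c / n) ** 2 for c in Counter(ys).values())

def best_split(X, y):
    """Exhaustive search for the (feature, threshold) with lowest weighted Gini."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(r[f] for r in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for r, yi in zip(X, y) if r[f] <= t]
            right = [yi for r, yi in zip(X, y) if r[f] > t]
            w = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or w < best[0]:
                best = (w, f, t)
    return best

def build(X, y, depth=0, max_depth=5):
    """Return a leaf (int class) or a node tuple (feature, threshold, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority
    split = best_split(X, y)
    if split is None:
        return majority
    _, f, t = split
    left = [(r, yi) for r, yi in zip(X, y) if r[f] <= t]
    right = [(r, yi) for r, yi in zip(X, y) if r[f] > t]
    if not left or not right:
        return majority
    return (f, t, build(*zip(*left), depth + 1, max_depth),
            build(*zip(*right), depth + 1, max_depth))

def predict(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def confusion_matrix(y_true, y_pred):
    """Rows are true classes, columns are predicted classes."""
    m = [[0] * len(LABELS) for _ in LABELS]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

X, labels = make_synthetic(500)
y = [LABELS.index(name) for name in labels]  # encode the target as integers

order = list(range(len(X)))
random.Random(42).shuffle(order)
cut = int(0.8 * len(X))                      # 80% train, 20% test
train_idx, test_idx = order[:cut], order[cut:]

mus, sds = fit_zscore([X[i] for i in train_idx])
X_train = apply_zscore([X[i] for i in train_idx], mus, sds)
X_test = apply_zscore([X[i] for i in test_idx], mus, sds)
y_train = [y[i] for i in train_idx]
y_test = [y[i] for i in test_idx]

tree = build(X_train, y_train)
y_pred = [predict(tree, row) for row in X_test]
accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
cm = confusion_matrix(y_test, y_pred)

print(f"accuracy on held-out 20%: {accuracy:.2f}")
for name, row in zip(LABELS, cm):
    print(f"{name:>10}: {row}")
```

In practice a library implementation would normally replace the hand-written tree; the sketch only makes the individual steps explicit. Because tree splits depend on the ordering of feature values rather than their scale, the normalization step does not change which splits are available here.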

Published

2024-12-31

How to Cite

Iwan Sudipa, I. G., Habibi, M., Jullev Atmadji, E. S., & Arfiani, I. (2024). Predictive Modeling of Air Quality Levels Using Decision Tree Classification: Insights from Environmental and Demographic Factors. Indonesian Journal of Data and Science, 5(3), 251-258. https://doi.org/10.56705/ijodas.v5i3.201