Drug Recommendation Using Multilabel Classification with Decision Tree Based on Patient Complaints and Diagnoses

Muh Aristsyah Malik; Harlinda; Herdianti Darwis

doi:10.56705/ijodas.v7i1.397

Authors

Muh Aristsyah Malik Universitas Muslim Indonesia
Harlinda Universitas Muslim Indonesia
Herdianti Darwis Universitas Muslim Indonesia

Keywords:

Multilabel Classification, Decision Tree, Drug Recommendation System, Electronic Medical Records, Clinical Decision Support

Abstract

This study develops a drug recommendation system that applies multilabel classification with the Decision Tree algorithm to patient complaint and diagnosis data from electronic medical records. The dataset consists of patient visit records from a community health center in Pangkajene and Kepulauan Regency, transformed using multi-hot encoding. Model performance is evaluated under three dataset scenarios (N=500, N=800, and N=1000) using multilabel metrics: Micro-F1, Samples-F1, Hamming Loss, Jaccard Similarity, Hit@5, Precision@K, and Recall@K. The best Decision Tree model achieved a Micro-F1 of 0.292, a Samples-F1 of 0.281, and a Hit@5 of 0.690 in the N=1000 scenario. Bootstrap validation with 1000 iterations indicates relatively stable performance, with narrow confidence intervals across the evaluation metrics. These results suggest that the multilabel Decision Tree model can capture relationships between patient complaints, diagnoses, and drug therapies while maintaining an interpretable decision structure.
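To make the encoding and the evaluation metrics named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' code: the drug names, the toy predictions, the scores, the function names, and the exact metric conventions (for example, scoring an empty prediction as 0 and using a percentile interval for the bootstrap) are all assumptions made for illustration. Model training is left out; the sketch only shows how multi-hot targets and the listed metrics could be computed.

```python
import random

def multi_hot(label_sets, vocabulary):
    """Turn each set of labels into a 0/1 vector over a fixed vocabulary."""
    index = {label: i for i, label in enumerate(vocabulary)}
    rows = []
    for labels in label_sets:
        row = [0] * len(vocabulary)
        for label in labels:
            row[index[label]] = 1
        rows.append(row)
    return rows

def micro_f1(y_true, y_pred):
    """F1 from TP/FP/FN pooled over every (visit, drug) slot."""
    pairs = [(t, p) for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr)]
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if p and not t)
    fn = sum(1 for t, p in pairs if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def samples_f1(y_true, y_pred):
    """F1 computed per visit, then averaged over visits."""
    scores = []
    for tr, pr in zip(y_true, y_pred):
        tp = sum(t * p for t, p in zip(tr, pr))
        denom = sum(tr) + sum(pr)
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def hamming_loss(y_true, y_pred):
    """Fraction of (visit, drug) slots that are predicted incorrectly."""
    wrong = sum(t != p for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr))
    return wrong / (len(y_true) * len(y_true[0]))

def jaccard(y_true, y_pred):
    """Per-visit |intersection| / |union|, averaged over visits."""
    scores = []
    for tr, pr in zip(y_true, y_pred):
        inter = sum(t * p for t, p in zip(tr, pr))
        union = sum(1 for t, p in zip(tr, pr) if t or p)
        scores.append(inter / union if union else 0.0)
    return sum(scores) / len(scores)

def ranking_at_k(y_true, y_score, k):
    """Return (Hit@K, Precision@K, Recall@K), each averaged over visits."""
    hit = prec = rec = 0.0
    for tr, sc in zip(y_true, y_score):
        top = sorted(range(len(sc)), key=lambda i: sc[i], reverse=True)[:k]
        found = sum(tr[i] for i in top)
        hit += 1.0 if found else 0.0
        prec += found / k
        rec += found / sum(tr) if sum(tr) else 0.0
    n = len(y_true)
    return hit / n, prec / n, rec / n

def bootstrap_ci(metric, y_true, y_pred, n_iter=1000, alpha=0.05, seed=0):
    """Percentile interval from resampling visits with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_iter)]
    hi = stats[min(n_iter - 1, int((1 - alpha / 2) * n_iter))]
    return lo, hi

# Toy data (hypothetical, not taken from the study's records).
DRUGS = ["paracetamol", "amoxicillin", "antacid", "cetirizine"]
Y_true = multi_hot([{"paracetamol", "amoxicillin"}, {"antacid"},
                    {"paracetamol", "cetirizine"}], DRUGS)
Y_pred = multi_hot([{"paracetamol"}, {"antacid", "paracetamol"},
                    {"paracetamol", "cetirizine"}], DRUGS)
# Hypothetical per-drug scores, e.g. class probabilities from a classifier.
Y_score = [[0.9, 0.4, 0.1, 0.2], [0.6, 0.5, 0.1, 0.2], [0.3, 0.35, 0.2, 0.8]]

print("Micro-F1:", micro_f1(Y_true, Y_pred))
print("Samples-F1:", samples_f1(Y_true, Y_pred))
print("Hamming Loss:", hamming_loss(Y_true, Y_pred))
print("Jaccard:", jaccard(Y_true, Y_pred))
print("Hit@2, P@2, R@2:", ranking_at_k(Y_true, Y_score, 2))
print("95% CI for Micro-F1:", bootstrap_ci(micro_f1, Y_true, Y_pred))
```

The set-based metrics compare thresholded 0/1 predictions, while Hit@K, Precision@K, and Recall@K rank drugs by score, which is why a model with a modest Micro-F1 can still place at least one correct drug among its top suggestions for many visits. With only three toy visits the bootstrap interval is very coarse; it is included only to show the resampling mechanics.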
