Classification of Lontara Script Using K-NN Algorithm, Decision Tree, and Random Forest Based on Hu Moments and Canny Segmentation

Berlian Septiani; Tasrif Hasanuddin; Wistiani Astuti

doi:10.56705/ijodas.v6i2.281

Authors

Berlian Septiani, Universitas Muslim Indonesia
Tasrif Hasanuddin, Universitas Muslim Indonesia
Wistiani Astuti, Universitas Muslim Indonesia

Keywords:

Aksara Lontara, K-Nearest Neighbors (KNN), Decision Tree, Random Forest, Hu Moments, Canny Segmentation

Abstract

The Lontara script is a traditional writing system of the Bugis-Makassar people of South Sulawesi, used to write the Bugis, Makassar, and Mandar languages. It is an abugida, in which each letter represents a consonant with an inherent vowel. The script was once used to record history, customary law, and literature, but its use has declined under the influence of the Latin alphabet. Today it is preserved through education and digitization as part of the cultural heritage of the Indonesian archipelago. This study classifies handwritten Lontara Bugis-Makassar characters. The character images are first collected into a dataset, then processed with Canny segmentation and Hu Moments feature extraction to obtain a shape representation that is invariant to rotation and scale. The resulting data are split into training and testing sets and classified using the K-NN, Decision Tree, and Random Forest algorithms. K-NN with 6 neighbors achieved the highest accuracy, precision, and recall, all at 98%. Decision Tree achieved an accuracy of 96.67%, precision of 96.22%, recall of 95.33%, and an F1-score of 95.98%. Random Forest achieved an accuracy of 96.67%, precision of 96.34%, recall of 96%, and an F1-score of 95.98%.
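The invariance property that motivates the Hu Moments features can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which library or parameters were used, the Canny step is omitted here, and the L-shaped test image and the names `hu_moments`, `shape`, `rotated`, `shifted`, and `scaled` are hypothetical, chosen only for illustration. The function computes the seven standard Hu invariants from normalized central moments using NumPy alone.

```python
import numpy as np


def hu_moments(img):
    """Return the seven Hu invariant moments of a 2-D binary or grayscale image."""
    img = np.asarray(img, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    # Centroid of the shape; subtracting it gives translation invariance.
    xbar = (xs * img).sum() / m00
    ybar = (ys * img).sum() / m00
    dx, dy = xs - xbar, ys - ybar

    def eta(p, q):
        # Normalized central moment of order (p, q); dividing by a power
        # of m00 removes the dependence on the overall size of the shape.
        mu = (dx ** p * dy ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])


# Hypothetical test image: an asymmetric L-shaped stroke (assumption, not real data).
shape = np.zeros((40, 40))
shape[5:30, 8:14] = 1    # vertical stroke
shape[24:30, 8:28] = 1   # horizontal stroke

rotated = np.rot90(shape)                         # 90-degree rotation
shifted = np.roll(shape, (7, 5), axis=(0, 1))     # translation (no wrap-around here)
scaled = np.kron(shape, np.ones((2, 2)))          # 2x upscaling by pixel replication

base = hu_moments(shape)
print("rotation :", np.allclose(base, hu_moments(rotated), rtol=1e-9, atol=1e-12))
print("shift    :", np.allclose(base, hu_moments(shifted), rtol=1e-9, atol=1e-12))
print("scale h1 :", base[0], hu_moments(scaled)[0])
```

On a pixel grid the invariance to 90-degree rotation and to translation is exact up to floating-point rounding, while invariance to scale holds only approximately because of discretization. In a pipeline like the one described above, each character image would yield one such seven-value feature vector, which is then passed to the classifiers.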
