Improving Part-of-Speech Tagging with Relative Positional Encoding in Transformer Models and Basic Rules

Abdukarim Mohammad; Mohammed Abdullahi; Jerome Aondongu Achir

doi:10.56705/ijodas.v6i1.184

Authors

Abdukareem Mohammad, Ahmadu Bello University
Mohammed Abdullahi, Ahmadu Bello University
Jerome Aondongu Achir, Joseph Sarwuan Tarka University, Makurdi

DOI:

https://doi.org/10.56705/ijodas.v6i1.184

Keywords:

Corrective rule-based Module, NLP, Part-of-Speech tagging, Self-attention, Transformer, Word Embedding

Abstract

Introduction: Part-of-speech (POS) tagging plays a pivotal role in natural language processing (NLP) tasks such as semantic parsing and machine translation. However, challenges with ambiguous and unknown words, along with limitations of absolute positional encoding in transformers, often affect tagging accuracy. This study proposes an enhanced POS tagging model integrating relative positional encoding and a rule-based correction module.

Methods: The model utilizes a transformer-based architecture equipped with relative positional encoding to better capture token dependencies. Word embeddings, POS tag embeddings, and relative position embeddings are combined and processed through a multi-head attention mechanism. Following the initial classification by the transformer, a corrective rule-based module is applied to refine misclassified tokens. The approach was evaluated using the Groningen Meaning Bank (GMB) dataset, comprising over 1.3 million tokens.

Results: The transformer model achieved an accuracy of 98.50% prior to rule-based corrections. After applying the rule-based module, overall accuracy increased to 99.68%, outperforming a comparable model using absolute positional encoding (98.60%). Additional evaluation metrics, including a precision of 0.92, recall of 0.89, and F1-score of 0.90, further validate the model’s effectiveness.

Conclusions: Incorporating relative positional encoding significantly enhances the transformer’s contextual understanding and performance in POS tagging. The addition of a rule-based correction module improves classification accuracy, especially for linguistically ambiguous tokens. The proposed hybrid model demonstrates robust performance and adaptability, offering a promising direction for future multilingual POS tagging systems.
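
The two components named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below makes assumptions: the attention score adds a learned embedding of the clipped offset j - i to each key before the dot product (one common way to realize relative positional encoding), and the correction module is modeled as a list of functions that may override a predicted tag. The function names, the toy vectors, and the example rule are hypothetical and are not taken from the paper.

```python
import math

def clip(distance, max_dist):
    """Clamp a signed token offset to the range [-max_dist, max_dist]."""
    return max(-max_dist, min(max_dist, distance))

def relative_attention_weights(queries, keys, rel_emb, max_dist):
    """Softmax attention weights with a relative-position term added to each key.

    queries, keys: lists of n vectors (lists of floats) of dimension d.
    rel_emb: dict mapping a clipped offset (j - i) to a d-dimensional vector.
    """
    d = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            a = rel_emb[clip(j - i, max_dist)]
            # Assumed score: e_ij = q_i . (k_j + a_ij) / sqrt(d)
            dot = sum(qv * (kv + av) for qv, kv, av in zip(q, k, a))
            scores.append(dot / math.sqrt(d))
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

def apply_correction_rules(tokens, tags, rules):
    """Apply each rule at every position; a rule returns a new tag or None."""
    corrected = list(tags)
    for i in range(len(tokens)):
        for rule in rules:
            new_tag = rule(tokens, corrected, i)
            if new_tag is not None:
                corrected[i] = new_tag
    return corrected

def noun_after_determiner(tokens, tags, i):
    """Hypothetical rule: a base-form verb right after a determiner becomes a noun."""
    if i > 0 and tags[i - 1] == "DT" and tags[i] == "VB":
        return "NN"
    return None

# Toy inputs: three tokens with 2-dimensional vectors (illustrative values only).
queries = keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rel_emb = {offset: [0.1 * offset, 0.0] for offset in range(-1, 2)}
weights = relative_attention_weights(queries, keys, rel_emb, max_dist=1)

# Pretend the tagger labeled "run" as a verb; the rule repairs it afterwards.
tags = apply_correction_rules(
    ["the", "run", "ended"], ["DT", "VB", "VBD"], [noun_after_determiner]
)
```

Because the relative term depends only on the offset between two tokens, the same embedding is reused wherever that offset occurs in a sentence. The correction step runs after classification, so in this sketch it can change tags but never alters the attention computation.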

