Improving Part-of-Speech Tagging with Relative Positional Encoding in Transformer Models and Basic Rules
DOI: https://doi.org/10.56705/ijodas.v6i1.184
Keywords: Corrective rule-based module, NLP, Part-of-Speech tagging, Self-attention, Transformer, Word embedding
Abstract
Introduction: Part-of-speech (POS) tagging plays a pivotal role in natural language processing (NLP) tasks such as semantic parsing and machine translation. However, challenges with ambiguous and unknown words, along with limitations of absolute positional encoding in transformers, often affect tagging accuracy. This study proposes an enhanced POS tagging model integrating relative positional encoding and a rule-based correction module.
Methods: The model utilizes a transformer-based architecture equipped with relative positional encoding to better capture token dependencies. Word embeddings, POS tag embeddings, and relative position embeddings are combined and processed through a multi-head attention mechanism. Following the initial classification by the transformer, a corrective rule-based module is applied to refine misclassified tokens. The approach was evaluated using the Groningen Meaning Bank (GMB) dataset, comprising over 1.3 million tokens.
Results: The transformer model achieved an accuracy of 98.50% prior to rule-based corrections. After applying the rule-based module, overall accuracy increased to 99.68%, outperforming a comparable model using absolute positional encoding (98.60%). Additional evaluation metrics, including a precision of 0.92, recall of 0.89, and F1-score of 0.90, further validate the model’s effectiveness.
Conclusions: Incorporating relative positional encoding significantly enhances the transformer’s contextual understanding and performance in POS tagging. The addition of a rule-based correction module improves classification accuracy, especially for linguistically ambiguous tokens. The proposed hybrid model demonstrates robust performance and adaptability, offering a promising direction for future multilingual POS tagging systems.
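The two-stage pipeline described in the Methods can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the attention step uses one common variant of relative positional encoding (a learned embedding of the clipped offset j - i added to each key, in the style of Shaw et al., 2018), with a single head and no query/key/value projections, and the two corrective rules are hypothetical examples using Penn Treebank-style tag names. The names `relative_index`, `relative_self_attention`, and `apply_rules` are illustrative and do not come from the paper.

```python
import math
import random


def relative_index(i, j, max_dist):
    """Clip the signed offset j - i to [-max_dist, max_dist], then shift to 0..2*max_dist."""
    return max(-max_dist, min(max_dist, j - i)) + max_dist


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relative_self_attention(x, rel_emb, max_dist):
    """Single-head self-attention with relative position embeddings added to the keys.

    x       : list of n token vectors (assumed to already combine word and tag embeddings)
    rel_emb : list of 2*max_dist + 1 vectors, one per clipped relative offset
    Projections are omitted (identity) to keep the sketch short.
    """
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        scores = []
        for j in range(n):
            r = rel_emb[relative_index(i, j, max_dist)]
            key = [a + b for a, b in zip(x[j], r)]
            score = sum(q * k for q, k in zip(x[i], key)) / math.sqrt(d)
            scores.append(score)
        weights = softmax(scores)
        out.append([sum(w * x[j][k] for j, w in enumerate(weights)) for k in range(d)])
    return out


def apply_rules(tokens, tags):
    """Post-hoc correction with two hypothetical rules (illustration only)."""
    fixed = list(tags)
    for i, (tok, tag) in enumerate(zip(tokens, tags)):
        if tok.isdigit():
            # Rule 1 (assumed): an all-digit token is a cardinal number.
            fixed[i] = "CD"
        elif tag == "VB" and i > 0 and fixed[i - 1] == "DT":
            # Rule 2 (assumed): a base-form verb directly after a determiner is retagged as a noun.
            fixed[i] = "NN"
    return fixed


random.seed(0)
d, max_dist = 4, 2
tokens = ["The", "run", "took", "40", "minutes"]
x = [[random.gauss(0, 1) for _ in range(d)] for _ in tokens]
rel_emb = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(2 * max_dist + 1)]
context = relative_self_attention(x, rel_emb, max_dist)

predicted = ["DT", "VB", "VBD", "NN", "NNS"]  # stand-in for the classifier's output
corrected = apply_rules(tokens, predicted)
print(corrected)  # ['DT', 'NN', 'VBD', 'CD', 'NNS']
```

Because the attention score depends on the clipped offset rather than on absolute indices, the same embedding is reused for any pair of tokens at the same distance, which is the property that distinguishes relative from absolute positional encoding. In a full model the contextual vectors would feed a tag classifier, and the rule pass would run on its predictions.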
License
Authors retain copyright and full publishing rights to their articles. Upon acceptance, authors grant Indonesian Journal of Data and Science a non-exclusive license to publish the work and to identify itself as the original publisher.
Self-archiving. Authors may deposit the submitted version, accepted manuscript, and version of record in institutional or subject repositories, with citation to the published article and a link to the version of record on the journal website.
Commercial permissions. Uses intended for commercial advantage or monetary compensation are not permitted under CC BY-NC 4.0. For permissions, contact the editorial office at ijodas.journal@gmail.com.
Legacy notice. Some earlier PDFs may display “Copyright © [Journal Name]” or only a CC BY-NC logo without the full license text. To ensure clarity, the authors maintain copyright, and all articles are distributed under CC BY-NC 4.0. Where any discrepancy exists, this policy and the article landing-page license statement prevail.










