Fine-Tuning a Large Language Model on Vertex AI for a New Student Registration Chatbot at Universitas Muhammadiyah Makassar
DOI: https://doi.org/10.56705/ijodas.v7i1.341

Keywords: Chatbot, Large Language Model (LLM), Fine-tuning, Google Cloud Vertex AI, BLEU, ROUGE-L, Customer Satisfaction Score (CSAT), Student Admission

Abstract
This study addresses the limitations of manual admission services at Universitas Muhammadiyah Makassar, which often result in delayed and inconsistent information delivery. To overcome these challenges, an institution-specific chatbot was developed by fine-tuning the Gemini 2.5 Flash model on the Google Cloud Vertex AI platform. The model was trained using a curated domain-specific dataset of 1,430 question–answer pairs derived from official documents and frequently asked questions. The fine-tuning process employed supervised learning to enhance contextual relevance and response accuracy. System performance was evaluated using automated text quality metrics, achieving an average BLEU score of 0.23526 and a ROUGE-L Recall score of 0.53424, indicating satisfactory lexical and semantic similarity. Furthermore, a user acceptance evaluation involving 52 respondents yielded a Customer Satisfaction Score (CSAT) of 84.2%, reflecting high user satisfaction. These results demonstrate that fine-tuning a Large Language Model (LLM) for specific institutional needs effectively improves both response quality and service reliability. Ultimately, this approach offers a practical and scalable solution for modernizing student admission services in higher education, ensuring that prospective students receive accurate information in a timely and efficient manner.
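The BLEU and ROUGE-L Recall metrics reported above can be sketched in plain Python. This is a simplified, single-reference, sentence-level implementation (uniform n-gram weights, no smoothing), not the exact evaluation toolkit used in the study; in practice the averages reported would be computed over the full test set of question–answer pairs.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of 1..max_n n-gram
    precisions, multiplied by the brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c_counts = ngram_counts(cand, n)
        r_counts = ngram_counts(ref, n)
        overlap = sum(min(c, r_counts[g]) for g, c in c_counts.items())
        precisions.append(overlap / max(sum(c_counts.values()), 1))
    if min(precisions) == 0:          # unsmoothed: any zero precision => 0
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean

def rouge_l_recall(candidate, reference):
    """ROUGE-L recall: longest common subsequence length divided by
    the reference length, computed by dynamic programming."""
    c, r = candidate.split(), reference.split()
    dp = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i, cw in enumerate(c):
        for j, rw in enumerate(r):
            dp[i + 1][j + 1] = dp[i][j] + 1 if cw == rw else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(c)][len(r)] / max(len(r), 1)
```

A ROUGE-L Recall near 0.53, as reported, means roughly half of the reference answer's tokens appear in the chatbot's response in the same order, which is why the study pairs these lexical metrics with a user-facing CSAT survey.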
License
Copyright (c) 2026 Desi Anggreani, Muhyiddin A M Hayat, Ahmad Faisal

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors retain copyright and full publishing rights to their articles. Upon acceptance, authors grant Indonesian Journal of Data and Science a non-exclusive license to publish the work and to identify itself as the original publisher.
Self-archiving. Authors may deposit the submitted version, accepted manuscript, and version of record in institutional or subject repositories, with citation to the published article and a link to the version of record on the journal website.
Commercial permissions. Uses intended for commercial advantage or monetary compensation are not permitted under CC BY-NC 4.0. For permissions, contact the editorial office at ijodas.journal@gmail.com.
Legacy notice. Some earlier PDFs may display “Copyright © [Journal Name]” or only a CC BY-NC logo without the full license text. To ensure clarity, the authors maintain copyright, and all articles are distributed under CC BY-NC 4.0. Where any discrepancy exists, this policy and the article landing-page license statement prevail.