Emotion Recognition from Text and Gesture Generation for an Early Marriage Counseling Chatbot in Lombok Using BERT

Abstract

Early marriage remains a pressing issue among adolescents in Lombok, Indonesia, influenced by cultural norms, educational barriers, and economic challenges. This study develops an emotion classification and reason identification framework for a virtual counseling chatbot to support prevention efforts. Five functional emotion categories (‘Enthusiastic’, ‘Gentle’, ‘Analytical’, ‘Inspirational’, and ‘Cautionary’) were defined to capture counseling tones. The system uses IndoBERT with a two-phase fine-tuning strategy. Phase 1 used a balanced dataset of 2,000 samples and achieved a macro F1-score of 0.95. Phase 2 refined the model on 10,000 imbalanced pseudo-labeled samples, yielding a macro F1-score of 0.88 along with improved sensitivity to minority classes. In addition, a semantic similarity-based reason identification module was implemented to classify user inputs into the Education, Economy, Religion, or Culture category, providing context awareness beyond simple keyword matching. Performance was evaluated with accuracy, precision, recall, and F1-score, supported by confusion matrices and training plots for generalization analysis. A descriptive emotion-to-gesture mapping was also designed to link each emotion category with static body pose visualizations, providing a conceptual basis for future multimodal applications.
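
To make the idea of similarity-based reason identification concrete, the following is a minimal sketch, not the study's implementation. It assumes each user input and each reason category has already been mapped to a vector by some sentence-embedding model. The four-dimensional vectors, the `PROTOTYPES` table, the `identify_reason` helper, and the 0.5 threshold are all hypothetical placeholders chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def identify_reason(query_vec, prototypes, threshold=0.5):
    """Return (category, score) for the most similar prototype.

    The category is None when the best score falls below `threshold`,
    which is a hypothetical cut-off, not a value taken from the study.
    """
    best_cat, best_score = None, -1.0
    for cat, vec in prototypes.items():
        score = cosine(query_vec, vec)
        if score > best_score:
            best_cat, best_score = cat, score
    if best_score < threshold:
        return None, best_score
    return best_cat, best_score


# Toy vectors standing in for real sentence embeddings (assumption).
PROTOTYPES = {
    "Education": [1.0, 0.1, 0.0, 0.1],
    "Economy": [0.1, 1.0, 0.1, 0.0],
    "Religion": [0.0, 0.1, 1.0, 0.2],
    "Culture": [0.1, 0.0, 0.2, 1.0],
}

# A hypothetical embedded user input that lies close to the "Economy" prototype.
query = [0.2, 0.9, 0.0, 0.1]
category, score = identify_reason(query, PROTOTYPES)
print(category, round(score, 3))
```

Because the comparison happens in embedding space, two inputs with no words in common can still land in the same category, which is the property the abstract contrasts with keyword matching.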

Keywords: Early Marriage, Emotion Classification, Gesture Mapping, IndoBERT, Virtual Chatbot
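
Since the abstract reports macro F1-scores on both a balanced and an imbalanced dataset, a small worked example may help show why macro averaging is sensitive to minority classes. The sketch below uses made-up labels (not the study's data) and hypothetical helper names `per_class_f1` and `macro_f1`. It computes per-class F1 from true positives, false positives, and false negatives, then averages the classes with equal weight.

```python
def per_class_f1(y_true, y_pred, label):
    """F1-score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so every class counts equally."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy imbalanced example: 8 majority-class samples, 2 minority-class samples.
labels = ["Gentle", "Cautionary"]
y_true = ["Gentle"] * 8 + ["Cautionary"] * 2
y_pred = ["Gentle"] * 8 + ["Gentle", "Cautionary"]  # one minority sample missed

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
print("macro F1:", round(macro_f1(y_true, y_pred, labels), 3))
```

In this toy case a single missed minority sample leaves accuracy at 0.9 but pulls macro F1 down to about 0.80, because the minority class contributes half of the average.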

Citation
Zahran Ramadhdan, A. et al. 2025. Emotion Recognition from Text and Gesture Generation for an Early Marriage Counseling Chatbot in Lombok Using BERT. Indonesian Journal on Computing (Indo-JC). 10, 1 (Oct. 2025). DOI:https://doi.org/10.21108/indojc.v10i1.9710.
