Detection and Classification of Cognitive Distortions in Mental Health Texts Using a Hybrid Natural Language Processing Approach

Authors

  • Elizabeth Piscelia Kusuma, Universitas Pignatelli Triputra
  • Aan Shandy Rahesa, Universitas Pignatelli Triputra
  • Christin Yulianti, Universitas Pignatelli Triputra
  • Samuel Ardhian Trisunu, Universitas Pignatelli Triputra
Issue 2026
Published 16 February 2026
Section Articles

Abstract

This study develops a hybrid natural language processing system that detects cognitive distortions in Indonesian text, with the aim of supporting early mental health awareness. The proposed model combines rule-based keyword matching with a Random Forest classifier trained on TF-IDF features extracted from the preprocessed Indonesian Mental Health Conversation dataset. Evaluated against manually labeled data across eight distortion categories, the hybrid approach outperforms each standalone method, achieving a classification accuracy of 77.5% and an exact match rate of 76.67%. The system also maintained a balanced label distribution across categories and reached a validation accuracy of 94% on the full dataset. To assess real-world applicability, the model was integrated into a reflective chatbot that identifies distorted thinking patterns in user input and retrieves contextually relevant responses. These findings indicate that combining linguistic theory with data-driven modeling yields an effective, interpretable, and scalable tool for detecting cognitive distortions in informal Indonesian psychological text.

Keywords: cognitive distortion, natural language processing, hybrid model, Indonesian text, mental health
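
The hybrid scheme and the exact match rate mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the keyword lexicon, the category names, the union-based merge of rule and classifier outputs, and the multi-label reading of "exact match" are not taken from the article. The `toy_classifier` function is a hypothetical stand-in for the trained Random Forest over TF-IDF features.

```python
from typing import Callable, Dict, List, Set

# Hypothetical lexicon: categories and Indonesian cue words are illustrative
# assumptions, not the keyword list used in the study.
KEYWORD_RULES: Dict[str, List[str]] = {
    "overgeneralization": ["selalu", "tidak pernah"],
    "labeling": ["aku bodoh", "aku gagal"],
}


def rule_labels(text: str) -> Set[str]:
    """Return every category whose cue words occur in the lowercased text."""
    lowered = text.lower()
    return {
        label
        for label, cues in KEYWORD_RULES.items()
        if any(cue in lowered for cue in cues)
    }


def toy_classifier(text: str) -> Set[str]:
    """Hypothetical stand-in for a trained statistical classifier."""
    return {"catastrophizing"} if "hancur" in text.lower() else set()


def hybrid_labels(text: str, classifier: Callable[[str], Set[str]]) -> Set[str]:
    """Merge rule-based and classifier labels by set union (assumed merge rule)."""
    return rule_labels(text) | set(classifier(text))


def exact_match_rate(predicted: List[Set[str]], gold: List[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the gold label set."""
    if not gold:
        return 0.0
    matches = sum(1 for p, g in zip(predicted, gold) if p == g)
    return matches / len(gold)


texts = [
    "Aku selalu gagal dalam segalanya",
    "Hidupku pasti hancur",
    "Hari ini cuacanya cerah",
]
# Made-up gold labels, chosen so that the third sample is a miss.
gold = [{"overgeneralization"}, {"catastrophizing"}, {"labeling"}]

predicted = [hybrid_labels(t, toy_classifier) for t in texts]
rate = exact_match_rate(predicted, gold)
print(predicted, round(rate, 4))
```

Under this reading, exact match is stricter than per-label accuracy: a sample counts only when its whole predicted label set equals the gold set, which is one plausible reason the two reported figures differ.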

How to Cite

[1]
Kusuma, E.P. et al. 2026. Detection and Classification of Cognitive Distortions in Mental Health Texts Using a Hybrid Natural Language Processing Approach. JASMINE: Journal of Intelligent Systems and Machine Learning. (Feb. 2026).
