Decision Tree-Based Early Warning System for Academic Failure: Comparative Analysis with Random Forest and Logistic Regression


Authors

  • Virzan Pasa Nugraha Universitas Sebelas April, Sumedang, Jawa Barat, Indonesia
  • Fidi Supriadi Universitas Sebelas April, Sumedang, Jawa Barat, Indonesia
  • David Setiadi Universitas Sebelas April, Sumedang, Jawa Barat, Indonesia
Issue: 2026
Published: 16 February 2026
Section: Articles

Abstract

Student underachievement at the secondary education level remains a critical challenge that demands timely, interpretable interventions. This study develops and evaluates a Decision Tree-based model to predict student failure using the Student Performance dataset (n = 649). Two scenarios were investigated: an Early Warning Model (using first-period grades) and a Mid-Term Model (using first- and second-period grades). The Mid-Term Model delivers markedly higher predictive accuracy, underscoring the value of mid-term data for identifying at-risk students. The Decision Tree was benchmarked against Random Forest and Logistic Regression using 10-fold cross-validation with nested hyperparameter tuning and synthetic oversampling. Although Logistic Regression achieved the highest accuracy (92.30%), followed by Random Forest (91.38%), a paired t-test found no statistically significant difference between either model and the Decision Tree (89.83%): p = 0.0651 and p = 0.1476, respectively, both above the 0.05 threshold. The Decision Tree was therefore selected as the optimal model: it offers full interpretability at no statistically significant cost in accuracy, challenging the assumption that "black-box" models are inherently superior. Further analysis confirmed that the first-period grade is the most influential predictor, which opens opportunities for early intervention. The model’s interpretable rules, which identify “double warning” and “hidden risk” cases, offer actionable insights for targeted strategies to prevent student failure.
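
The model comparison described above rests on a paired t-test over per-fold accuracies. The following is a minimal sketch of that kind of test, not the authors' implementation. It assumes the standard (uncorrected) paired t-test on 10 matched folds; the abstract does not state which variant was used. The function name `paired_t_statistic` and the fold accuracies are hypothetical placeholders chosen for illustration, not results from the study.

```python
import math
from statistics import mean, stdev

# Two-sided critical value of Student's t for alpha = 0.05 with
# 9 degrees of freedom (10 folds minus 1).
T_CRIT_DF9 = 2.262


def paired_t_statistic(scores_a, scores_b):
    """Return the paired t statistic for two equal-length score lists.

    Each position must hold the scores of both models on the SAME fold.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models need one score per fold")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    if len(diffs) < 2:
        raise ValueError("at least two folds are required")
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all fold differences are identical")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


# HYPOTHETICAL per-fold accuracies, for illustration only.
# These are NOT the values reported in the study.
lr_acc = [0.92, 0.94, 0.91, 0.95, 0.89, 0.93, 0.92, 0.90, 0.94, 0.93]
dt_acc = [0.91, 0.93, 0.92, 0.90, 0.90, 0.93, 0.89, 0.91, 0.92, 0.91]

t_stat = paired_t_statistic(lr_acc, dt_acc)
significant = abs(t_stat) > T_CRIT_DF9

print(f"t = {t_stat:.3f}, significant at 0.05: {significant}")
```

When the statistic stays below the critical value, the accuracy gap between the two models cannot be distinguished from fold-to-fold noise at the 0.05 level, which is the reasoning behind preferring the more interpretable model.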

Keywords: Decision Tree, Early Warning System, Academic Failure Prediction, Model Interpretability, Shapley Value Analysis


How to Cite

[1]
Nugraha, V.P. et al. 2026. Decision Tree-Based Early Warning System for Academic Failure: Comparative Analysis with Random Forest and Logistic Regression. JASMINE: Journal of Intelligent Systems and Machine Learning. (Feb. 2026).
