Decision Tree-Based Early Warning System for Academic Failure: Comparative Analysis with Random Forest and Logistic Regression

Authors

  • Virzan Pasa Nugraha, Universitas Sebelas April, Sumedang, Jawa Barat, Indonesia
  • Fidi Supriadi, Universitas Sebelas April, Sumedang, Jawa Barat, Indonesia
  • David Setiadi, Universitas Sebelas April, Sumedang, Jawa Barat, Indonesia
Issue: 2026
Published: 16 February 2026
Section: Articles

Abstract

A Decision Tree–based early warning system for academic failure was evaluated using a dataset of 649 student observations and compared with Random Forest and Logistic Regression models through nested ten-fold cross-validation. The mean accuracy of the Decision Tree model was 0.9169, compared to 0.9260 for the Random Forest model and 0.8922 for the Logistic Regression model. Although the Random Forest model achieved the highest raw accuracy, paired statistical testing indicates that its performance difference with the Decision Tree model is not statistically significant (paired t-test, p = 0.104950). The difference between the Decision Tree model and the Logistic Regression model is statistically significant before Bonferroni correction (p = 0.01734) but becomes non-significant after adjustment (Bonferroni-adjusted p = 0.05201). The Decision Tree model was therefore selected to balance competitive predictive performance with interpretability through explicit and readable decision rules. In out-of-fold evaluation, the Random Forest model achieved the strongest results, with an accuracy of 0.9245, high receiver operating characteristic and precision–recall performance, a balanced accuracy of 0.8409, a minority-class recall of 0.7200, and twenty-eight false-negative predictions. Feature-importance analysis showed that G2 was the most influential variable with a relative importance of 0.2417, followed by G1 with 0.1992. Shapley value analysis and an ablation study further demonstrated that removing G1 and G2 substantially reduced overall accuracy and minority-class recall. These findings support the use of the Decision Tree model in educational contexts, where transparent rule-based decisions can guide early academic interventions while maintaining performance comparable to more complex ensemble methods.
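The Bonferroni step reported in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: it assumes three pairwise model comparisons (Decision Tree vs. Random Forest, Decision Tree vs. Logistic Regression, Random Forest vs. Logistic Regression) and a significance level of 0.05, neither of which is stated explicitly in the abstract. The function and variable names are hypothetical.

```python
def bonferroni_adjust(p_values, n_comparisons):
    """Multiply each raw p-value by the number of comparisons, capped at 1.0."""
    return [min(1.0, p * n_comparisons) for p in p_values]


# Raw paired t-test p-values as reported in the abstract.
raw_p = {
    "Decision Tree vs Random Forest": 0.104950,
    "Decision Tree vs Logistic Regression": 0.01734,
}

N_COMPARISONS = 3  # assumption: three pairwise comparisons among three models
ALPHA = 0.05       # assumption: conventional significance level

adjusted = dict(
    zip(raw_p, bonferroni_adjust(list(raw_p.values()), N_COMPARISONS))
)
significant = {name: p < ALPHA for name, p in adjusted.items()}

for name, p in adjusted.items():
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name}: adjusted p = {p:.5f} ({verdict})")
```

Under these assumptions the adjusted value for Decision Tree vs. Logistic Regression is about 0.0520, just above 0.05, which agrees with the abstract's conclusion that the difference is no longer significant after correction. The small gap from the reported 0.05201 is presumably due to rounding of the raw p-value.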

Keywords: Decision Tree, Early Warning System, Academic Failure Prediction, Model Interpretability, Shapley Value Analysis

How to Cite

[1]
Nugraha, V.P. et al. 2026. Decision Tree-Based Early Warning System for Academic Failure: Comparative Analysis with Random Forest and Logistic Regression. JASMINE: Journal of Intelligent Systems and Machine Learning. (Feb. 2026).
