Improving the Accuracy of the C4.5 Algorithm in Heart Disease Prediction Using Bagging and Information Gain


Authors

  • Ernawati, Universitas Sumatera Utara
  • Ade Candra
  • Syahril Efendi
Issue: Vol. 12 No. 1 (2026)
Published: 2 June 2026
Section: Articles
Pages: 1-13

Abstract

Class imbalance is a common challenge in data classification: the majority class significantly outnumbers the minority class, which degrades algorithm performance, particularly for the C4.5 algorithm. This study addresses the problem by combining Bootstrap Aggregation (Bagging) with Information Gain (IG). IG is used for feature selection, retaining only attributes whose IG exceeds a threshold of 0.02, while Bagging improves the stability and accuracy of the classification model. The experiment was conducted on a diabetes dataset from UCI using 10-fold cross-validation. The C4.5+Bagging model achieved the highest accuracy at 95.96%, and the proposed C4.5+IG+Bagging combination reached 94.42%; both are a significant increase over the baseline C4.5 accuracy of 89.04%. These findings demonstrate that the proposed combination is effective in improving classification performance on imbalanced data.
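
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the restriction to categorical attributes, and the user-supplied `fit` base learner (a stand-in where C4.5 would go) are assumptions made for illustration. Only the IG threshold of 0.02 comes from the abstract.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """IG of one categorical attribute: H(labels) minus the weighted
    entropy of the labels within each attribute value."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(columns, labels, threshold=0.02):
    """Keep attribute names whose IG is strictly greater than the threshold
    (0.02 is the value stated in the abstract)."""
    return [name for name, values in columns.items()
            if information_gain(values, labels) > threshold]


def bagging_predict(train_rows, train_labels, row, fit,
                    n_estimators=10, seed=0):
    """Train `fit` on bootstrap resamples and return the majority vote.

    `fit(rows, labels)` is a hypothetical base learner that returns a
    callable `model(row) -> label`; a C4.5 tree would be plugged in here.
    """
    rng = random.Random(seed)
    n = len(train_rows)
    votes = []
    for _ in range(n_estimators):
        # Bootstrap: sample n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit([train_rows[i] for i in idx],
                    [train_labels[i] for i in idx])
        votes.append(model(row))
    return Counter(votes).most_common(1)[0][0]
```

On a toy table where attribute `a` determines the class and attribute `b` is independent of it, `select_features` keeps only `a`. In a full pipeline the selection and the bagged ensemble would both be fitted inside each of the 10 cross-validation folds, so that the held-out fold does not influence which attributes are kept.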


How to Cite

Ernawati et al. 2026. Improving the Accuracy of the C4.5 Algorithm in Heart Disease Prediction Using Bagging and Information Gain. IJoICT (International Journal on Information and Communication Technology). 12, 1 (Jun. 2026), 1–13.
