Evaluation of Machine Learning Algorithms for Predicting Phishing Attacks in Higher Education Environments: An Experimental Framework for Enhancing Cybersecurity in Academic Institutions

Akmal Muhammad Poetra; Dody Herdiana; Muhammad Agreindra Helmiawan



Authors

Akmal Muhammad Poetra, Universitas Sebelas April
Dody Herdiana, Universitas Sebelas April, Sumedang
Muhammad Agreindra Helmiawan, Asia e University, Wisma Subang Jaya

Issue: 2026
Published: 12 May 2026
Section: Articles


Abstract

This study evaluates the performance of four machine learning algorithms (Logistic Regression, Support Vector Machine, Random Forest, and XGBoost) in predicting phishing attacks within higher education environments. Because anonymized institutional datasets are scarce, the research uses a conceptual experimental design and a simulation-based approach that mirrors the characteristics of phishing incidents commonly encountered by academic users. The simulated dataset includes URL-based indicators, HTML features, email text elements, and behavioral metadata. The experimental protocol covers synthetic data generation, domain-specific feature engineering, stratified k-fold cross-validation, hyperparameter tuning via grid search, and performance evaluation using accuracy, precision, recall, F1-score, and ROC/AUC. The simulation results indicate that the ensemble-based models (Random Forest and XGBoost) outperform the linear and kernel-based models, especially under the class imbalance typical of campus environments. The discussion covers implications for real-world campus cybersecurity operations, the limitations of conceptual simulations, and future research needs such as real-world validation and the integration of user behavior features. The main contribution is a complete experimental framework that can be executed with real institutional datasets, providing guidance for model selection and deployment in higher education cybersecurity systems.

Keywords: phishing detection, machine learning, Random Forest, XGBoost, higher education
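
Two steps of the protocol summarized in the abstract, stratified k-fold splitting and metric computation, can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation: the function names `stratified_kfold` and `binary_metrics`, the 10% phishing rate, and the toy labels are all assumptions made for illustration, and a real experiment would more likely rely on an established machine learning library.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds keep the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal indices round-robin so every fold gets a near-equal class share.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced toy labels: 10 phishing (1), 90 legitimate (0).
labels = [1] * 10 + [0] * 90
folds = list(stratified_kfold(labels, k=5))

# A trivial baseline that never flags phishing, scored on the full label set.
baseline = binary_metrics(labels, [0] * len(labels))
```

In this toy setup, the never-flag baseline reaches high accuracy while its recall is zero, which is one reason the protocol reports precision, recall, and F1-score alongside accuracy when classes are imbalanced.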


How to Cite


[1]

Muhammad Poetra, A. et al. 2026. Evaluation of Machine Learning Algorithms for Predicting Phishing Attacks in Higher Education Environments: An Experimental Framework for Enhancing Cybersecurity in Academic Institutions. JASMINE: Journal of Intelligent Systems and Machine Learning. (May 2026).
