A Hybrid Al-Biruni Earth Radius–Random Forest Model for
Accurate and Efficient Student Performance Classification
Mohamed E. Ghoneim1,*
1 Mathematics Department, Faculty of Science, Umm Al-Qura University, KSA
Emails: meghoneim@uqu.edu.sa
Received: December 30, 2025; Revised: February 28, 2026; Accepted: April 30, 2026
* Corresponding author
ABSTRACT
The growing availability of educational data has prompted the use of machine learning methods to predict student
academic performance and support data-driven decision-making in education. Nevertheless, the effectiveness of such
models depends heavily on proper data preprocessing, model selection, and optimal hyperparameter
settings. This research proposes a hybrid predictive architecture that combines machine learning classifiers with
bio-inspired metaheuristic optimization algorithms to improve classification efficiency in educational data mining.
The xAPI-Edu dataset, which records the demographic, academic, and behavioral characteristics of 480 students, is
first used to evaluate a set of baseline machine learning models, including Random Forest, XGBoost, Support Vector
Machine, Multilayer Perceptron, K-Nearest Neighbors, and Gaussian Naive Bayes, using standard classification
metrics. The baseline experiments show that the Random Forest classifier outperforms
the other models before optimization, achieving accuracies of 0.8889 and 0.8814, and F-scores of 0.8889 and 0.8814,
respectively, indicating strong generalization and balanced discrimination among the classes. To further improve the
predictive performance, state-of-the-art metaheuristic algorithms, namely the Al-Biruni Earth Radius Optimizer
(BER), the Grey Wolf Optimizer (GWO), Particle Swarm Optimization (PSO), the Genetic Algorithm (GA), and
the Whale Optimization Algorithm (WOA), are adopted to optimize the hyperparameters of the Random Forest.
The experiments show that every optimization approach yields a measurable performance
gain, but the BER-optimized Random Forest consistently performs best. In particular, the BER-Random
Forest model achieves an F-score of 0.9477 and an accuracy of 0.9439, both substantially higher than the
baseline configuration. Comprehensive statistical and visual analyses, including kernel density estimation, Z-score heatmaps, and
swarm plots, further support the robustness, stability, and superiority of the proposed BER-based optimization framework.
These findings demonstrate the effectiveness of metaheuristic-based hyperparameter optimization in educational
predictive analytics and provide useful insights for building intelligent, efficient, and data-driven academic
support systems.
Keywords: Student Academic Performance; Machine Learning; Metaheuristic Optimization; Random Forest;
Educational Data Mining
1. INTRODUCTION
In recent years, education has undergone a notable transformation
driven by technological advancements and the increasing
availability of educational data. This paradigmatic shift
has seen traditional pedagogical methods progressively replaced
by data-driven approaches aimed at enhancing student
performance and improving academic outcomes. Central to