A Hybrid Al-Biruni Earth Radius–Random Forest Model for Accurate and Efficient Student Performance Classification

Mohamed E. Ghoneim1,*

1 Mathematics Department, Faculty of Science, Umm Al-Qura University, KSA

Email: meghoneim@uqu.edu.sa

Received: December 30, 2025; Revised: February 28, 2026; Accepted: April 30, 2026

* Corresponding author

ABSTRACT

The growing availability of educational data has prompted the use of machine learning methods to predict student

academic performance and support data-driven decision-making in education. Nevertheless, the predictive quality of

such models depends heavily on proper data preprocessing, model selection, and well-tuned hyperparameter

settings. This research proposes a hybrid predictive architecture that combines machine learning classifiers with

bio-inspired metaheuristic optimization algorithms to improve classification efficiency in educational data mining.

The xAPI-Edu dataset, which covers the demographic, academic, and behavioral characteristics of 480 students, is

first used to evaluate a set of baseline machine learning models, including Random Forest, XGBoost, Support Vector

Machine, Multilayer Perceptron, K-Nearest Neighbors, and Gaussian Naive Bayes, using standard classification

metrics. The baseline experiments show that the Random Forest classifier outperforms

the other models before optimization, achieving accuracies of 0.8889 and 0.8814, and F-scores of 0.8889 and 0.8814,

respectively, indicating strong generalization and balanced discrimination among the classes. To further improve

predictive performance, several state-of-the-art metaheuristic algorithms, namely the Al-Biruni Earth Radius (BER)

optimizer, the Gray Wolf Optimizer (GWO), Particle Swarm Optimization (PSO), the Genetic Algorithm (GA), and

the Whale Optimization Algorithm (WOA), are adopted to optimize the hyperparameters of the Random Forest.

The experiments show that every optimization approach yields a measurable performance gain, with the

BER-optimized Random Forest consistently performing best. In particular, the BER-Random

Forest model achieves an F-score of 0.9477 and an accuracy of 0.9439, both substantially higher than those of the

baseline configuration. Comprehensive statistical and visual analyses, including kernel density estimation, Z-score heatmaps, and

swarm plots, further support the robustness, stability, and superiority of the proposed BER-based optimization framework.

These findings demonstrate the effectiveness of metaheuristic-based hyperparameter optimization in educational

predictive analytics and offer useful insights for building intelligent, efficient, and data-driven academic support

systems.

Keywords: Student Academic Performance; Machine Learning; Metaheuristic Optimization; Random Forest;

Educational Data Mining

1. INTRODUCTION

In recent years, education has undergone a notable transformation

driven by technological advancements and the increasing

availability of educational data. This paradigmatic shift

has seen traditional pedagogical methods progressively replaced

by data-driven approaches aimed at enhancing student

performance and improving academic outcomes. Central to
