EduForecast: A Comparative AI Model for Predicting Global Education Performance through XGBoost and Random Forest Intelligence

Authors

  • Jose C. Agoylo Jr. BSIT Department, Southern Leyte State University – Tomas Oppus Campus, Southern Leyte, Philippines
  • Lykzelle Mae C. Padasas BSIT Department, Southern Leyte State University – Tomas Oppus Campus, Southern Leyte, Philippines
  • Gardenia B. Concillo BSIT Department, Southern Leyte State University – Tomas Oppus Campus, Southern Leyte, Philippines
  • Jimson A. Olaybar Faculty of Computer Studies and Information Technology, Southern Leyte State University – Main Campus, Southern Leyte, Philippines
  • Alex C. Bacalla Faculty of Computer Studies and Information Technology, Southern Leyte State University – Main Campus, Southern Leyte, Philippines

Keywords:

AI ensemble models, education analytics, education forecasting, enrollment rate, machine learning, random forest, XGBoost

Abstract

Background and Objective: This study develops and evaluates EduForecast, a predictive framework for estimating global educational performance. The primary objective is to compare the predictive accuracy of two ensemble machine-learning algorithms, Extreme Gradient Boosting (XGBoost) and Random Forest, on internationally sourced education indicators.

Materials and Methods: The dataset comprised key educational and socioeconomic variables, including GDP Share of Education, Literacy-to-Enrollment Ratio, Student–Teacher Ratio, and the Education Development Index, with Enrollment Rate as the target variable. Preprocessing involved feature engineering and normalization. Models were developed with an 80–20 train–test split combined with five-fold cross-validation to ensure robustness. Both algorithms were trained and optimized, and their performance was assessed with standard regression metrics.
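
The split-and-validate procedure described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation: the abstract does not state whether rows were shuffled, which seed was used, or whether the five folds were drawn from the training portion only, so the shuffling, the seed value, the placement of cross-validation inside the 80% training portion, and the function names `train_test_split_indices` and `k_fold_indices` are all assumptions made for illustration.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into train and test lists.

    The seed and the shuffling are assumptions; the abstract only
    states that an 80-20 split was used.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (fit, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        # Every fold except the i-th is used for fitting.
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


# Hypothetical dataset size; the abstract does not report the row count.
train_idx, test_idx = train_test_split_indices(n_rows=100)
folds = list(k_fold_indices(train_idx, k=5))
```

Each of the five folds serves once as the validation set while the other four are used for fitting, and the held-out 20% is touched only for the final comparison. In practice, library routines such as scikit-learn's `train_test_split` and `KFold` are commonly used for the same purpose.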

Results: XGBoost achieved the better predictive performance, with an R² of 0.90 compared with 0.85 for the Random Forest model. XGBoost also had lower Root Mean Square Error (RMSE) and Mean Absolute Error (MAE), indicating smaller and less variable prediction errors. The Education Development Index and Literacy-to-Enrollment Ratio emerged as the most influential predictors in both models.
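
For readers unfamiliar with the three reported metrics, the sketch below shows how R², RMSE, and MAE are conventionally computed from actual and predicted values. The function name `regression_metrics` and the sample enrollment rates are hypothetical and are not taken from the study; they only illustrate the standard definitions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE using their standard definitions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical enrollment rates (percent), not data from the study.
actual = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 68.0, 83.0, 89.0]
metrics = regression_metrics(actual, predicted)
```

A higher R² means the model explains more of the variance in the target, while RMSE and MAE are in the units of the target; RMSE penalizes large errors more heavily than MAE does.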

Conclusion: The findings indicate that ensemble-based regression algorithms, particularly XGBoost, offer strong predictive capabilities for analyzing global education performance. The EduForecast framework provides a practical and transparent data-driven tool that can support policymakers and educational planners in evidence-based decision-making.


Published

2025-12-19


Section

Research Articles

How to Cite

[1]
J. C. Agoylo Jr., L. M. C. Padasas, G. B. Concillo, J. A. Olaybar, and A. C. Bacalla, “EduForecast: A Comparative AI Model for Predicting Global Education Performance through XGBoost and Random Forest Intelligence”, Insights Comput. Sci., vol. 1, pp. 19–25, Dec. 2025, Accessed: Feb. 25, 2026. [Online]. Available: https://acadpub.com/ics/article/view/eduforecast-ai-model-global-education-performance-xgboost-random-forest
