Predictive Modeling of Educational Outcomes Using Machine Learning for Data-Driven Regional Education Policy
Educational inequality remains a major challenge across regions and is influenced by socioeconomic conditions, demographic characteristics, and access to post-secondary education. This study develops a predictive modeling framework using machine learning, specifically the Random Forest Regressor, to estimate regional educational outcomes and identify key determinants of academic performance. The dataset, which contained 1,104 observations and 31 predictors representing socioeconomic and educational indicators, was preprocessed through missing value imputation, one-hot encoding, and normalization to ensure data consistency and model reliability. The Random Forest model achieved high predictive accuracy, with a Mean Absolute Error (MAE) of 0.2656, a Root Mean Squared Error (RMSE) of 0.4168, and a Coefficient of Determination (R²) of 0.9881, explaining approximately 98.8 percent of the variance in regional education scores. Feature importance analysis indicated that academic attainment and post-secondary participation, such as Level 3 achievement at age 18 and higher qualification by age 22, were the most influential predictors of regional performance. These results highlight the critical role of educational progression in shaping long-term success and suggest that regions with sustained engagement in higher education tend to perform better overall. Visualization of predicted and actual scores confirmed the model’s robustness and its ability to generalize effectively across diverse regional profiles. The findings demonstrate that AI-based predictive analytics can accurately capture complex, nonlinear relationships within educational systems and provide a valuable foundation for data-driven policy formulation. Future research should incorporate longitudinal data, apply Explainable AI (XAI) methods to enhance interpretability, and extend this approach to cross-national datasets to support evidence-based educational governance.