Predicting Student Academic Performance in Secondary Education Using Machine Learning Techniques
Predicting student academic performance has become a critical area of educational research, particularly as large-scale data become more widely available and artificial intelligence is increasingly applied in learning analytics. This study investigates how effectively machine learning techniques forecast academic performance among secondary school students using state-level educational data. Two ensemble-based algorithms, Random Forest and Gradient Boosting, were implemented to model the complex relationships between student, institutional, and demographic variables. The models were evaluated using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R²), so that performance was assessed on both absolute error and explained variance. Gradient Boosting was the more accurate of the two models, with an R² of approximately 0.68, compared with 0.58 for Random Forest. Feature importance analysis indicated that exam participation, gender-based pass rates, teacher availability, school infrastructure, and inclusive education indicators were the most influential predictors of academic outcomes. These findings highlight the value of machine learning as a tool for educational data mining, enabling policymakers and educators to identify key determinants of student success and design data-driven interventions. Future work should incorporate socio-economic and behavioral factors and employ explainable AI approaches to improve the transparency, fairness, and interpretability of predictive educational models.
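To make the three evaluation metrics concrete, the following minimal sketch computes MAE, RMSE, and R² from their standard definitions in plain Python. It is an illustration only, not the study's code: the function names and the four-value sample of pass rates are hypothetical, and the study's dataset and model implementations are not reproduced here.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared difference; penalizes large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical pass rates (percent), invented purely for illustration.
actual = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 68.0, 83.0, 87.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
r2 = r_squared(actual, predicted)

print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```

Read this way, an R² of 0.68 means the model's squared error is 32% of the squared error obtained by always predicting the mean outcome, while an R² of 0.58 leaves 42%. MAE and RMSE are expressed in the units of the target variable, which is why they are usually reported alongside R² rather than in place of it.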