Enhancing academic performance prediction in online learning through hybrid machine learning models
Jamal Eddine Rafiq, Zakrani Abdelali, Mohammed Amraouy, Said Nouh
Abstract
With the rise of online learning platforms, predicting learners’ academic performance has become central to personalizing and improving educational journeys. However, traditional predictive models struggle to integrate emotional and social factors effectively. This article introduces a hybrid predictive model that uses random forests (RF) to select the most relevant features and multiple regression (MR) to forecast academic performance. The data are drawn from three online learning platforms and include both implicit traces (learner interactions and behaviors) and explicit traces (demographic characteristics). After selection and merging, the final dataset comprises 1,003,392 records and 42 features, grouped into six types of indicators: cognitive, emotional, social, normative, contextual, and demographic. The results show that the hybrid model outperforms traditional approaches and other machine learning (ML) techniques in predictive accuracy, achieving an R² of 0.9372 and a root mean square error (RMSE) of 0.1022. Combining explicit and implicit traces captures the interactions among the different data dimensions more fully and substantially improves prediction quality. The work advances academic performance prediction and highlights the challenges posed by increasing model complexity, paving the way for future research on more generalizable approaches.
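As an illustration of the two-stage idea described above (RF-based feature selection followed by multiple regression, evaluated with R² and RMSE), the following is a minimal sketch and not the authors' implementation. It assumes scikit-learn, uses a small synthetic dataset in place of the platform data, and treats MR as ordinary least-squares linear regression. The median-importance threshold, the number of trees, and the 80/20 split are all illustrative assumptions that do not come from the article.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the learner dataset (NOT the paper's data):
# 42 features, only some of which carry signal.
X, y = make_regression(
    n_samples=1000, n_features=42, n_informative=10, noise=5.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Stage 1: a random forest ranks features by importance; features whose
# importance is at or above the median are kept (assumed threshold).
# Stage 2: a multiple linear regression is fitted on the kept features.
hybrid = Pipeline([
    ("select", SelectFromModel(
        RandomForestRegressor(n_estimators=50, random_state=0),
        threshold="median",
    )),
    ("regress", LinearRegression()),
])
hybrid.fit(X_train, y_train)

# Evaluate on held-out data with the two metrics reported in the abstract.
y_pred = hybrid.predict(X_test)
r2 = r2_score(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
n_selected = int(hybrid.named_steps["select"].get_support().sum())

print(f"selected features: {n_selected} of {X.shape[1]}")
print(f"R2 = {r2:.4f}, RMSE = {rmse:.4f}")
```

Wrapping both stages in a single pipeline means the feature selection is fitted on the training split only, which avoids leaking test information into the choice of features. The scores printed here describe the synthetic data and are not comparable to the values reported in the abstract.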