Gray, Geraldine, McGuinness, Colm, Owende, Philip and Hofmann, Markus (2016) Learning Factor Models of Students at Risk of Failing in the Early Stage of Tertiary Education. Journal of Learning Analytics, 3 (2). pp. 330-372.
This paper reports on a study to predict students at risk of failing based on data available prior to commencement of first year. The study was conducted over three years, 2010 to 2012, on a student population from a range of academic disciplines (n=1,207). Data was gathered from both student enrollment data and an online, self-reporting, learner-profiling tool administered during first-year student induction. Factors considered included prior academic performance, personality, motivation, self-regulation, learning approaches, age, and gender. Models were trained on data from the 2010 and 2011 student cohorts, and tested on data from the 2012 student cohort. A comparison of eight classification algorithms found that k-NN achieved the best model accuracy (72%), but results from other models were similar, including ensembles (71%), support vector machine (70%), and a decision tree (70%). However, improvements in model accuracy attributable to non-cognitive factors were not significant. Models of subgroups by age and discipline achieved higher accuracies, but were affected by sample size; samples of n<900 underrepresented patterns in the dataset. Factors most predictive of academic performance in the first year of tertiary education included age, prior academic performance, and self-efficacy. Early modelling of first-year students yielded informative, generalizable models that identified students at risk of failing.
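
The evaluation design described in the abstract, training on earlier cohorts and testing on a later one, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are made-up values rather than data from the study, the two features merely stand in for factors such as prior academic performance and self-efficacy, and the function names (`split_by_cohort`, `knn_predict`, `accuracy`) and the choice of k=3 with Euclidean distance are hypothetical, not the authors' implementation.

```python
from collections import Counter
from math import dist

# Hypothetical records: (cohort_year, features, outcome).
# The feature values are invented for illustration only.
records = [
    (2010, (0.9, 0.8), "pass"),
    (2010, (0.2, 0.3), "fail"),
    (2010, (0.8, 0.7), "pass"),
    (2011, (0.1, 0.2), "fail"),
    (2011, (0.7, 0.9), "pass"),
    (2011, (0.3, 0.1), "fail"),
    (2012, (0.85, 0.75), "pass"),
    (2012, (0.15, 0.25), "fail"),
    (2012, (0.6, 0.8), "pass"),
    (2012, (0.4, 0.2), "pass"),
]


def split_by_cohort(rows, train_years, test_years):
    """Split records by cohort year so the test cohort is never seen in training."""
    train = [(x, y) for year, x, y in rows if year in train_years]
    test = [(x, y) for year, x, y in rows if year in test_years]
    return train, test


def knn_predict(train, x, k=3):
    """Return the majority label among the k training points nearest to x."""
    neighbours = sorted(train, key=lambda row: dist(row[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test records whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Train on the 2010 and 2011 cohorts, evaluate on the held-out 2012 cohort.
train, test = split_by_cohort(records, {2010, 2011}, {2012})
acc = accuracy(train, test, k=3)
print(f"hold-out accuracy on the 2012 cohort: {acc:.2f}")
```

Splitting by cohort rather than at random means the reported accuracy reflects how a model built on past intakes performs on a new intake, which is the setting in which an early-warning model would be used.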