Mining access logs with predictive analytics to improve student performance

Corrigan, Owen, Glynn, Mark, Smeaton, Alan F. and Smyth, Sinéad (2015) Mining access logs with predictive analytics to improve student performance.


Access log files have long been part of computing. They were developed to keep a record of what has been happening in a system so that the record can later be searched as part of debugging or error-fixing. That is how access logs started: mostly as a form of record for forensic investigation, with little value beyond that. With the advent of new techniques for data analytics, however, access log files are becoming valuable when they are mined, as opposed to searched, for actionable knowledge. In some applications, new forms of data mining and visualisation are making log files more valuable than the processes that they record. The best example of this is web search, where search engine log files, which record users’ searches, are mined for services ranging from query auto-complete, which automatically completes users’ queries based on previous queries, to Google Flu Trends, which predicts outbreaks of influenza by mining users’ searches for remedies for flu symptoms. The common feature across these, and many other, uses of log files is prediction based on machine learning: using logs of past activities to predict outcomes likely to happen in the future. Learning Management Systems (LMS) in education applications automatically create log files which record students’ interactions with the online environment, including timestamps, the IP address of the accessing device, the content that is accessed and any inputs the students provide. Such access log files have been used extensively for post-hoc analysis of course outcomes, correlating those outcomes with students’ online behaviour mined from the log files and attempting to determine the factors which influence better student outcomes: which content to access, when, from where, in what order, and so on. In our work we are interested in applying predictive analytics to LMS log files in order to improve student outcomes when participating in courses.
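As an illustration of what mining, rather than searching, an LMS log might involve, the following is a minimal sketch that turns raw log records into weekly per-student access counts, the kind of feature a predictive model could consume. The record layout (a student identifier paired with an ISO 8601 timestamp), the function name and the choice of weekly counts are assumptions made for this example; they are not a description of the Loop log format or of the features used in the study.

```python
from collections import Counter
from datetime import datetime


def weekly_access_counts(records, semester_start):
    """Count log entries per (student, week) pair.

    records: iterable of (student_id, iso_timestamp) tuples.
             This layout is hypothetical, chosen for illustration.
    semester_start: datetime marking the first day of week 1.
    """
    counts = Counter()
    for student_id, iso_timestamp in records:
        accessed_at = datetime.fromisoformat(iso_timestamp)
        # Whole days since the start of semester, grouped into 7-day weeks.
        week = (accessed_at - semester_start).days // 7 + 1
        counts[(student_id, week)] += 1
    return counts
```

Counts of this kind, computed for each week of a past offering of a module and paired with the final pass/fail outcomes, could serve as training data for a classifier that is then applied week by week to current students.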
Our work is based on Loop, the LMS used throughout Dublin City University. We chose 10 modules which are delivered to first year undergraduate students and which have high rates of participation and relatively high failure rates (pass rates of 75% to 90%). In total, 1,200 students registered to have their progress through the module monitored and to have predictions of their likely pass/fail outcome, generated by applying predictive analytics to the log files, emailed to them on a weekly basis. In parallel, 400 students self-selected not to receive these alerts, and they provide the baseline or control group against which we can compare the impact of the alerts. There is no academic difference between the two groups of students in terms of CAO points or performance in Leaving Certificate Mathematics. Our work was done with approval from the University’s Research Ethics Committee and data protection officer; those who availed of the alerts were all over 18 years of age and read and confirmed understanding of a plain language statement of what the system was doing with their data. The impact of receiving weekly alerts derived from predictive analytics is that the average student mark in the end-of-semester examination increased by 2.9% across the 10 modules. Eight of the 10 modules showed increased marks for students who received the predictive alerts, demonstrating that predictive analytics can be usefully applied to achieve a statistically significant improvement in student performance.
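The per-module comparison between students who received alerts and the control group can be sketched as follows. This is only an illustration under stated assumptions: the input layout (a dictionary mapping each module to two lists of examination marks) and the function name are hypothetical, and the sketch computes a simple difference in mean marks rather than the statistical significance test that the study would have required.

```python
from statistics import mean


def compare_groups(marks):
    """Compare mean marks of the alert and control groups per module.

    marks: dict mapping module code -> (alert_marks, control_marks),
           where each element is a list of examination marks.
           This layout is hypothetical, chosen for illustration.
    Returns the per-module difference in means (alert minus control)
    and the number of modules where the alert group scored higher.
    """
    differences = {
        module: mean(alert) - mean(control)
        for module, (alert, control) in marks.items()
    }
    improved = sum(1 for diff in differences.values() if diff > 0)
    return differences, improved
```

A positive difference for a module means the students who received alerts had the higher average mark there; deciding whether that difference is significant would need an appropriate statistical test on top of this.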
