Behind the scenes of educational data mining

Yael Feldman-Maggor, Sagiv Barhoom, Ron Blonder, Inbal Tuvi-Arad

Research output: Contribution to journal › Article › peer-review


Research based on educational data mining conducted at academic institutions is often limited by institutional policy regarding the type of learning management system and the level of detail of its activity reports. Often, researchers have only raw data to work with. Such data normally contain numerous fictitious user activities that can bias the activity trends and lead to inaccurate conclusions unless careful strategies for data cleaning, filtering, and indexing are applied. In addition, pre-processing phases are not always reported in detail in the scientific literature. As educational data mining and learning analytics methodologies become increasingly popular in educational research, it is important to raise awareness among researchers and educational policymakers of the pre-processing phase, which is essential for creating a reliable database prior to any analysis. This phase can be divided into four consecutive pre-processing stages: data gathering, data interpretation, database creation, and data organization. Taken together, these stages stress the technical and cooperative nature of this type of research, and the need for careful interpretation of the studied parameters. To illustrate these aspects, we applied these stages to online educational data collected from several chemistry courses conducted at two academic institutions. Our results show that adequate pre-processing of the data can prevent major inaccuracies in the research findings, and significantly increase the authenticity and reliability of the conclusions.
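As a rough illustration of why the cleaning step matters, the following minimal Python sketch filters a tiny activity log before counting events. It is not the authors' procedure: the abstract does not define what counts as a fictitious activity, so the sketch assumes, purely for illustration, that it means events generated by non-learner accounts. The log layout, the role labels, and the names `RAW_LOG`, `LEARNER_ROLES`, `clean_log`, and `count_actions` are all hypothetical.

```python
from collections import Counter

# Hypothetical raw log rows: (user_id, role, action). The layout and role
# labels are illustrative assumptions, not an actual LMS export schema.
RAW_LOG = [
    ("u1", "student", "view_video"),
    ("u2", "teacher", "view_video"),
    ("u1", "student", "submit_quiz"),
    ("u3", "test_account", "view_video"),
    ("u4", "student", "view_video"),
]

# Assumption: only these roles represent genuine learner activity.
LEARNER_ROLES = {"student"}


def clean_log(rows, learner_roles=LEARNER_ROLES):
    """Keep only rows generated by accounts with a learner role."""
    return [row for row in rows if row[1] in learner_roles]


def count_actions(rows):
    """Count how often each action appears in the given rows."""
    return Counter(action for _, _, action in rows)


raw_counts = count_actions(RAW_LOG)
clean_counts = count_actions(clean_log(RAW_LOG))
print("raw:", dict(raw_counts))
print("cleaned:", dict(clean_counts))
```

In this toy log, half of the recorded video views come from non-learner accounts, so an activity trend computed on the raw rows would overstate learner engagement. Real logs would likely need further, course-specific filtering rules.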

Original language: English
Pages (from-to): 1455-1470
Number of pages: 16
Journal: Education and Information Technologies
Issue number: 2
State: Published - Mar 2021

Bibliographical note

Publisher Copyright:
© 2020, Springer Science+Business Media, LLC, part of Springer Nature.


Keywords

  • Data pre-processing
  • Educational data mining
  • Higher education
  • Learning analytics
  • Learning management system (LMS)
  • Moodle

