TY - JOUR
T1 - Behind the scenes of educational data mining
AU - Feldman-Maggor, Yael
AU - Barhoom, Sagiv
AU - Blonder, Ron
AU - Tuvi-Arad, Inbal
N1 - Publisher Copyright: © 2020, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2021/3
Y1 - 2021/3
N2 - Research based on educational data mining conducted at academic institutions is often limited by the institutional policy with regard to the type of learning management system and the detail level of its activity reports. Often, researchers deal with only raw data. Such data normally contain numerous fictitious user activities that can create a bias in the activity trends, consequently leading to inaccurate conclusions unless careful strategies for data cleaning, filtering, and indexing are applied. In addition, pre-processing phases are not always reported in detail in the scientific literature. As educational data mining and learning analytics methodologies become increasingly popular in educational research, it is important to promote researchers and educational policymakers’ awareness of the pre-processing phase, which is essential to create a reliable database prior to any analysis. This phase can be divided into four consecutive pre-processing stages: data gathering, data interpretation, database creation, and data organization. Taken together, these stages stress the technical and cooperative nature of this type of research, and the need for careful interpretation of the studied parameters. To illustrate these aspects, we applied these stages to online educational data collected from several chemistry courses conducted at two academic institutions. Our results show that adequate pre-processing of the data can prevent major inaccuracies in the research findings, and significantly increase the authenticity and reliability of the conclusions.
KW - Data pre-processing
KW - Educational data mining
KW - Higher education
KW - Learning analytics
KW - Learning management system (LMS)
KW - Moodle
UR - http://www.scopus.com/inward/record.url?scp=85090153832&partnerID=8YFLogxK
U2 - 10.1007/s10639-020-10309-x
DO - 10.1007/s10639-020-10309-x
M3 - Article
AN - SCOPUS:85090153832
SN - 1360-2357
VL - 26
SP - 1455
EP - 1470
JO - Education and Information Technologies
JF - Education and Information Technologies
IS - 2
ER -