Event Log Quality & Gamification
Context
Data quality is critical for efficient and low-risk data-driven decision-making in organizations. Process mining analyzes event logs to provide a better understanding of the real processes executed within an organization and so support decision-making. Low-quality event logs reduce the reliability of process mining results (i.e., garbage in, garbage out). Activity labels, the recorded names of the tasks performed in a process, are key elements of event logs. However, their quality can be compromised: for example, multiple activity labels with different syntax may refer to the same task. Detecting and repairing imperfect activity labels requires deep insight into the domain to understand what the labels mean. Domain experts are eminently suited to fix imperfect labels, but it is hard to engage them because repair can be time-consuming and tedious.
Gamification incorporates game elements in system design to improve user engagement with non-game tasks. It may offer a promising solution to the challenge of engaging domain experts in improving activity label quality. This project proposes gamification approaches to detect and repair imperfect activity labels in event logs. It examines the motivational drives that gamification can exploit to engage domain experts in the repair of activity labels. The results of the evaluations show quality improvement of real-life event logs as well as a positive experience of participants with the systems. This project introduces a new generation of data cleaning methods that turn data cleaning from the most tedious task of data science into a fun and exciting experience.
The project aims to (i) improve the quality of activity labels in event logs using domain expert input, and (ii) support user engagement in detecting and repairing imperfect activity labels through the use of gamification techniques.
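To make the label problem concrete, the following is a minimal sketch of how candidate pairs of synonymous labels might be surfaced from an event log. It is not the project's published method or its ProM plugin: it assumes a hypothetical toy log (TOY_LOG) and uses a deliberately simple heuristic, comparing labels by the Jaccard similarity of the (preceding activity, following activity) pairs in which they occur. All function names and the 0.5 threshold are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy event log: each trace is the ordered list of
# activity labels recorded for one case. Not real project data.
TOY_LOG = [
    ["Register patient", "Triage", "Blood test", "Discharge"],
    ["Register patient", "Triage", "Blood exam", "Discharge"],
    ["Register patient", "Triage", "X-ray", "Discharge"],
    ["Register patient", "Triage", "Blood test", "Discharge"],
]


def label_contexts(log):
    """Map each label to the set of (predecessor, successor) pairs around it."""
    contexts = defaultdict(set)
    for trace in log:
        # Pad the trace so first and last activities also get a context.
        padded = ["<start>"] + list(trace) + ["<end>"]
        for prev, label, nxt in zip(padded, padded[1:], padded[2:]):
            contexts[label].add((prev, nxt))
    return contexts


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_synonyms(log, threshold=0.5):
    """Return (label_a, label_b, score) for label pairs with similar contexts.

    The threshold is an arbitrary illustrative value.
    """
    contexts = label_contexts(log)
    pairs = []
    for a, b in combinations(sorted(contexts), 2):
        score = jaccard(contexts[a], contexts[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


if __name__ == "__main__":
    for a, b, score in candidate_synonyms(TOY_LOG):
        print(f"{a!r} ~ {b!r} (context similarity {score:.2f})")
```

On this toy log the heuristic pairs "Blood exam" with "Blood test", but it also pairs both with "X-ray", which occurs in the same position yet is a different task. Candidates of this kind therefore still need confirmation from someone who understands the domain.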
PhD Student and Supervisory Team
Sareh Sadeghianasl, PhD student
Prof. Arthur ter Hofstede, Principal supervisor
Prof. Moe Wynn
Collaborations
Dr Selen Turkay
Prof. Trina Meyers
Publications and Resources
The project has already resulted in the following software tools:
- An automatic approach (a ProM plugin) to detecting candidates of imperfect labels in process event logs – available here.
- The Quality Guardian gamified system to detect and repair imperfect activity labels – available here.
- The Quality Guardian Redux gamified system to engage domain experts in detecting and repairing imperfect activity labels – available here.
- The Quality Guardian Rosebud gamified system to create an activity ontology that can be used for repairing imperfect labels – available here.
The project has generated the following publications:
- Sareh Sadeghianasl, Arthur H.M. ter Hofstede, Moe T. Wynn, and Suriadi Suriadi: A Contextual Approach to Detecting Synonymous and Polluted Activity Labels in Process Event Logs. In International Conference on Cooperative Information Systems (CoopIS). LNCS, vol. 11877, pp. 76-94. Springer, 2019.