Data Mining Interview: Wil van der Aalst
Today, Data Mining Research presents an interview with Wil van der Aalst, an expert in process mining. To learn more about him and this emerging field of research, read on. Thanks, Wil, for your answers.
Data Mining Research: Could you introduce yourself? What is your journey in the field of data mining?
Wil van der Aalst: Before working on process mining, I worked on Petri nets, the modeling and analysis of workflow processes, workflow patterns, and process-aware information systems. I also started the main scientific conference on Business Process Management and I’m the founder and leader of the IEEE Task Force on Process Mining. I got into the topic of process mining because I got bored with purely model-based research. For example, it is unsatisfactory to work on the verification and performance analysis of process models knowing that these models have nothing to do with reality. Process mining provides an important bridge between data mining and business process modeling and analysis. Process mining research at TU/e (Eindhoven University of Technology) started in 1999. At that time there was little event data available, and the initial process mining techniques were extremely naïve and hence unusable. Over the last decade event data has become readily available and process mining techniques have matured. Process mining algorithms have also been implemented in various academic and commercial systems. Today, there is an active group of researchers working on process mining and it has become one of the “hot topics” in BPM research. There is also huge interest from industry, and more and more software vendors have started adding process mining functionality to their tools.
DMR: What is process mining?
WvdA: The idea of process mining is to discover, monitor and improve real processes (i.e., not assumed processes) by extracting knowledge from event logs readily available in today’s (information) systems. Process mining includes (automated) process discovery (i.e., extracting process models from an event log), conformance checking (i.e., monitoring deviations by comparing model and log), social network/organizational mining, automated construction of simulation models, model extension, model repair, case prediction, and history-based recommendations (cf. www.processmining.org).
The most appealing form of process mining is process discovery. A discovery technique takes an event log and produces a model without using any a-priori information. For many organizations it is surprising to see that existing techniques are indeed able to discover real processes merely based on example behaviors recorded in event logs. Another form of process mining is conformance checking. Here, an existing process model is compared with an event log of the same process. Conformance checking can be used to check if reality, as recorded in the log, conforms to the model and vice versa. A third type of process mining is model enhancement. Here, the idea is to extend or improve an existing process model using information about the actual process recorded in some event log. Whereas conformance checking measures the alignment between model and reality, this third type of process mining aims at changing or extending the a-priori model. For instance, by using timestamps in the event log one can extend the model to show bottlenecks, service levels, throughput times, and frequencies.
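Editor's note: the three types of process mining described above can be illustrated with a minimal Python sketch. It is not one of the techniques used in the tools mentioned in this interview. It assumes a toy event log (the `EVENT_LOG` data, the activity names, and the reference `model` are all hypothetical) and uses a simple directly-follows graph as a stand-in for a discovered process model.

```python
from collections import Counter, defaultdict

# Hypothetical toy event log: case id -> ordered list of (activity, hour) events.
EVENT_LOG = {
    "case1": [("register", 0), ("check", 2), ("decide", 5)],
    "case2": [("register", 1), ("check", 4), ("decide", 6)],
    "case3": [("register", 3), ("decide", 4)],  # this case skips "check"
}


def discover_dfg(log):
    """Discovery: count how often activity a is directly followed by b."""
    dfg = Counter()
    for events in log.values():
        for (a, _), (b, _) in zip(events, events[1:]):
            dfg[(a, b)] += 1
    return dfg


def deviations(log, allowed):
    """Conformance: list the (case, a, b) steps that the model does not allow."""
    return [
        (case, a, b)
        for case, events in log.items()
        for (a, _), (b, _) in zip(events, events[1:])
        if (a, b) not in allowed
    ]


def mean_durations(log):
    """Enhancement: average elapsed time for each directly-follows step."""
    total = defaultdict(float)
    count = Counter()
    for events in log.values():
        for (a, ta), (b, tb) in zip(events, events[1:]):
            total[(a, b)] += tb - ta
            count[(a, b)] += 1
    return {step: total[step] / count[step] for step in total}


dfg = discover_dfg(EVENT_LOG)
# Assumed reference model: the only steps the organization expects to see.
model = {("register", "check"), ("check", "decide")}
bad = deviations(EVENT_LOG, model)
durations = mean_durations(EVENT_LOG)
```

Here `dfg` plays the role of the discovered model, `bad` lists the steps where the log deviates from the assumed model, and `durations` annotates each step with timing information derived from the timestamps. Real discovery algorithms produce richer models (for example, Petri nets that capture concurrency and choice) and handle noise, which this sketch does not attempt.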
DMR: What are the benefits for companies of applying process mining?
WvdA: Although event data are omnipresent, organizations lack a good understanding of their actual processes. Management decisions tend to be based on PowerPoint diagrams, local politics, or management dashboards rather than a careful analysis of event data. The knowledge hidden in event logs cannot be turned into actionable information. Advances in data mining made it possible to find valuable patterns in large datasets and to support complex decisions based on such data. However, classical data mining problems such as classification, clustering, regression, association rule learning, and sequence/episode mining are not process-centric. Therefore, Business Process Management (BPM) approaches tend to resort to hand-made models. Process mining research aims to bridge the gap between data mining and BPM. Metaphorically, process mining can be seen as taking X-rays to diagnose/predict problems and recommend treatment.
DMR: From a research point of view, what future work remains in this field?
WvdA: The growing maturity of process mining is illustrated by the Process Mining Manifesto recently released by the IEEE Task Force on Process Mining. This manifesto is supported by 53 organizations and 77 process mining experts contributed to it. The active contributions from end-users, tool vendors, consultants, analysts, and researchers illustrate the significance of process mining as a bridge between data mining and business process modeling. The manifesto lists six guiding principles and eleven challenges. As an example, consider Challenge C4 in the manifesto: “Dealing with Concept Drift.” The term concept drift refers to a situation in which the process is changing while we’re analyzing it. For instance, in the beginning of the event log, two activities might be concurrent, whereas later in the log, they become sequential. Processes might change because of periodic or seasonal changes (for example, “in December, there is more demand” or “on Friday afternoon, fewer employees are available”) or changing conditions (“the market is getting more competitive”). Such changes impact processes, and detecting and analyzing them is vital. However, most process-mining techniques analyze processes as if they’re in steady state. This is just one of many open problems in the process mining field.
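Editor's note: the concurrent-then-sequential example above can be made concrete with a small Python sketch. It is a toy illustration, not a method from the manifesto: the `TRACES` data, the activity names, and the fixed window size are hypothetical, and "concurrent" is approximated by checking whether two activities directly follow each other in both orders within a window of the log.

```python
# Hypothetical traces in chronological order. Early on, "b" and "c" occur in
# either order; later, "b" always comes before "c".
TRACES = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
]


def directly_follows(traces):
    """Collect every (x, y) pair where y occurs immediately after x."""
    pairs = set()
    for trace in traces:
        pairs.update(zip(trace, trace[1:]))
    return pairs


def looks_concurrent(traces, x, y):
    """Treat x and y as concurrent if both orders are observed."""
    pairs = directly_follows(traces)
    return (x, y) in pairs and (y, x) in pairs


def drift_report(traces, x, y, window):
    """Evaluate the concurrency check on consecutive windows of the log."""
    return [
        looks_concurrent(traces[i:i + window], x, y)
        for i in range(0, len(traces), window)
    ]


report = drift_report(TRACES, "b", "c", window=4)
```

A change in `report` from one window to the next hints that the relation between the two activities has shifted over time. Analyzing the whole log at once would report the activities as concurrent and hide the change, which is the steady-state assumption described above. A realistic approach would need statistical tests and a principled choice of window size.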
Given these challenges and the interest of industry, I hope that more and more data miners will start working on this exciting topic. In short: “It’s the Process Stupid!”, so start mining processes rather than data.
About Wil van der Aalst
Prof.dr.ir. Wil van der Aalst is a full professor of Information Systems at the Technische Universiteit Eindhoven (TU/e). Currently he is also an adjunct professor at Queensland University of Technology (QUT) working within the BPM group there. His research interests include workflow management, process mining, Petri nets, business process management, process modeling, and process analysis. More information: www.vdaalst.com