Parallel algorithms for the automated discovery of declarative process models

Bibliographic Details
Published in: Information Systems (Oxford), Vol. 74, pp. 136-152
Main Authors: Maggi, Fabrizio Maria; Di Ciccio, Claudio; Di Francescomarino, Chiara; Kala, Taavi
Format: Journal Article
Language: English
Published: Oxford, Elsevier Ltd, 01.05.2018
Elsevier Science Ltd
ISSN: 0306-4379, 1873-6076
Description
Summary:
• Variants of an approach for discovering declarative process models are presented.
• The variants are based on the Apriori and sequence analysis algorithms.
• A comparative evaluation based on synthetic and real-life logs is carried out.

The aim of process discovery is to build a process model from an event log without prior information about the process. The discovery of declarative process models is useful when a process works in an unpredictable and unstable environment, since several allowed paths can be represented as a compact set of rules. One of the tools available in the literature for discovering declarative models from logs is the Declare Miner, a plug-in of the process mining tool ProM. Using this plug-in, the discovered models are represented using Declare, a declarative process modeling language based on LTL for finite traces. However, the high execution times of the Declare Miner when processing large sets of data hamper the applicability of the tool in real-life settings. Therefore, in this paper, we propose a new approach for the discovery of Declare models, based on the combination of an Apriori algorithm and a group of algorithms for Sequence Analysis, to improve the time performance of the plug-in. The approach has been designed so that it can easily be parallelized using two different partitioning methods: search space partitioning, in which different groups of candidate constraints are processed in parallel, and database partitioning, in which different chunks of the log are processed at the same time. The approach has been implemented in ProM in its sequential version and in two multi-threaded implementations leveraging these two partitioning methods. All the new variants of the plug-in have been evaluated using a large set of synthetic and real-life event logs.
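
The two partitioning methods named in the summary can be illustrated with a minimal Python sketch. It is not the Declare Miner implementation (which is a ProM plug-in): the toy log, the function names, and the choice of the Declare Response template as the only constraint type are assumptions made for illustration. The sketch counts, for each candidate pair (a, b), the traces in which every occurrence of a is eventually followed by b, which is a simplification of how support is typically measured. The thread pool only shows how the work is decomposed; it is not meant to demonstrate a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations

# Hypothetical toy event log: each trace is a sequence of activity names.
LOG = [
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "a"],
    ["c", "c"],
]

def satisfies_response(trace, a, b):
    """True if every occurrence of `a` is eventually followed by `b`."""
    pending = False
    for event in trace:
        if event == a:
            pending = True
        elif event == b:
            pending = False
    return not pending

def count_satisfying(traces, candidates):
    """For each candidate (a, b), count the traces satisfying Response(a, b)."""
    return {c: sum(satisfies_response(t, *c) for t in traces) for c in candidates}

def chunks(items, n):
    """Split `items` into at most n contiguous parts of similar size."""
    size = max(1, -(-len(items) // n))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]

def search_space_partitioning(log, candidates, workers=2):
    """Each worker checks a different group of candidates on the whole log."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda group: count_satisfying(log, group),
                              chunks(candidates, workers)))
    merged = {}
    for part in parts:
        merged.update(part)  # candidate groups are disjoint, so no clashes
    return merged

def database_partitioning(log, candidates, workers=2):
    """Each worker checks all candidates on a different chunk of the log."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda chunk: count_satisfying(chunk, candidates),
                              chunks(log, workers)))
    merged = {c: 0 for c in candidates}
    for part in parts:
        for c, n in part.items():
            merged[c] += n  # per-chunk counts add up to the whole-log count
    return merged

if __name__ == "__main__":
    candidates = list(permutations(["a", "b", "c"], 2))
    print(search_space_partitioning(LOG, candidates))
    print(database_partitioning(LOG, candidates))
```

In this sketch the two strategies differ only in what is split and how partial results are merged: search space partitioning takes the union of disjoint per-group results, while database partitioning sums per-chunk counts for the same candidates.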
DOI: 10.1016/j.is.2017.12.002