Process mining approaches kashif.namal@gmail.com

By
Kashif Kashif
Kashif.namal@gmail.com
University of Camerino Italy

Process Mining
 A process management technique that allows for the
analysis of business processes based on event logs.
 Algorithms are applied to event log datasets to find
patterns and details contained in the event logs recorded
by an information system.
 The objective is to make processes more efficient and to improve them.

Classification
 Discovery
A discovery technique takes an event log and
produces a process model without using any a-priori
information.
 Conformance checking
An existing process model is compared with an
event log of the same process.
 Enhancement
The main idea is to extend or improve an existing process
model using information about the actual process
recorded in some event log.

Approaches Used
 Direct Algorithmic Approaches
 Two-Phase Approaches
 Computational Intelligence Approaches
 Partial Approaches

Direct Algorithmic Approaches
 Extract a footprint from the event log and use this
footprint to directly construct a process model.
 Approaches based on so-called language-based regions
also belong to this family.
 A relation is extracted from the log, and a Petri net is
constructed based on this relation.
 The Alpha algorithm is an example of a direct approach.
 The algorithm is applied to the log and the process model
is derived directly.
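
The footprint idea can be sketched in a few lines of Python. This is a minimal illustration, not the full Alpha algorithm: it only derives the footprint relations (causality, parallelism, no relation) from the directly-follows pairs and does not construct the Petri net. The function name `footprint` and the example log are assumptions made for this sketch.

```python
from itertools import product

def footprint(log):
    """Return (activities, relation); relation[(a, b)] is '->', '<-', '||' or '#'."""
    activities = sorted({a for trace in log for a in trace})
    # Directly-follows pairs: (a, b) if b immediately follows a in some trace.
    follows = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}
    relation = {}
    for a, b in product(activities, repeat=2):
        ab, ba = (a, b) in follows, (b, a) in follows
        if ab and not ba:
            relation[(a, b)] = "->"   # causality: a is followed by b, never the reverse
        elif ba and not ab:
            relation[(a, b)] = "<-"   # reverse causality
        elif ab and ba:
            relation[(a, b)] = "||"   # both orders observed: treated as parallel
        else:
            relation[(a, b)] = "#"    # never directly follow each other
    return activities, relation

# Hypothetical event log: each trace is the activity sequence of one case.
log = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]
activities, rel = footprint(log)
for a in activities:
    print(a, [rel[(a, b)] for b in activities])
```

A direct approach would then read places and arcs of the Petri net off this matrix; that step is omitted here.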

Two-Phase Approaches
 Use a two-step approach in which a “low-level model” (e.g., a
transition system or Markov model) is constructed first.
 In the second step, the low-level model is converted into a “high-level model”
that can express concurrency and other (more advanced) control-flow
patterns.
 A transition system is extracted from the log using a customizable
abstraction mechanism.
 The transition system is converted into a Petri net using so-called state-based
regions. The resulting model can be visualized as a Petri net, but can
also be converted into other notations (e.g., BPMN and EPCs).
 Similar approaches can be envisioned using hidden Markov models.
Using an Expectation-Maximization (EM) algorithm such as the Baum–Welch
algorithm, the “most likely” Markov model can be derived from
a log.
 This Markov model is then converted into a high-level model.
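
The first phase can be sketched as follows. This is a minimal illustration under assumptions: the state of a case is an abstraction of the prefix of activities seen so far, and the `abstraction` parameter (a hypothetical name) selects how much of that prefix is remembered. The second phase, converting the transition system into a Petri net via state-based regions, is not shown.

```python
from collections import defaultdict

def build_transition_system(log, abstraction=frozenset):
    """Map each state to its outgoing (activity, next_state) edges.

    `abstraction` turns a trace prefix into a state: `frozenset` remembers
    only which activities occurred, `tuple` remembers the full sequence.
    """
    transitions = defaultdict(set)
    for trace in log:
        for i, activity in enumerate(trace):
            source = abstraction(trace[:i])
            target = abstraction(trace[:i + 1])
            transitions[source].add((activity, target))
    return dict(transitions)

# Hypothetical event log.
log = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]
ts = build_transition_system(log)             # set abstraction
ts_seq = build_transition_system(log, tuple)  # sequence abstraction
print(len(ts), len(ts_seq))
```

With the set abstraction, the prefixes a,b,c and a,c,b collapse into one state, which is what later allows concurrency between b and c to be detected; the sequence abstraction keeps them apart.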

Hidden Markov Model
 Set of states: {s1, s2, s3, …, sn}
 The process moves from one state to another, generating
a sequence of states: s1, s2, …
 Markov chain property: the probability of each subsequent
state depends only on the previous state

Hidden Markov Model
 We want to infer a robot's mood, happy or sad, from what we
observe it doing: watching a movie (w), sleeping (s),
crying (c), or using Facebook (f).
 The hidden state X is h if the robot is happy and s if it is sad;
X is unknown. The observation Y is w, s, c, or f.
 We want to answer queries such as:
 P(X = h | Y = f)?
 P(X = s | Y = c)?
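
Such queries can be answered with Bayes' rule once a prior over moods and the observation probabilities are known. The sketch below uses made-up numbers purely for illustration (the slides give none), spells the states as "happy" and "sad" to avoid a clash with the observation s, and looks at a single time step only, so no state transitions are involved.

```python
# Hypothetical probabilities, chosen only for illustration (not from the slides).
prior = {"happy": 0.6, "sad": 0.4}                     # P(X)
emission = {                                           # P(Y | X)
    "happy": {"w": 0.4, "s": 0.1, "c": 0.2, "f": 0.3},
    "sad":   {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1},
}

def posterior(state, observation):
    """P(X = state | Y = observation) by Bayes' rule."""
    evidence = sum(prior[x] * emission[x][observation] for x in prior)
    return prior[state] * emission[state][observation] / evidence

p_happy_given_f = posterior("happy", "f")
p_sad_given_c = posterior("sad", "c")
print(p_happy_given_f, p_sad_given_c)
```

For a whole sequence of observations, the same idea is applied recursively together with the state-transition probabilities (the forward algorithm).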

Computational Intelligence Approaches
 Techniques originating from the field of computational intelligence form the
basis for the third family of process discovery approaches.
 Examples of techniques are genetic programming, genetic algorithms,
simulated annealing, reinforcement learning, machine learning, neural
networks, fuzzy sets, rough sets, and swarm intelligence.
 The log is not directly converted into a model; instead, an iterative procedure is used to
mimic the process of natural evolution.
 A genetic process mining approach starts with an initial population of
individuals. Each individual corresponds to a randomly generated process
model. For each individual a fitness value is computed describing how well the
model fits the log.
 Populations evolve by selecting the fittest individuals and generating new
individuals using genetic operators such as crossover (combining parts of two
individuals) and mutation (random modification of an individual). The fitness
gradually increases from generation to generation. The process stops once an
individual of acceptable quality is found.

Machine Learning
 Determine rules from data/facts
 Improve performance with experience
 Getting computers to program themselves

Sketch of an Induction Algorithm
 Calculate for each attribute how well it classifies the
elements of the training set
 Classify with the best attribute
 Repeat the first two steps for each resulting subtree
 Stop this recursive process as soon as a termination
condition is satisfied
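
The sketch above can be made concrete as a small decision-tree learner. The slides do not say how "how well it classifies" is measured; this illustration assumes information gain (as in ID3), and the training data, attribute names, and function names are invented for the example.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """How well `attribute` classifies the training set (assumed measure)."""
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

def induce_tree(rows, labels, attributes):
    # Termination condition: one label left, or no attributes to split on.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Steps 1 and 2: score every attribute and split on the best one.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    tree = {"attribute": best, "children": {}}
    remaining = [a for a in attributes if a != best]
    # Step 3: repeat for each resulting subtree.
    for value in sorted({row[best] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree["children"][value] = induce_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining)
    return tree

def classify(tree, row):
    while isinstance(tree, dict):
        tree = tree["children"][row[tree["attribute"]]]
    return tree

# Hypothetical training set: was a request approved?
rows = [
    {"amount": "low", "customer": "new"},
    {"amount": "low", "customer": "regular"},
    {"amount": "high", "customer": "new"},
    {"amount": "high", "customer": "regular"},
]
labels = ["approve", "approve", "reject", "reject"]
tree = induce_tree(rows, labels, ["amount", "customer"])
print(tree)
```

Here the amount alone separates the classes, so the learner splits on it first and stops immediately in both subtrees.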

Partial Approaches
 The approaches described so far produce a complete end-to-end process
model.
 It is also possible to focus on rules or frequent patterns,
for example by mining sequential patterns.
 This approach is similar to the discovery of association
rules; however, the order of events is now taken into
account.
 In the discovery of frequent episodes, a sliding window is used to
analyze how frequently an “episode” (a partial order) appears.
 Approaches also exist to learn models in declarative (LTL-based)
languages such as Declare.
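
The sliding-window idea can be sketched as follows. This is a simplification made for illustration: the window is measured in number of events rather than in time, and only serial episodes (activities in a fixed order) are checked, whereas an episode in general is a partial order. The function names and the example sequence are assumptions.

```python
def contains_serial_episode(window, episode):
    """True if the activities of `episode` occur in `window` in the given order."""
    remaining = iter(window)
    # `activity in remaining` advances the iterator, so order is respected.
    return all(activity in remaining for activity in episode)

def episode_frequency(sequence, episode, width):
    """Fraction of sliding windows of `width` events that contain the episode."""
    windows = [sequence[i:i + width] for i in range(len(sequence) - width + 1)]
    if not windows:
        return 0.0
    hits = sum(contains_serial_episode(w, episode) for w in windows)
    return hits / len(windows)

# Hypothetical event sequence.
sequence = list("abcabdacb")
freq_ab = episode_frequency(sequence, ("a", "b"), width=3)
freq_ba = episode_frequency(sequence, ("b", "a"), width=3)
print(freq_ab, freq_ba)
```

An episode would be reported as frequent when this fraction exceeds a user-chosen threshold.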

PROLOG
 PROLOG (=PROgramming in LOGic) is a
programming language based on Horn clauses
% Facts
father(peter,mary).
father(peter,john).
mother(mary,mark).
mother(jane,mary).
% Rules: X is a grandfather of Z through a son or a daughter Y
grandfather(X,Z) :- father(X,Y), father(Y,Z).
grandfather(X,Z) :- father(X,Y), mother(Y,Z).

Heuristics Miner
 Heuristics Miner is a practically applicable mining algorithm
that can deal with noise and can be used to express the
main behavior (that is, not all details and exceptions)
registered in an event log.
 It extends the Alpha algorithm by considering the frequency of
traces in the log.
 The Heuristics Miner plug-in mines the control-flow
perspective of a process model.
 It considers the order of the events within a case.
 Such algorithms take frequencies of events and sequences
into account when constructing a process model.

Steps
 The construction of the dependency graph
 For each activity, the construction of the input and output
expressions
 The search for long distance dependency relations
1. Read a log
2. Get the set of tasks
3. Infer the ordering relations based on their frequencies
4. Build the net based on inferred relations
5. Output the net
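
Steps 1 to 3 can be sketched with the dependency measure commonly used by the Heuristics Miner, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), where |a>b| counts how often b directly follows a. The sketch covers only the dependency graph; input/output expressions, loops of length one, and long-distance dependencies are left out, and the single `threshold` parameter and the example log are assumptions for illustration.

```python
from collections import Counter

def directly_follows_counts(log):
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

def dependency(counts, a, b):
    """Dependency measure between -1 and 1; high values suggest a causes b."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)

def dependency_graph(log, threshold=0.6):
    counts = directly_follows_counts(log)
    tasks = sorted({a for trace in log for a in trace})
    graph = {}
    for a in tasks:
        for b in tasks:
            if a != b and dependency(counts, a, b) >= threshold:
                graph[(a, b)] = dependency(counts, a, b)
    return graph

# Hypothetical log: two frequent variants plus one rare, noisy trace (a, d).
log = [("a", "b", "c", "d")] * 5 + [("a", "c", "b", "d")] * 4 + [("a", "d")]
graph = dependency_graph(log)
for (a, b), value in sorted(graph.items()):
    print(f"{a} -> {b}: {value:.2f}")
```

The rare trace contributes a weak a-to-d dependency that falls below the threshold, which is how frequency information lets the miner ignore noise.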

Genetic Miner
 The Genetic Miner uses a genetic algorithm to mine a Petri
net representation of the process model from
execution traces.
 It uses a global search strategy: because the quality (fitness) of a
candidate model is calculated by comparing the
process model with all traces in the event log, the
search process takes place at a global level. For a local
strategy there is no guarantee that the outcome of the
locally optimal steps is a globally optimal model.

Steps
 The first concern is to define the internal representation.
 The second concern is to define the fitness measure.
 The third concern relates to the genetic operators
(crossover and mutation).
 Read the event log
 Build the initial population
 Calculate the fitness of the individuals in the population
 If the stopping condition is met, stop and return the fittest individuals
 Otherwise, create the next population and recalculate the fitness
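
The loop above can be sketched with a deliberately simplified setup. As assumptions for illustration only: an individual is a set of directly-follows arcs rather than the internal representation of the actual Genetic Miner, fitness is the fraction of traces that can be replayed minus a small penalty per arc, and all names and parameter values (`penalty`, `target`, `elite`, and so on) are hypothetical.

```python
import random

def fitness(arcs, log, penalty=0.01):
    """Fraction of traces fully replayable on `arcs`, minus a small size penalty."""
    replayable = sum(all(pair in arcs for pair in zip(t, t[1:])) for t in log)
    return replayable / len(log) - penalty * len(arcs)

def crossover(parent1, parent2, all_arcs, rng):
    # Each possible arc is inherited from one of the two parents at random.
    return frozenset(arc for arc in all_arcs
                     if arc in (parent1 if rng.random() < 0.5 else parent2))

def mutate(arcs, all_arcs, rng, rate=0.05):
    # Flip the presence of each possible arc with a small probability.
    flipped = {arc for arc in all_arcs if rng.random() < rate}
    return frozenset(arcs ^ flipped)

def genetic_miner(log, generations=50, size=30, elite=2, target=0.9, seed=0):
    rng = random.Random(seed)
    tasks = sorted({a for t in log for a in t})
    all_arcs = [(a, b) for a in tasks for b in tasks if a != b]

    def score(individual):
        return fitness(individual, log)

    # Build the initial population of random models.
    population = [frozenset(arc for arc in all_arcs if rng.random() < 0.3)
                  for _ in range(size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        if score(population[0]) >= target:      # acceptable quality reached
            break
        parents = population[:size // 2]        # select the fittest half
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents),
                                     all_arcs, rng), all_arcs, rng)
                    for _ in range(size - elite)]
        population = population[:elite] + children  # elitism keeps the best
    return max(population, key=score)

# Hypothetical event log.
log = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]
best = genetic_miner(log)
print(sorted(best), fitness(best, log))
```

Because the fitness of every candidate is computed against all traces, each generation evaluates whole models rather than individual ordering relations, which is the global search strategy described above.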

Fuzzy miner
 Process Mining is a technique for extracting process
models from execution logs.
 People have an idealized view of reality.
 Real-life processes turn out to be less structured than
people tend to believe.
 As a result, discovered models are often spaghetti-like.

Output
 Phase I: Fuse similarly behaving attributes
 Phase II: Generate meta-rules
 Phase III: Generate frequent fuzzy itemsets
 Phase IV: Make fuzzy association rules.
