Dirk Fahland
         Wil M.P. van der Aalst

          Simplifying
Mined Process Models
Process Mining, Currently


event log → process mining algorithm → process model




                                   PAGE 1
Process Mining, Currently


event log → process mining algorithm → process model
wanted: readable process model




                                   PAGE 2
Post-Process the Model


 event log → process mining algorithm → process model → simplify → readable process model

 mined process model: can replay the entire log
 introduce: post-processing operations on the mined model
 readable process model: can replay the entire log
                                                           PAGE 3
…Based on Original Event Log

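 event log → process mining algorithm → process model → simplify → readable process model
 event log → simplify (the original log is a second input to simplification)

 mined process model: can replay the entire log
 introduce: post-processing operations on the mined model
 readable process model: can replay the entire log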
                                                           PAGE 4
Analysis

          process mining   process
             algorithm      model                    readable
 event                                   simplify     process
  log                                                  model




          discover ordering relations
                    infer behavior



                                                    behavior
   observed executions                  generalized behavior
incomplete knowledge
                                                     PAGE 5
Idea: Re-Adjust Generalization

             process mining       process
                algorithm          model                    readable
  event                                         simplify     process
   log                                                        model




                        unfold model wrt. log
   model
complexity


                          fold, simplify,
                          generalize


                                                           behavior
               log
                                                            PAGE 6
Unfold a Spaghetti-Model




                           PAGE 7
Unfold Model wrt. a Log


                                A
                 ABDA
                 ABCBDA
                 ABCBC
                   log      C   B



                                D

                              mined
                          process model




                                    PAGE 8
Unfold Model wrt. a Log
                 unfold
    A
                                A
                 ABDA
                 ABCBDA
    B            ABCBC
                   log      C   B
    D


    A                           D

                              mined
                          process model




                                    PAGE 9
Unfold Model wrt. a Log
                 unfold
      A
                                A
                 ABDA
                 ABCBDA
      B          ABCBC
                   log      C   B
  C   D


  B   A                         D

                              mined
  D
                          process model


  A

                                    PAGE 10
Unfold Model wrt. a Log
                 unfold
        A
                                A
                 ABDA
                 ABCBDA
        B        ABCBC
                   log      C   B
    C   D


    B   A                       D

                              mined
    D
                          process model
C


    A

                                    PAGE 11
Unfold Model wrt. a Log
                       unfold
        A
                                      A
                       ABDA
                       ABCBDA
        B              ABCBC
                         log      C   B
    C   D


    B   A                             D

                                    mined
    D   B
                                process model
C


B   A
        unfolding
        wrt. the log                      PAGE 12
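The unfolding steps above can be sketched in code. This is a minimal sketch under a strong assumption: the model is purely sequential (no concurrency), so unfolding it wrt. the log coincides with building a prefix tree of the traces. The names `Event`, `unfold_sequential`, `count_events` and `branches` are hypothetical and are not taken from the ProM implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One node of the unfolding: a single occurrence of a task."""
    label: str
    children: dict = field(default_factory=dict)  # next task label -> Event


def unfold_sequential(log):
    """Build the prefix tree of the log; the root is an artificial start node."""
    root = Event(label="")
    for trace in log:
        node = root
        for task in trace:
            # Reuse the occurrence if this prefix was already unfolded.
            node = node.children.setdefault(task, Event(task))
    return root


def count_events(node):
    """Number of task occurrences below the given node."""
    return sum(1 + count_events(child) for child in node.children.values())


def branches(node, prefix=""):
    """Yield every maximal run of the unfolding as a string of labels."""
    if not node.children:
        yield prefix
    for child in node.children.values():
        yield from branches(child, prefix + child.label)


log = ["ABDA", "ABCBDA", "ABCBC"]
unfolding = unfold_sequential(log)
print(count_events(unfolding))      # 9
print(sorted(branches(unfolding)))  # ['ABCBC', 'ABCBDA', 'ABDA']
```

The result is acyclic, carries several nodes with the same label, and contains only behavior seen in the log. Concurrency, as on the following slides, needs a real branching process and is not covered by this sketch.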
Represents Concurrency
                       unfold
        A
                                       A
                       AEBDA
                       ABECBDA
        B   E          ABCBC
                         log       C   B       E
    C   D


    B   A                              D

                                     mined
    D
                                 process model
C


    A
        unfolding
        wrt. the log                       PAGE 13
Represents Concurrency

        A
                            AEBDA
                            ABECBDA
        B   E               ABCBC
                               log
    C   D

                       •   is a process model
    B   A              •   contains only behavior in the log
                       •   is acyclic
C   D
                       •   represents concurrency explicitly
                       •   labeled
                           (several tasks with same label)
    A
        unfolding
        wrt. the log                                         PAGE 14
Represents Concurrency

        A
                       AEBDA
                       ABECBDA
        B   E          ABCBC
                         log
    C   D
                                    unfold

    B   A

                                 fold,
                                 simplify,
C   D
                                 generalize


    A
        unfolding
        wrt. the log                          PAGE 15
Fold an unfolded model

        A
                merge equivalent nodes

        B   E   necessary condition on
                equivalent transitions
    C   D        • same label

    B   A


C   D


    A

                                         PAGE 16
Fold an unfolded model

        A
                merge equivalent nodes

        B   E   necessary condition on
                equivalent transitions
    C   D        • same label
                 • equivalent pre-/post-places
    B   A


C   D


    A

                                                 PAGE 17
Fold an unfolded model

        A
                merge equivalent nodes

        B   E   necessary condition on
                equivalent transitions
    C   D        • same label
                 • equivalent pre-/post-places
    B   A
                various equivalences possible
                (see paper for some)
C   D


    A

                                                 PAGE 18
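Folding by merging equivalent nodes can be sketched as follows. This is an illustration only: it works on sequential traces, ignores places entirely, and the two equivalences (`by_label`, `by_label_and_predecessor`) are hypothetical stand-ins for the equivalences discussed in the paper. Each task occurrence is identified by the trace prefix that leads to it.

```python
def fold(log, key):
    """Merge task occurrences with equal keys; return (nodes, arcs)."""
    nodes, arcs = set(), set()
    for trace in log:
        for i in range(1, len(trace) + 1):
            current = key(trace[:i])
            nodes.add(current)
            if i > 1:
                arcs.add((key(trace[:i - 1]), current))
    return nodes, arcs


def by_label(prefix):
    """Coarsest equivalence: occurrences are equal if the label is equal."""
    return prefix[-1]


def by_label_and_predecessor(prefix):
    """Finer equivalence: same label and same preceding label."""
    return prefix[-2:]


def replays(trace, arcs, key):
    """Check that consecutive steps of the trace follow arcs (start not checked)."""
    return all((key(trace[:i - 1]), key(trace[:i])) in arcs
               for i in range(2, len(trace) + 1))


log = ["ABDA", "ABCBDA", "ABCBC"]

nodes, arcs = fold(log, by_label)
print(len(nodes), len(arcs))              # 4 5

fine_nodes, fine_arcs = fold(log, by_label_and_predecessor)
print(len(fine_nodes), len(fine_arcs))    # 6 7

print(replays("ABCBCBDA", arcs, by_label))  # True: generalizes beyond the log
print(replays("ADB", arcs, by_label))       # False
```

A coarser equivalence gives a smaller model that allows more behavior; a finer one keeps more structure and stays closer to the log.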
Fold an unfolded model

        A
                merge equivalent nodes

        B   E
                     A

    C   D

                 C   B   E
    B   A


                     D
C   D


    A                A

                                         PAGE 19
Unfolding and Refolding
                     unfold
                                                A

          fold
                      A
                                            C   B       E


                 C    B   E
                                                D


                      D
                          refolded vs. original model
                          • less behavior
                            (replays the log and more)
                      A
                          • simpler structure
                                                    PAGE 20
Next: Simplifying and Generalizing


process model → simplify → readable process model




                          unfold
complexity
                   fold


                           simplify,
                           generalize

                                                     behavior
             log

                                                      PAGE 21
Implied Places

        A
                           implied place
                           • does not restrict transitions
        B   fold
                       A
                           remove from folded model
    C   D                  • simpler model
                   C   B   • same behavior
    B   A


                       D
                           various techniques to find
C   D                      implied places

    A                  A

                                                PAGE 22
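The idea of an implied place can be illustrated with a small behavioral check. This is a sketch under assumptions, not one of the techniques referred to on the slide: it compares firing sequences only up to a depth bound, assumes arc weight 1, and uses a hypothetical three-transition net in which place `q` duplicates an ordering that `p1` and `p2` already enforce.

```python
def firing_sequences(transitions, marking, depth):
    """Return all firing sequences of length <= depth from the marking."""
    sequences = {()}
    if depth == 0:
        return sequences
    for name, (pre, post) in transitions.items():
        if all(marking.get(p, 0) >= 1 for p in pre):
            successor = dict(marking)
            for p in pre:
                successor[p] -= 1
            for p in post:
                successor[p] = successor.get(p, 0) + 1
            for tail in firing_sequences(transitions, successor, depth - 1):
                sequences.add((name,) + tail)
    return sequences


def without_place(transitions, place):
    """Copy of the net with one place and its arcs removed."""
    return {name: (pre - {place}, post - {place})
            for name, (pre, post) in transitions.items()}


def is_implied(transitions, marking, place, depth):
    """True if removing the place leaves the bounded behavior unchanged."""
    reduced_marking = {p: n for p, n in marking.items() if p != place}
    return (firing_sequences(transitions, marking, depth)
            == firing_sequences(without_place(transitions, place),
                                reduced_marking, depth))


# Hypothetical net: A, then B, then C; q additionally links A to C.
net = {
    "A": ({"start"}, {"p1", "q"}),
    "B": ({"p1"}, {"p2"}),
    "C": ({"p2", "q"}, {"end"}),
}
marking = {"start": 1}

print(is_implied(net, marking, "q", depth=5))   # True: q restricts nothing
print(is_implied(net, marking, "p2", depth=5))  # False: C could fire before B
```

Removing `q` gives a simpler model with the same behavior. Removing `p2` would also simplify the model but adds behavior, which is the generalization trade-off described on the next slide.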
Special: Implied Places and Folding

    A                     A
p                     p
                                              A
    B     C          D        C           p
                                  fold
                                         D    B      C
        unfolding wrt. log

 folding may merge implied and non-implied places
 remove p: simpler model,
             more behavior (generalization)
 let user decide
                                                  PAGE 23
Configurable Simplification


process model → simplify → readable process model




                          unfold
complexity
                   fold
                                          configurable


                                     simplify,
                                   generalize
                                                              behavior
             log

                                                               PAGE 24
ProM6 / Uma > www.processmining.org




                                  PAGE 25
Experimental Results

  15 benchmark logs, 6 industrial logs
      [www.promtools.org/prom5/]
 model complexity = #arcs / #nodes
 [Chart: model complexity, y-axis from 0.0 to 9.0]




                                          PAGE 29
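The complexity measure above, #arcs / #nodes, is simple to compute. A minimal sketch, assuming a Petri net given as plain sets of nodes (places and transitions) and arcs; the function name `complexity` and the two example nets are hypothetical.

```python
def complexity(nodes, arcs):
    """Model complexity as the number of arcs divided by the number of nodes."""
    if not nodes:
        raise ValueError("a model needs at least one node")
    return len(arcs) / len(nodes)


# Sequential net A, B, C with an extra place q between A and C.
nodes = {"start", "p1", "p2", "q", "end", "A", "B", "C"}
arcs = {("start", "A"), ("A", "p1"), ("p1", "B"), ("B", "p2"),
        ("p2", "C"), ("C", "end"), ("A", "q"), ("q", "C")}
print(complexity(nodes, arcs))  # 1.0

# Same net after removing q and its two arcs.
simpler_nodes = nodes - {"q"}
simpler_arcs = {(s, t) for (s, t) in arcs if "q" not in (s, t)}
print(round(complexity(simpler_nodes, simpler_arcs), 3))  # 0.857
```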
Experimental Results
 precision: extent to which the model allows traces that are not in the log
 1.0 = only log behavior allowed




 rises/falls within limits (can be controlled)
                                                  PAGE 30
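To make the reading of the precision values concrete, here is a deliberately simplified sketch. It is an assumption, not the metric used in the experiments: it takes a finite set of traces allowed by the model (for example, all traces up to some length) and reports the fraction that also occur in the log. The function name `precision` and the example traces are hypothetical.

```python
def precision(model_traces, log_traces):
    """Fraction of the model's traces that also occur in the log."""
    model, log = set(model_traces), set(log_traces)
    if not model:
        raise ValueError("the model must allow at least one trace")
    return len(model & log) / len(model)


log = {"ABDA", "ABCBDA", "ABCBC"}

# A model that allows exactly the log behavior.
print(precision(log, log))  # 1.0

# A model that additionally allows one trace not seen in the log.
print(precision(log | {"ABCBCBDA"}, log))  # 0.75
```

Under this reading, folding and removing places can lower the value, because the simplified model may allow traces that the log does not contain.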
from spaghetti to lasagna?
from spaghetti to less complex spaghetti
Lessons Learned

 techniques to navigate the model/behavior space
 use model and log together
 use model unfoldings
 break a rule and see what happens

                          unfold
   model
complexity
                   fold



                                     simplify,
                                   generalize
                                                 behavior
             log
                                                  PAGE 33
And next?

         process mining   process
            algorithm      model               readable
 event                              simplify    process
  log                                            model




 process views
  simplest model covering 80% of the log

 improve mining algorithms?
  we showed: there is room for improvement

                                               PAGE 34
Dirk Fahland
         about.me/dirk.fahland

          Simplifying
Mined Process Models
