Context Analysis of Business Processes Based on Event Logs
1. Master Thesis Defense
Context Analysis of Business Processes Based on Event Logs
Airlangga Adi Hermawan (0729134)
Supervisor: Wil v.d. Aalst
Daily tutor: A. Adriansyah
Assessment committee: M. de Leoni, B. Skoric
09-04-13 Eindhoven
2. Introduction
[Figure: a process, its event log (example traces: aabb, abc, aaabb), and process mining]
09-04-13 PAGE 2
3. Example of process: construction permit
[Figure: events of cases 1, 2, and 3 plotted by case id against timestamp (e.g. week). Activities shown: Registration, Eligibility check, Material check, Environmental check, Archive. Legend: Resource A, Resource B, Resource C.]
Not flexible, limited insights
4. Context - example
• The type of building might influence the path the instance follows in the process.
• Instances might compete for the same resources.
• The way people work together might affect the process.
• The weather might influence the way organizations handle cases.
How to get insights into context-related information based on event logs?
6. Fixing timestamps
An event may have (1) a missing timestamp or (2) an erroneous timestamp; both need to be fixed.
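• Example of missing timestamp:
[Figure: case 1 with three events on a timestamp axis: Registration (Timestamp: 1), Eligibility check, and Archive. Of the two later events, one has Timestamp: 8 and the other has a missing timestamp (┴). Legend: Resource A, Resource B, Resource C.]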
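The slide names the problem but not the repair method. As a hedged illustration only, the sketch below flags events whose timestamp is missing and reports the interval in which each one must lie, assuming (this is an assumption, not something the slides state) that the events of a trace are stored in the order they occurred. The function name `missing_timestamp_bounds`, the dictionary keys, and the sample trace are hypothetical.

```python
def missing_timestamp_bounds(trace):
    """Return (index, lower, upper) for every event with no timestamp.

    Assumption: events in `trace` are stored in the order they occurred,
    so a missing timestamp lies between its nearest known neighbours.
    A bound is None when no known timestamp exists on that side.
    """
    results = []
    for i, event in enumerate(trace):
        if event["timestamp"] is not None:
            continue
        earlier = [e["timestamp"] for e in trace[:i] if e["timestamp"] is not None]
        later = [e["timestamp"] for e in trace[i + 1:] if e["timestamp"] is not None]
        lower = earlier[-1] if earlier else None  # nearest known timestamp before
        upper = later[0] if later else None       # nearest known timestamp after
        results.append((i, lower, upper))
    return results


# Hypothetical three-event trace; None marks a missing timestamp.
trace = [
    {"activity": "a", "timestamp": 1},
    {"activity": "b", "timestamp": None},
    {"activity": "c", "timestamp": 8},
]
bounds = missing_timestamp_bounds(trace)
```

Here the second event is flagged, and its timestamp is bounded by the known values on either side. How a single value is then chosen inside that interval is left open.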
7. Fixing timestamps
• Example of erroneous timestamp:
[Figure: two versions of case 3 on a timestamp axis. In both, Registration has Timestamp: 2, Eligibility check has Timestamp: 7, and Archive has Timestamp: 8. In the first version Environmental check has Timestamp: 1; in the second it has Timestamp: 10. Legend: Resource A, Resource B, Resource C.]
9. Deriving context
Enrich all events in event logs with new attributes in two ways:
• Deriving internal context
Enriched attributes are obtained from within the event log itself.
• Deriving external context
Enriched attributes are obtained from additional external sources.
10. Deriving internal context
[Figure: the derivation pipeline on a log of events e1 to e9: (1) Filter, (2) Partition, (3) Selection, (4) Post-process. The post-process value is returned to the focus event. Legend: log; event in a log; part of log which is filtered out; red, yellow, and pink partitions; focus event; selected event.]
Example: unique originator in a trace
Case 1: Registration (Resource: A), Eligibility check (Resource: B), Archive (Resource: C)
Case 2: Registration (Resource: A), Material checking (Resource: A), Archive (Resource: C)
1. Filtering: none
2. Partition: case id
3. Selection: running case
4. Post process: unique originator
Value shown for Material checking (Resource: A): Unique originator: 2
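The four steps can be sketched as a small pipeline. This is a minimal sketch and not the thesis implementation: the function `derive_internal_context`, its parameters, and the dictionary keys are hypothetical names, and "running case" is read here as "all events in the same case as the focus event", which is an assumption.

```python
from collections import defaultdict


def derive_internal_context(log, keep, partition_key, select, post_process, attribute):
    """Enrich every event in `log` with one derived attribute.

    All callables are supplied by the caller (hypothetical interface):
      keep(event)               -> bool, step 1 (filter)
      partition_key(event)      -> hashable key, step 2 (partition)
      select(focus, partition)  -> list of events, step 3 (selection)
      post_process(events)      -> value, step 4 (post-process)
    """
    kept = [event for event in log if keep(event)]            # 1. filter
    partitions = defaultdict(list)
    for event in kept:                                        # 2. partition
        partitions[partition_key(event)].append(event)
    enriched = []
    for focus in log:
        selected = select(focus, partitions[partition_key(focus)])  # 3. selection
        value = post_process(selected)                        # 4. post-process
        enriched.append({**focus, attribute: value})          # return value to focus event
    return enriched


# The two cases from the slide.
log = [
    {"case_id": 1, "activity": "Registration", "resource": "A"},
    {"case_id": 1, "activity": "Eligibility check", "resource": "B"},
    {"case_id": 1, "activity": "Archive", "resource": "C"},
    {"case_id": 2, "activity": "Registration", "resource": "A"},
    {"case_id": 2, "activity": "Material checking", "resource": "A"},
    {"case_id": 2, "activity": "Archive", "resource": "C"},
]

enriched = derive_internal_context(
    log,
    keep=lambda event: True,                                  # filtering: none
    partition_key=lambda event: event["case_id"],             # partition: case id
    select=lambda focus, partition: partition,                # selection: whole case (assumed)
    post_process=lambda events: len({e["resource"] for e in events}),
    attribute="unique_originator",
)
```

Under these assumptions the Material checking event in case 2 receives `unique_originator` 2 (resources A and C), which agrees with the value on the slide.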
11. Deriving external context
• Two ways of deriving external context:
• Range
Range (timestamp)   Value (season)
0-2                 Winter
3-5                 Spring
6-8                 Summer
Case 1: Registration (Timestamp: 1, Season: Winter), Eligibility check (Timestamp: 4, Season: Spring), Archive (Timestamp: 7, Season: Summer)
• Value mapping
Key (resource)   Value (blood type)
A                O
C                AB
Case 1: event (Resource: A, Timestamp: 1, Blood type: O), event (Resource: C, Timestamp: 4, Blood type: AB), event (Resource: C, Timestamp: 7, Blood type: AB)
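Both mappings can be sketched in a few lines. This is an illustrative sketch using the tables from the slide. The names `SEASON_RANGES`, `BLOOD_TYPE_BY_RESOURCE`, `map_range`, and `enrich_external` are hypothetical, ranges are treated as inclusive on both ends (an assumption), and combining the two slide examples into one set of events is done here only for brevity.

```python
# Range mapping: timestamp range -> season (bounds assumed inclusive).
SEASON_RANGES = [(0, 2, "Winter"), (3, 5, "Spring"), (6, 8, "Summer")]

# Value mapping: resource -> blood type.
BLOOD_TYPE_BY_RESOURCE = {"A": "O", "C": "AB"}


def map_range(value, ranges):
    """Return the label of the first range containing `value`, else None."""
    for low, high, label in ranges:
        if low <= value <= high:
            return label
    return None


def enrich_external(event):
    """Return a copy of `event` with the two external attributes added."""
    enriched = dict(event)
    enriched["season"] = map_range(event["timestamp"], SEASON_RANGES)
    enriched["blood_type"] = BLOOD_TYPE_BY_RESOURCE.get(event["resource"])
    return enriched


events = [
    {"case_id": 1, "resource": "A", "timestamp": 1},
    {"case_id": 1, "resource": "C", "timestamp": 4},
    {"case_id": 1, "resource": "C", "timestamp": 7},
]
enriched = [enrich_external(e) for e in events]
```

A timestamp outside every range, or a resource missing from the mapping, yields `None` in this sketch. How such cases should be handled is not shown on the slide.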
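13. Context analyzer – mapping events to visualization
Case 1: (Activity: A, Resource: 1), (Activity: A, Resource: 1), (Activity: B, Resource: 2)
Case 2: (Activity: D, Resource: 2), (Activity: A, Resource: 1), (Activity: C, Resource: 1)
Case 3: (Activity: A, Resource: 3), (Activity: B, Resource: 2)
[Figure: the events above, drawn by case id against timestamp, are mapped onto a grid with NumResource (numerical; ticks 1, 1,5, 2, 2,5, 3) on the x-axis and Activity (categorical; A, B, C, D) on the y-axis]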
X-axis: numResource
Y-axis: activity
Coloring: activity
14. Context analyzer – the inner rectangle
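[Figure: grid with NumResource (numerical; ticks 1, 1,5, 2, 2,5, 3) on the x-axis and Activity (categorical; A, B, C, D) on the y-axis. Example cells are annotated "Number of events = 3", "Number of events = 2", "Number of events = 1", and "No event".]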
The size of the inner rectangle is linear in the number of events inside the cell.
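The cell counts behind the inner rectangles can be sketched as follows. This is a minimal sketch under stated assumptions: the per-event Resource values from slide 13 are read as the numResource attribute, "size" is taken to mean side length scaled against the fullest cell, and the names `cell_counts` and `inner_rectangle_side` are hypothetical.

```python
from collections import Counter

# Events from slide 13 as (case id, activity, numResource).
events = [
    (1, "A", 1), (1, "A", 1), (1, "B", 2),
    (2, "D", 2), (2, "A", 1), (2, "C", 1),
    (3, "A", 3), (3, "B", 2),
]


def cell_counts(events):
    """Count events per (x, y) cell, with x = numResource and y = activity."""
    return Counter((num_resource, activity) for _, activity, num_resource in events)


def inner_rectangle_side(count, max_count, cell_side=1.0):
    """Side length of the inner rectangle, linear in the event count.

    Assumption: the fullest cell fills the whole cell; a cell without
    events gets side 0.
    """
    return cell_side * count / max_count


counts = cell_counts(events)
largest = max(counts.values())
sides = {cell: inner_rectangle_side(n, largest) for cell, n in counts.items()}
```

With these events the cell (numResource 1, activity A) holds three events and the cell (numResource 2, activity B) holds two, matching the annotations in the figure.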
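15. Context analyzer – mapping events to visualization
Case 1: (Activity: A, NumResource: 1), (Activity: A, NumResource: 1), (Activity: B, NumResource: 2)
Case 2: (Activity: C, NumResource: 1), (Activity: C, NumResource: 1), (Activity: D, NumResource: 1)
Case 3: (Activity: A, NumResource: 1), (Activity: B, NumResource: 2)
[Figure: the events above, drawn by case id against timestamp, are mapped onto a grid with NumResource (ticks 1, 2, 3) on the x-axis and Case id (categorical; 1, 2, 3) on the y-axis]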
X-axis: numResource
Y-axis: case id
Coloring: activity
17. Performance based on number of events
X-axis: timestamp
Y-axis: case id
Number of columns: 7
Number of rows: 13087
Increasing the number of events has minimal effect on processing time.
18. Performance based on number of rows and columns
X-axis: timestamp
Y-axis: timestamp
19. Comparison between the dotted chart and the context analyzer
Dotted chart
X-axis: timestamp
Y-axis: case id
Coloring: originator
Context analyzer
X-axis: timestamp
Y-axis: case id
Coloring: number of unique originators
20. Conclusion
• Meaningful context-related information can be derived using internal and additional external data.
• The context analyzer provides more flexibility than the dotted chart.
• The context analyzer has better performance.
Editor's Notes
Date and
Compare between permit applications. Building a garage: waiting too long. The municipality (gemeente) is busy. From the applicant's point of view, the business process is bad. The municipality looks at the whole picture. The problem addressed in this thesis is how to do the analysis efficiently. Define events more clearly. Add a description for each event. Environmental impact analysis on the 20th floor. Replace "instance" with "case id". So the third instance repeats several times. Registration bullet: e.g. resource A, instance with resource A, eligibility check with resource A. Legend for the bullets. The second instance does not affect the other instances. Do not show too many activities; reduce the dots. The first instance takes a long time, and the eligibility check takes very long there, but if we look across instances we can draw a different analysis: one resource appears in many instances. We can say that instance analysis looks at only a single instance. This is one example of an analysis technique for analyzing performance; from this example we do not get the information we need. This is the technique used by the dotted chart to analyze performance. Each event represents an actual case; explain the axes. Resource A has too much workload. With the current visualization we can draw an analysis, but it is relatively difficult because we have to inspect it manually. Add an effect. Goal: obtain context analysis based on event logs. How to derive context …. ? The arrow for Archive goes directly. 3
Use an overview slide. How to systematically gain insight into context. The timestamp is one of the most important attributes. In reality event logs are not perfect: there are erroneous and missing timestamps. Propose a method to fix timestamps. Propose a general framework to derive context. Visualization after enrichment, to gain insight into the process. Use a black box to get all three. Replace the context analyzer with the actual picture so the audience understands. Use the framework figure, the general framework figure in chapter 3 of the thesis (3.13). How to explain the overview: tell it as if there is an overview. To derive context systematically there is very important data; the timestamp has problems (use an effect); we then get an event log; I propose something that enriches it with context. Visualization with the output. This also serves as the overview of my presentation. Write the results there as well, and use a cylinder too.
Use one trace from figure 3.1, with only 3 events. Next slide: erroneous timestamp, with a simple example. Show the example events as bullets only. First problem: there is a trace with 3 events, with attribute x, but no timestamp at all. Focus on one problem per slide. One slide with the missing-timestamp example. Remove the figure. Cover fixing timestamps in 2 slides. The erroneous timestamp can use a single example.
Deriving context has two kinds, internal and external, with descriptions. Attributes are added to every event in the log. With the visualization example from the event log on the second page.
Explain that there are two kinds: duration and value mapping. Give example rules. With the range data type. Attribute mapping, CSV file. Example on one slide for both cases.
Categorical, or change it to numerical. The attribute. Arrow 3 comes in without removing the red color.
Add a note: inner rectangle size is directly proportional to the number of events. Number of events; the black ones have no events. Add with categorical with cross.
Purpose. What is the input? What are the parameters? The number of columns is 7. Scatter chart. Regression curve. Comparison with dotted chart. Visualization time does not depend on. This is also reported.
Confidence interval over 30 experiment runs. Program off….
We managed to derive meaningful context-related information using internal and additional external data. The context analyzer provides an easy way to visualize context-related information such that useful insights can be easily obtained. With the internal and external context approach. A flexible visualization can give more information than the dotted chart. Performance is better on aggregated data, which is already applied in the event log.
State the limitations. If it is not valid, how to bring out social context with the existing settings. Implementation limitations can be improved for large data. Use a sparse matrix data structure, list of lists. Departments and throughput time.