The Challenges of Using Process
Mining in Internal Audit
Xhentilo Karaj
Senior Data Analyst
Introduction
Euroclear group is the financial industry’s trusted provider of post-trade services.
Internal Audit
Business Understanding → Fieldwork → Clearance → Report
Internal Audit provides independent reasonable assurance and insight on
governance, risk management and internal controls, to add value to and support
the organization in achieving its objectives.
Business Understanding Checklist
► Process is supported by an information system
► Existence of event log table(s)
► Human-performed actions
► Available process documentation
Process Mining Journey
Business Understanding → Fieldwork → Clearance → Report
Fieldwork (process mining): Data collection / extraction → Data preprocessing → Process Mining analysis
Data Collection Challenge 1
• Auditors should provide independent assurance/review.
► Audit data analysts need direct access to the source system.
► Learning curve is expected.
• Tip: Plan additional budget when the team has to be onboarded to a new IT system.
Data Collection Challenge 2
• Data volume:
► Business processes can have tens or hundreds of millions of records in scope.
• Tip: Extract data in chunks.
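A minimal sketch of chunked extraction, assuming the source system is reachable through a Python DB-API connection. The table `status_log`, its columns, and the chunk size are hypothetical; an in-memory SQLite database stands in for the real source so the example runs on its own.

```python
import sqlite3
from typing import Iterator


def extract_in_chunks(conn, query: str, chunk_size: int = 50_000) -> Iterator[list]:
    """Yield query results in lists of at most chunk_size rows."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


# Stand-in for the real source system (hypothetical table and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status_log (case_id INTEGER, log_id INTEGER, new_value TEXT)")
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?)",
    [(i // 2, i, "In Progress") for i in range(10)],
)

# ORDER BY keeps the chunk contents deterministic between runs.
query = "SELECT case_id, log_id, new_value FROM status_log ORDER BY log_id"
chunks = list(extract_in_chunks(conn, query, chunk_size=4))
chunk_sizes = [len(chunk) for chunk in chunks]
print(chunk_sizes)
```

In practice each chunk would be written to disk (for example one file per chunk) instead of being collected in a list, so that memory use stays bounded.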
Data Collection Challenge 3
• Data completeness:
► Typical example: incomplete activity timestamp (lacking hours/minutes).
– Problem: sequence of activities.
► Usually occurs with SaaS / cloud-based applications.
• Workaround:
► Order the data by an id column (if available), and the process mining
tool will recognize the order.
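The workaround can be sketched as below. It assumes the id column (here `log_id`) grows in insertion order, which has to be confirmed per system. The rows are illustrative only: they reuse the case from this deck but with date-only timestamps, and the field names are hypothetical.

```python
# Events whose timestamps carry only a date (no hours/minutes).
events = [
    {"case_id": 245, "log_id": 43994, "activity": "Closed", "date": "2021-03-07"},
    {"case_id": 245, "log_id": 43398, "activity": "Assigned", "date": "2021-03-07"},
    {"case_id": 245, "log_id": 43444, "activity": "In Progress", "date": "2021-03-07"},
    {"case_id": 245, "log_id": 43001, "activity": "New", "date": "2021-03-06"},
]

# The date gives the coarse order; the id column breaks ties within a day.
ordered = sorted(events, key=lambda e: (e["case_id"], e["date"], e["log_id"]))
activities = [e["activity"] for e in ordered]
print(activities)
```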
Data extraction methods
• Relational databases, data lakes: SQL queries
• Web-based apps, cloud-based apps: REST APIs
• Application UI: manual data export
The Example Process – Client Service
Email from client → New Case → Assigned → In Progress → Email to client → Closed
Data Preprocessing
Status Changes Log
Case ID | Case Created Time | Log ID | Field | Old Value | New Value | Timestamp
245 | 2021-03-06 17:21:09 | 43444 | Status | New | In Progress | 2021-03-07 09:43:56
245 | 2021-03-06 17:21:09 | 43994 | Status | In Progress | Closed | 2021-03-08 14:12:05
Problem: The initial status value is not part of the log.
Solution: Add a new log entry for the initial status (timestamp is the creation time).
Case ID | Log ID | Activity | Timestamp
245 | Auto-gen | New | 2021-03-06 17:21:09
245 | 43444 | In Progress | 2021-03-07 09:43:56
245 | 43994 | Closed | 2021-03-08 14:12:05
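One way to generate the missing initial entry, sketched in Python. The dictionary keys (`created`, `old`, `new`, `ts`) are hypothetical names for the columns of the status changes log; the initial status is taken from the old value of each case's earliest change.

```python
status_log = [
    {"case_id": 245, "created": "2021-03-06 17:21:09", "log_id": 43444,
     "old": "New", "new": "In Progress", "ts": "2021-03-07 09:43:56"},
    {"case_id": 245, "created": "2021-03-06 17:21:09", "log_id": 43994,
     "old": "In Progress", "new": "Closed", "ts": "2021-03-08 14:12:05"},
]


def build_status_event_log(rows):
    """Turn status changes into events, adding one synthetic initial event per case."""
    events = []
    seen_cases = set()
    for row in sorted(rows, key=lambda r: (r["case_id"], r["ts"])):
        if row["case_id"] not in seen_cases:
            seen_cases.add(row["case_id"])
            # Synthetic entry: initial status, stamped with the case creation time.
            events.append({"case_id": row["case_id"], "log_id": "Auto-gen",
                           "activity": row["old"], "timestamp": row["created"]})
        events.append({"case_id": row["case_id"], "log_id": row["log_id"],
                       "activity": row["new"], "timestamp": row["ts"]})
    return events


status_events = build_status_event_log(status_log)
for event in status_events:
    print(event)
```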
Problem: Apart from the status changes event log, changes of the Assignee also contain activity information. How to proceed?
Assignee changes:
Case ID | Log ID | Field | Old Value | New Value | Timestamp
245 | 43398 | Assigned To | | User A | 2021-03-07 09:05:00
245 | 43441 | Assigned To | User A | User B | 2021-03-07 09:38:09
Event log with status and assignee changes combined:
Case ID | Log ID | Activity | Old Value | New Value | Timestamp
245 | Auto-gen | New | | | 2021-03-06 17:21:09
245 | 43398 | Assigned | | User A | 2021-03-07 09:05:00
245 | 43441 | Assigned | User A | User B | 2021-03-07 09:38:09
245 | 43444 | In Progress | | | 2021-03-07 09:43:56
245 | 43994 | Closed | | | 2021-03-08 14:12:05
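A possible way to fold assignee changes into the event log, as a sketch. Every assignee change becomes an `Assigned` activity and the two sources are then sorted together; the key names are hypothetical, and the timestamps are plain `YYYY-MM-DD HH:MM:SS` strings, which sort chronologically as text.

```python
status_events = [
    {"case_id": 245, "log_id": "Auto-gen", "activity": "New", "timestamp": "2021-03-06 17:21:09"},
    {"case_id": 245, "log_id": 43444, "activity": "In Progress", "timestamp": "2021-03-07 09:43:56"},
    {"case_id": 245, "log_id": 43994, "activity": "Closed", "timestamp": "2021-03-08 14:12:05"},
]
assignee_log = [
    {"case_id": 245, "log_id": 43398, "old": "", "new": "User A", "ts": "2021-03-07 09:05:00"},
    {"case_id": 245, "log_id": 43441, "old": "User A", "new": "User B", "ts": "2021-03-07 09:38:09"},
]

# Each assignee change becomes an "Assigned" activity; old/new values are kept
# as attributes so reassignments can still be told apart later.
assigned_events = [
    {"case_id": r["case_id"], "log_id": r["log_id"], "activity": "Assigned",
     "old_value": r["old"], "new_value": r["new"], "timestamp": r["ts"]}
    for r in assignee_log
]

combined_log = sorted(status_events + assigned_events,
                      key=lambda e: (e["case_id"], e["timestamp"]))
combined_activities = [e["activity"] for e in combined_log]
print(combined_activities)
```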
Sender | Receiver | CC | Email Subject | Email Body | Case ID | Timestamp
client@gmail.com | generic@euroclear.com | | Broken functionality | Dears, we contact concerning an issue… | 245 | 2021-03-06 17:21:08
person1@euroclear.com | person2@euroclear.com | person3@euroclear.com | Ticket 245 | Dear, can you please provide support on… | 245 | 2021-03-07 16:12:43
person2@euroclear.com | client@gmail.com | person3@euroclear.com, generic@euroclear.com | Broken functionality | Dear, please note that issue is resolved… | 245 | 2021-03-08 14:11:01
Data transformation (leverage regular expressions):
Email info | Case ID | Timestamp
Email from client | 245 | 2021-03-06 17:21:08
Internal email | 245 | 2021-03-07 16:12:43
Email to client | 245 | 2021-03-08 14:11:01
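The slide only says that regular expressions are used; the rule below is an assumed, simplified version of such a transformation. It classifies each email by whether sender and receiver belong to the internal domain, and treats every external sender as a client.

```python
import re

# Assumption: internal mailboxes are recognised by their domain.
INTERNAL = re.compile(r"@euroclear\.com$", re.IGNORECASE)


def classify_email(sender: str, receiver: str) -> str:
    """Map an email to an activity name based on sender and receiver domains."""
    sender_internal = bool(INTERNAL.search(sender.strip()))
    receiver_internal = bool(INTERNAL.search(receiver.strip()))
    if not sender_internal:
        return "Email from client"
    if receiver_internal:
        return "Internal email"
    return "Email to client"


emails = [
    ("client@gmail.com", "generic@euroclear.com", "2021-03-06 17:21:08"),
    ("person1@euroclear.com", "person2@euroclear.com", "2021-03-07 16:12:43"),
    ("person2@euroclear.com", "client@gmail.com", "2021-03-08 14:11:01"),
]
email_events = [
    {"case_id": 245, "activity": classify_email(sender, receiver), "timestamp": ts}
    for sender, receiver, ts in emails
]
email_activities = [e["activity"] for e in email_events]
print(email_activities)
```

A regular expression could extract the case ID in a similar way when it appears only in the subject line (for example "Ticket 245"), though that depends on how consistently subjects are written.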
Emails dataset and standard event log, merged by timestamp:
Case ID | Log ID | Activity | Old Value | New Value | Timestamp
245 | Auto-gen | Email from client | | | 2021-03-06 17:21:08
245 | Auto-gen | New | | | 2021-03-06 17:21:09
245 | 43398 | Assigned | | User A | 2021-03-07 09:05:00
245 | 43441 | Assigned | User A | User B | 2021-03-07 09:38:09
245 | 43444 | In Progress | | | 2021-03-07 09:43:56
245 | Auto-gen | Internal email | | | 2021-03-07 16:12:43
245 | Auto-gen | Email to client | | | 2021-03-08 14:11:01
245 | 43994 | Closed | | | 2021-03-08 14:12:05
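If both inputs are already sorted by timestamp, the merge can be done in a single pass. The sketch below uses `heapq.merge` from the Python standard library on the single example case; the key names are hypothetical, and a multi-case log would need the case ID in the sort key as well.

```python
import heapq

standard_log = [
    {"log_id": "Auto-gen", "activity": "New", "timestamp": "2021-03-06 17:21:09"},
    {"log_id": 43398, "activity": "Assigned", "timestamp": "2021-03-07 09:05:00"},
    {"log_id": 43441, "activity": "Assigned", "timestamp": "2021-03-07 09:38:09"},
    {"log_id": 43444, "activity": "In Progress", "timestamp": "2021-03-07 09:43:56"},
    {"log_id": 43994, "activity": "Closed", "timestamp": "2021-03-08 14:12:05"},
]
email_events = [
    {"log_id": "Auto-gen", "activity": "Email from client", "timestamp": "2021-03-06 17:21:08"},
    {"log_id": "Auto-gen", "activity": "Internal email", "timestamp": "2021-03-07 16:12:43"},
    {"log_id": "Auto-gen", "activity": "Email to client", "timestamp": "2021-03-08 14:11:01"},
]

# heapq.merge expects each input to be sorted by the same key already.
merged_log = list(heapq.merge(standard_log, email_events, key=lambda e: e["timestamp"]))
merged_activities = [e["activity"] for e in merged_log]
print(merged_activities)
```

Note how the client's email lands one second before the `New` event, which is what makes it the true start of the case.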
Other data preprocessing operations
• Event log with incomplete timestamp value.
• Event log needs to be built from multiple data objects.
• …
Data preprocessing conclusion: There is no free lunch!
Process Mining Analysis – Expected Process
Email from client → New Case → Assigned → In Progress → Email to client → Closed
1. Tickets not moved to In Progress.
2. Re-opened tickets.
3. Multiple changes of assignees.
4. Multiple subsequent emails from the client (without response in between).
5. Cases that do not contain an email to client before closing the ticket.
Disclaimer: The process maps for these findings were generated with simulated data.
Issue Type | Issue #
Conformance | 1, 5
Reporting | 1
Inefficiency | 2, 3, 4
Timeliness | 4
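Process mining tools surface these deviations visually. As a rough illustration of the underlying checks, the five findings can also be expressed as rules over a case's ordered list of activities. The rule definitions below are assumptions made for this sketch (for example, "re-opened" is read as any status or assignment activity after the first `Closed`), not the logic of any particular tool.

```python
def find_issues(trace):
    """Return the numbers (1-5) of the findings that apply to one case's trace."""
    issues = set()
    if "In Progress" not in trace:
        issues.add(1)  # ticket never moved to In Progress
    if "Closed" in trace:
        first_close = trace.index("Closed")
        after_close = trace[first_close + 1:]
        if any(a in ("New", "Assigned", "In Progress") for a in after_close):
            issues.add(2)  # work continued after closing: re-opened ticket
        if "Email to client" not in trace[:first_close]:
            issues.add(5)  # closed without an email to the client
    if trace.count("Assigned") > 1:
        issues.add(3)  # multiple changes of assignee
    awaiting_reply = False
    for activity in trace:
        if activity == "Email from client":
            if awaiting_reply:
                issues.add(4)  # second client email without a response in between
            awaiting_reply = True
        elif activity == "Email to client":
            awaiting_reply = False
    return issues


expected = ["Email from client", "New", "Assigned", "In Progress", "Email to client", "Closed"]
case_245 = ["Email from client", "New", "Assigned", "Assigned", "In Progress",
            "Internal email", "Email to client", "Closed"]
print(find_issues(expected), find_issues(case_245))
```

Because the rules run over every case, the count of cases per issue is available directly, which feeds the volume-based prioritisation discussed during clearance.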
Classical Audit vs Process Mining
Classical Audit Approach | Process Mining
Sampling, interviewing, documentation analysis | Full population testing
Limited assurance (due to statistical testing) | “Almost” full assurance
Might not identify all issues | Can spot issues that wouldn’t be identified through sampling
Slow & manual | Quick & automated
Clearance
Auditors and business representatives (team leaders, middle mgmt.):
► Discuss identified issues.
► Define the priority of each issue (based on volume of cases, criticality, auditor’s judgement).
► Agree on the actions and action owners.
Issue | Issue Type | Priority | Action
Tickets not moved to In Progress | Conformance, reporting | High | New system control
Multiple changes of assignees | Inefficiency | Medium | Staff training, improve definition of responsibilities and skills
Reopened tickets | Inefficiency | Medium | Staff training
Multiple subsequent emails from the client | Inefficiency, timeliness | Not an issue | None
Cases that do not contain an email to client before closing the ticket | Conformance | High | New system control
Process mining impact:
► In principle, clearance should be smoother (due to fact-based observations).
► Our experience at Euroclear bears this out.
► Ideal end result (in certain cases): the business develops process monitoring solutions.
Reporting
Project Finalization
Audit Report
► Attach the process mining results to the final audit report (preferably in the issues section).
► This raises management’s awareness.
Project Archiving
► Archive the solution (code, input files, process mining project files, result files, docs) for future use.
The Future
• We are happy about using process mining.
• The dream is to cover all processes of the company that are supported by an IT application.
• Process mining will be crucial to our continuous auditing initiative.
Key Takeaways
• Enrich your analysis with datasets other than the status event log.
• Expect to spend more time in data preprocessing.
• The auditor’s judgement and domain knowledge are key to identifying the real risk.
• Process mining does not replace the auditor’s job.
Thank you!