S-CUBE LP: Analyzing Business Process Performance Using KPI Dependency Analysis

S-Cube Learning Package
Analyzing Business Process Performance Using KPI Dependency Analysis
University of Stuttgart (USTUTT), TU Wien (TUW)
Branimir Wetzstein, USTUTT
www.s-cube-network.eu
Learning Package Categorization
S-Cube → Adaptable Coordinated Service Compositions → Adaptable and QoS-aware Service Compositions → Analyzing Business Process Performance Using KPI Dependency Analysis
Learning Package Overview
– Problem Description
– KPI Dependency Analysis
– Discussion
– Conclusions
Let’s Consider a Scenario (1)
Assume we have implemented a business process as a service orchestration.
It is a reseller process which interacts with external services of the customer, suppliers, bank, and shipper, and with internal services such as the warehouse.
Let’s Consider a Scenario (2)
We are interested in measuring the performance of the business process (time, cost, quality, customer satisfaction).
This is done by defining Key Performance Indicators (KPIs), which specify target values on key metrics based on business goals.
– The KPI target value function maps metric value ranges to KPI classes (e.g., “good”, “medium”, “bad”).
Some typical KPI metrics in our scenario:
– Order Fulfillment Lead Time
– Perfect Order Fulfillment (in time and in full)
– Customer Complaint Rate
– Availability of the reseller service
Let’s Consider a Scenario (3)
In the first step, KPIs are monitored at process runtime for a set of process instances (what?).
If monitoring of KPIs shows unsatisfying results, we want to be able to analyze and explain the violations (why?).
That is not trivial, as a KPI often depends on many influential factors measured by lower-level metrics.
[Figure: the Purchase Order Process is measured by the Order Fulfillment Lead Time KPI, which depends on PPMs (availability in stock, customer, products, …) and QoS metrics (service availability, response time).]
Learning Package Overview
– Problem Description
– KPI Dependency Analysis
– Discussion
– Conclusions
Architectural Overview (1)
Architectural Overview (2)
Model and deploy the business process (e.g., in WS-BPEL).
Define and monitor a set of KPIs and potential influential metrics:
– Event-based monitoring based on CEP
– Supporting in particular both process events and QoS events and their correlation
Train a decision tree (KPI Dependency Tree) from monitored data:
– Gather monitored data from the metrics database
– Classify the monitored process instances according to their KPI class
– Use decision tree learning algorithms to learn the dependencies between the KPI and the lower-level metrics
Background: Event-Based Monitoring
To be usable for analysis, runtime data needs to be monitored.
Event-based monitoring is an often-used way to implement this.
Basic principle: register for and receive lifecycle events from the service composition, and use Complex Event Processing (CEP) to extract, correlate, and aggregate monitoring data from the raw event data.
Can be used to monitor both QoS and domain-specific data.
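The correlation principle can be sketched in a few lines: pair up start and completion events by process instance and activity to derive a duration metric. The event field names below are illustrative assumptions; a real deployment would express this as CEP queries in an engine such as ESPER.

```python
# Minimal sketch of event-based metric extraction (field names are
# hypothetical); a CEP engine such as ESPER would do this declaratively.
from collections import defaultdict

def activity_durations(events):
    """Correlate lifecycle events per (instance, activity) pair and
    compute the duration of each completed activity."""
    starts = {}
    metrics = defaultdict(dict)
    for ev in events:
        key = (ev["instance_id"], ev["activity"])
        if ev["type"] == "ActivityStarted":
            starts[key] = ev["timestamp"]
        elif ev["type"] == "ActivityCompleted" and key in starts:
            metrics[ev["instance_id"]][ev["activity"]] = ev["timestamp"] - starts.pop(key)
    return dict(metrics)

events = [
    {"instance_id": 1, "activity": "InvokeSupplier", "type": "ActivityStarted", "timestamp": 10.0},
    {"instance_id": 1, "activity": "InvokeSupplier", "type": "ActivityCompleted", "timestamp": 12.5},
]
durations = activity_durations(events)  # {1: {"InvokeSupplier": 2.5}}
```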
Background: Monitoring of Service Orchestrations
Our monitoring approach for service orchestrations supports:
– Process Performance Metrics (PPMs) based on process events (BPEL event model)
– QoS metrics based on QoS events provided by QoS monitors
– Correlation of process events and QoS events
– Metric calculation based on Complex Event Processing (ESPER)
[Figure: the process engine and a QoS monitor publish events to the CEP engine, which evaluates metric definitions and feeds the metrics database and the dashboard.]
KPI Dependency Analysis - Motivation
So far we are able to monitor metrics and find out which KPI targets are violated (what?).
In the next step, we want to explain the violations (why?).
That is not trivial, as a KPI often depends on many influential factors measured by lower-level metrics.
Typically, such an analysis is done manually (if at all) by a business analyst using OLAP queries on a data warehouse.
– That is very cumbersome and time-consuming.
– We want to “discover” the problems in an automated way; for this we can use data mining techniques.
In particular, we construct a classification problem and use existing classification learning techniques (decision trees).
Background: Machine Learning and Data Mining
Automated discovery of interesting patterns from large amounts of data (typically stored in data warehouses).
– Manual discovery (e.g., by using OLAP queries) could take days or weeks.
Functionalities include:
– Mining of association rules, correlation analysis
– Classification and prediction (our focus here)
– Clustering
– Time-series analysis
– Graph mining and text mining
Interdisciplinary field using techniques from machine learning, statistics, pattern recognition, and data visualization.
Background: Classification Learning (1)
Given: a (historical) dataset containing a set of instances described in terms of:
– a set of explanatory (a.k.a. predictive) categorical or numerical attributes
– a categorical target attribute (a.k.a. class)
Goal: based on the historical dataset (“supervised learning”), create a classification model which helps…
– explain the dependencies between the class and the explanatory attributes in historical data (interpretation)
– make predictions about future data, i.e., predict the class from future explanatory attribute values (prediction)
Some classification learning techniques:
– Decision trees
– Classification rules
– Support vector machines
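The training/test/prediction cycle can be illustrated with a deliberately trivial model, a “majority class” classifier (not one of the techniques listed above, just the simplest possible stand-in): it is trained on historical classes, evaluated on held-out data, and then used to predict.

```python
# Toy illustration of the supervised-learning phases using a trivial
# majority-class model; real analyses use decision trees instead.
from collections import Counter

def train_majority(labels):
    """Training phase: the 'model' is just the most frequent class."""
    return Counter(labels).most_common(1)[0][0]

def accuracy(model, test_labels):
    """Test phase: fraction of held-out instances the model gets right."""
    return sum(1 for label in test_labels if label == model) / len(test_labels)

training_classes = ["green", "green", "red", "green"]
model = train_majority(training_classes)  # model is "green"
test_classes = ["green", "red", "green"]
acc = accuracy(model, test_classes)       # 2 of 3 correct
```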
Background: Classification Learning (2)
[Figure: in the training phase, a classification algorithm builds a classification model from training data; in the test phase, the model is evaluated against test data; in the prediction phase, the model predicts the class of new data from its explanatory attribute values; in the interpretation phase, the model yields knowledge.]
Background: Decision Tree Learning
– A non-leaf node represents an explanatory categorical or numeric attribute.
– Outgoing edges represent conditions on the parent explanatory attribute values.
– A leaf node represents a target attribute class.
– A path shows which attribute values lead to a certain class; the leaf node shows the corresponding number of instances from the training set.
[Figure: an example tree with attribute nodes A1–A4, edge conditions such as “<2”, “2 < x < 4”, “>4”, “yes”/“no”, and class leaves C1–C4 annotated with training-set instance counts such as 80/2, 50, 20/1.]
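To make the structure concrete, here is a toy ID3-style learner for categorical attributes, a strong simplification of C4.5/J48 (which adds numeric splits, pruning, and more); the reseller-flavored attribute names and data are illustrative only.

```python
# Toy ID3-style decision tree learner (simplified stand-in for J48).
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """Pick the attribute with the highest information gain, split on
    its values, and recurse until the labels at a node are pure."""
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]  # leaf = majority class
    def gain(a):
        rem = 0.0
        for v in set(r[a] for r in rows):
            sub = [l for r, l in zip(rows, labels) if r[a] == v]
            rem += len(sub) / len(labels) * entropy(sub)
        return entropy(labels) - rem
    best = max(attrs, key=gain)
    rest = [a for a in attrs if a != best]
    node = {"attr": best, "branches": {}}
    for v in set(r[best] for r in rows):
        node["branches"][v] = build_tree(
            [r for r in rows if r[best] == v],
            [l for r, l in zip(rows, labels) if r[best] == v],
            rest)
    return node

def classify(node, row):
    """Follow edge conditions from the root down to a class leaf."""
    while isinstance(node, dict):
        node = node["branches"][row[node["attr"]]]
    return node

# Hypothetical reseller data: the KPI class depends on stock and supplier.
rows = [{"in_stock": "yes", "supplier": "A"}, {"in_stock": "yes", "supplier": "B"},
        {"in_stock": "no",  "supplier": "A"}, {"in_stock": "no",  "supplier": "B"}]
labels = ["green", "green", "green", "red"]
tree = build_tree(rows, labels, ["in_stock", "supplier"])
```

Classifying a new instance such as `{"in_stock": "no", "supplier": "B"}` walks the learned tree from the root to the `"red"` leaf.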
KPI Dependency Analysis
The KPI class of a process instance (alternatively: choreography instance, activity instance, business object, …) depends on a set of influential factors (PPMs and QoS metrics).
To find these dependencies, we use classification learning:
– The dataset consists of a set of (historical) process instances; for each process instance the KPI class and a set of metrics are evaluated.
– The KPI is the target attribute, which maps values of the underlying metric to categorical values (KPI classes).
– The potential influential lower-level metrics are the explanatory attributes (predictive variables).
– Goal: based on a set of monitored instances, create a classification model (decision tree) which identifies recurring relationships among the explanatory attributes that describe the instances belonging to the same KPI class.
The decision tree (KPI dependency tree) can be used to explain KPI classes of past process instances and also to predict the class of process instances for which only the values of some of the lower-level metrics are known.
Defining KPIs
The KPI definition includes:
– KPI metric definition (e.g., order fulfillment time)
– A set of categorical values defining the KPI classes, at least 2 (e.g., “green”, “yellow”, “red”)
– A target value function mapping KPI metric values to KPI classes (e.g., m < 2 days → green, 2 days < m < 4 days → yellow, otherwise red)
The KPI metric is specified for a monitored entity type:
– Process instance (e.g., duration of a reseller process instance)
– Activity instance (e.g., duration of the supplier service invocation)
– Choreography instance (e.g., duration)
– Service endpoint (e.g., availability)
– Set of process instances per day (e.g., average duration)
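A target value function is just a mapping from metric values to KPI classes. The sketch below uses the thresholds from this slide; the function name and the handling of the boundary values (exactly 2 or 4 days) are assumptions.

```python
# Sketch of a KPI target value function with the slide's thresholds;
# boundary handling at exactly 2 and 4 days is an assumption here.
def order_fulfillment_kpi_class(lead_time_days):
    """Map the 'order fulfillment time' KPI metric to a KPI class."""
    if lead_time_days < 2:
        return "green"
    if lead_time_days < 4:
        return "yellow"
    return "red"
```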
Generating Metric Definitions
A set of metric definitions (representing potential influential factors) can be generated automatically.
We support rules to automatically generate the following metrics:
– Service invocation: availability and response time of the invoked service (for both synchronous and asynchronous invocations (invoke-receive))
– WS-BPEL invoke activity (other basic activities are not interesting for long-running processes): execution time of the activity (i.e., the time between starting and finishing the activity); if the activity is part of a loop, in addition the average/minimum/maximum execution time per process instance and the number of executions per process instance
– For every branching activity and for fault, compensation, and event handlers, we generate a metric representing the branch that has been executed
Metrics based on process variable data elements are created manually.
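The generation rules above can be sketched as a simple traversal of the process model. The dict-based model schema below is hypothetical, not the prototype's actual format, and only a subset of the rules is shown.

```python
# Illustrative rule-based metric generation over a simplified process
# model (hypothetical schema): invoke activities yield QoS and
# execution-time metrics, branching activities an executed-branch metric.
def generate_metric_definitions(process_model):
    metrics = []
    for act in process_model["activities"]:
        if act["type"] == "invoke":
            metrics.append(act["name"] + "_availability")
            metrics.append(act["name"] + "_response_time")
            metrics.append(act["name"] + "_execution_time")
        elif act["type"] in ("if", "while", "pick"):
            metrics.append(act["name"] + "_executed_branch")
    return metrics

model = {"activities": [{"name": "InvokeSupplier", "type": "invoke"},
                        {"name": "CheckStock", "type": "if"}]}
defs = generate_metric_definitions(model)
```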
Data Preparation and Learning
Create a KPI analysis model:
– Select a KPI plus a set of potential influential factors.
Gather metric values of monitored entity instances and create a training set:
– Each monitored entity instance, with its KPI class and influential metric values, maps to a row in the training set.
A decision tree is learned (e.g., using the J48 algorithm).
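The row-per-instance mapping can be sketched as follows; the instance record layout and the KPI class function passed in are illustrative assumptions.

```python
# Sketch of training-set construction: each monitored instance, with its
# influential metric values and KPI class, becomes one row (hypothetical
# record layout; a real setup would read these from the metrics DB).
def build_training_set(instances, kpi_class_fn, metric_names):
    rows = []
    for inst in instances:
        row = [inst["metrics"].get(m) for m in metric_names]
        row.append(kpi_class_fn(inst["kpi_value"]))  # target attribute last
        rows.append(row)
    return rows

instances = [{"metrics": {"supplier_response_time": 3.0}, "kpi_value": 1.5},
             {"metrics": {"supplier_response_time": 9.0}, "kpi_value": 6.0}]
training = build_training_set(
    instances, lambda v: "green" if v < 5 else "red", ["supplier_response_time"])
# training == [[3.0, "green"], [9.0, "red"]]
```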
KPI Dependency Analysis
[Figure: at design time, a metric (e.g., order fulfillment lead time) is designated as KPI with a target value function (e.g., < 5 days → green, >= 5 days → red), together with a KPI analysis model (analyzed instances, e.g., a time window of the last 2 months with filter customerType=“gold”; a metric set such as M={orchestration.all, qos.all}; and an algorithm, e.g., classification tree J48). At runtime, monitored metric values from the choreography level (order fulfillment lead time, delivery time, shipment), the orchestration level (order in stock, delivery time, supplier, order amount, packaging time), and the service infrastructure level (availability, response time of the banking service, …) are fed into decision tree learning, which produces the KPI dependency tree.]
Prototype Implementation
The prototype is based on:
– Apache ODE (BPEL execution engine): publishes events to JMS topics; a standalone QoS monitor evaluates QoS metrics of services
– Monitoring tool: based on the ESPER CEP framework; metrics database in MySQL; BAM dashboard as a Java Swing application
– Process Analyzer: uses the WEKA machine learning toolkit
Learning Package Overview
– Problem Description
– KPI Dependency Analysis
– Discussion
– Conclusions
© Philipp Leitner
Experimental Results
Generated tree for KPI = order fulfillment time (J48 algorithm).
The tree contains the expected influential metrics in a satisfactory manner and produces suitable results “out of the box”.
In our setting, on a standard laptop computer, generating a decision tree from 1000 instances takes about 30 seconds.
Experimental Results: Drill-Down
Generated tree for KPI = “order in stock” (J48 algorithm).
Here, we perform a “drill-down” analysis by setting the metric “order in stock” as the KPI.
We want to understand which factors have an influence on whether the order can be processed from stock.
Experiment Results: Differences between Algorithms
We have experimented with J48 and ADTree and generated trees for different numbers of process instances (100, 400, 1000).
– The ADTree algorithm produces bigger trees than J48 (third column: number of leaves and nodes) for the same number of instances. However, it also reaches higher precision (last column: correctly classified instances).
– Both algorithms show very similar results concerning the displayed influential metrics; typically there are only one or at most two (marginal) metrics which differ.
Experiment Results: Tree Size
Trees get bigger with the number of process instances:
– For 400 instances J48 generated a tree with 11 nodes, for 1000 instances a tree with 18 nodes, while the precision improved by only 1%.
– When the tree gets bigger, it shows factors which have only marginal influence and thus becomes less readable (‘Displayed Metrics’ shows how many distinct metrics are displayed in the tree).
Experiment Results: Tree Size (2)
To improve readability:
– Varying algorithm parameters led to only marginal changes in our experiments (for example, J48 -U with no pruning). The only parameter that turned out useful for reducing the size of the tree was ‘reduced error pruning’ (J48 -R).
– Another option, in the case of too many undesirable (marginal) metrics, is to simply remove those metrics from the potential influential factor metric set and repeat the analysis.
Some Important Earlier Work
We were not the first to have similar ideas. Important earlier work includes:
– Castellanos, M., et al.: iBOM: A Platform for Intelligent Business Operation Management. In: Proceedings of the 21st International Conference on Data Engineering (ICDE 2005). IEEE Computer Society, Washington, DC, 1084–1095.
– Castellanos, M., Casati, F., Dayal, U., Shan, M.-C.: A Comprehensive and Automated Approach to Intelligent Business Processes Execution Analysis. Distributed and Parallel Databases, vol. 16, no. 3, pp. 239–273, 2004.
Main Advances Over Earlier Work
The S-Cube approach to KPI analysis based on event logs improves on earlier work in some important aspects:
– KPI dependency analysis incorporates both process-level metrics and QoS metrics.
– Semi-automated generation of potential influential metric definitions for WS-BPEL processes.
– Many different algorithms can be used for analysis, courtesy of the WEKA backend.
Discussion - Advantages
The KPI dependency analysis based on decision trees has a number of clear advantages …
– Simplicity: the basic approach is relatively easy to understand, and the generated trees can be understood by non-IT users as well.
– Efficiency: the analysis of influential factors is automated; the traditional approach of manually posing analysis questions via OLAP queries over data marts is much more time-consuming.
– Proven in the real world: machine learning is by now a proven technique that has been successfully applied in many areas.
Discussion - Disadvantages
… but of course the approach also has some disadvantages:
– Bootstrapping problem: the approach assumes that recorded historical event logs are available for training.
– Necessary domain knowledge: defining the potential influential metric set requires some domain knowledge.
– Availability of monitoring data: a basic assumption of the approach is that all necessary data can be monitored; if this is not the case, the approach cannot be used.
Learning Package Overview
– Problem Description
– KPI Dependency Analysis
– Discussion
– Conclusions
Summary
Classification-learning-based techniques can be used to explain performance problems in service compositions.
Steps:
1. Define a KPI and a set of potential influential metrics.
2. Monitor all metrics for a set of process instances.
3. Train a decision tree from the historical event log.
The resulting KPI dependency tree explains the dependencies between the KPI classes and a set of lower-level process metrics and QoS metrics.
Further S-Cube Reading
– Wetzstein, Leitner, Rosenberg, Brandic, Dustdar, Leymann: Monitoring and Analyzing Influential Factors of Business Process Performance. In: Proceedings of the 13th IEEE International Conference on Enterprise Distributed Object Computing (EDOC 2009). IEEE Press, Piscataway, NJ, USA, 118–127.
– Wetzstein, Leitner, Rosenberg, Dustdar, Leymann: Identifying Influential Factors of Business Process Performance Using Dependency Analysis. In: Enterprise Information Systems, vol. 5(1), Taylor & Francis, 2010.
– Kazhamiakin, Wetzstein, Karastoyanova, Pistore, Leymann: Adaptation of Service-Based Applications Based on Process Quality Factor Analysis. In: Proceedings of the 2nd Workshop on Monitoring, Adaptation and Beyond (MONA+), co-located with ICSOC/ServiceWave 2009.
– Leitner, Wetzstein, Rosenberg, Michlmayr, Dustdar, Leymann: Runtime Prediction of Service Level Agreement Violations for Composite Services. In: Proceedings of the 2009 International Conference on Service-Oriented Computing (ICSOC/ServiceWave 2009). Springer-Verlag, Berlin, Heidelberg, 176–186.
– Leitner, Michlmayr, Rosenberg, Dustdar: Monitoring, Prediction and Prevention of SLA Violations in Composite Services. In: Proceedings of the 2010 IEEE International Conference on Web Services (ICWS 2010). IEEE Computer Society, Washington, DC, USA, 369–376.
Acknowledgements
The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] under grant agreement 215483 (S-Cube).