S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

    Presentation Transcript

    • Exploiting Knowledge on Past Process Execution to Improve SBA Analysis: Mining Lifecycle Event Logs for Enhancing SBAs
        – ISTI-CNR (CNR), TU Wien (TUW)
        – Franco Maria Nardini, Gabriele Tolomei, CNR
    • Learning Package Categorization
        – S-Cube
            – Monitoring and Analysis of SBA
                – Process Mining
                    – Exploiting Knowledge on Past Process Execution to Improve SBA Analysis
    • Connections to the S-Cube IRF
        – Conceptual Research Framework:
            – Service Composition and Coordination
            – Service Infrastructure
            – Adaptation and Monitoring
        – Logical Run-Time Architecture:
            – Monitoring Engine
            – Adaptation Engine
            – Negotiation Engine
            – Runtime QA Engine
            – Resource Broker
    • Overview
        – Introduction
        – Goal
        – Methodology
        – Experiments
        – Conclusions
    • SBA Event Logs
        – Most complex software systems collect their lifecycle usage data in event log files
        – SBA event logs contain a variety of information about service components exchanging messages
            – e.g., service invocation, service failure, registry querying, etc.
        – Event logs represent a huge source of “hidden” information (i.e., knowledge)
    • Mining SBA Event Logs
        – Data mining algorithms and techniques allow extracting valuable knowledge from event logs
        – Extracted knowledge may refer to several aspects:
            – e.g., service usage patterns, service failure patterns, etc.
        – If properly exploited, such knowledge might help improve the overall quality of the system:
            – recommending frequently invoked services;
            – avoiding/handling anomalous situations, etc.
    • Process Mining (PM)
        – Process Mining (PM) is an application of data mining techniques to SBA event logs
        – PM aims at discovering structured process models derived from patterns that are present in actual traces of service executions
        – Each process is usually represented by a digraph, and the PM problem has been modeled as:
            – finite state machines [CW96]
            – sequential pattern mining (SPM) [AGL98]
            – Petri nets [vdAWM04]
    • Another Example: Web Search Engines
        – Web Search Engines (WSEs) are another example of systems that benefit from mining their event log data (i.e., query logs)
        – Query Log Mining (QLM) has proven to be effective for enhancing the overall performance of WSEs
        – We propose a QLM technique for identifying search patterns (tasks) from the stream of queries recorded in query logs [LOPST11]
    • Goal
        – Treat PM as an instance of the SPM problem
        – Detect frequent sequential patterns of service invocation, i.e., services that are frequently co-invoked within the same sequence
            – e.g., service Y is usually invoked after service X
        – Find which services are actually used, and how
            – service recommendation
            – avoiding/handling anomalous situations
    • Sequential Pattern Mining
        – An event log can be viewed as a set of sequences of events ordered in time (time series)
        – We are interested in finding sequences of services that are frequently invoked in a specific order, i.e., sequential patterns
        – Sequential Pattern Mining (SPM) is the process of extracting sequential patterns whose support exceeds a predefined minimal support threshold min_supp
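The notion of support used on this slide can be illustrated with a minimal Python sketch. It assumes that support is the fraction of sequences containing the pattern as an ordered subsequence, with gaps allowed. The service names and the `sequences` data are hypothetical and are not taken from the VRESCo log.

```python
from typing import Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the pattern's items appear in the sequence in the same order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so each match must come after the previous one.
    return all(item in it for item in pattern)


def support(pattern: Sequence[str], sequences: Sequence[Sequence[str]]) -> float:
    """Fraction of sequences that contain the pattern as an ordered subsequence."""
    if not sequences:
        return 0.0
    hits = sum(is_subsequence(pattern, s) for s in sequences)
    return hits / len(sequences)


# Hypothetical service-invocation sequences, one list per recorded sequence.
sequences = [
    ["Login", "Search", "Book", "Pay"],
    ["Login", "Search", "Pay"],
    ["Search", "Login", "Book"],
    ["Login", "Book", "Pay"],
]

min_supp = 0.5  # assumed threshold, expressed as a fraction
pattern = ["Login", "Pay"]
print(pattern, support(pattern, sequences), support(pattern, sequences) >= min_supp)
```

Whether the threshold comparison is strict or inclusive is a convention; the sketch uses `>=`, which is an assumption rather than something the slide states.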
    • PrefixSpan
        – One of the most efficient algorithms for finding sequential patterns [PHMP01]
        – Mines the complete set of patterns but greatly reduces the effort of candidate subsequence generation
        – Takes into account only the chronological order between events
            – i.e., it only cares whether X comes before Y, without considering the actual time interval
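The prefix-projection idea behind PrefixSpan can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in the experiments: it assumes every event is a single service invocation (no itemsets), keeps the whole log in memory, and uses hypothetical service names `A`, `B`, `C`.

```python
import math
from collections import defaultdict


def prefixspan(sequences, min_count):
    """Return {pattern (tuple): number of supporting sequences} for all frequent patterns."""
    results = {}

    def mine(prefix, projected):
        # `projected` holds one (sequence index, start position) pair per sequence
        # that contains `prefix`; the suffix from `start` is the projected sequence.
        first_pos = defaultdict(dict)  # item -> {sequence index: first position in suffix}
        for seq_idx, start in projected:
            seq = sequences[seq_idx]
            for pos in range(start, len(seq)):
                first_pos[seq[pos]].setdefault(seq_idx, pos)
        for item, occurrences in first_pos.items():
            if len(occurrences) >= min_count:
                pattern = prefix + (item,)
                results[pattern] = len(occurrences)
                # Grow the pattern by projecting on the first occurrence of `item`.
                mine(pattern, [(i, p + 1) for i, p in occurrences.items()])

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical invocation sequences.
sequences = [["A", "B", "C"], ["A", "C"], ["B", "A", "C"]]
min_supp = 0.66  # fraction of sequences, as in the percentage thresholds on the slides
min_count = math.ceil(min_supp * len(sequences))

patterns = prefixspan(sequences, min_count)
for pattern, count in sorted(patterns.items()):
    print(pattern, count)
```

No candidate subsequences are generated and tested against the full log: each recursive call only scans the suffixes that follow the current prefix, which is the source of the reduced effort mentioned above.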
    • MiSTA
        – Hint: observing that two services are invoked close together rather than far apart in a sequence could lead to distinct conclusions
        – MiSTA [GNPP06] is able to deal with the actual time interval between any two consecutive service invocations
        – It needs a time threshold tau specifying the maximum time interval between events in a frequent sequence
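The effect of a time threshold can be illustrated with a simplified Python sketch. This is not MiSTA itself, which mines temporally annotated sequences [GNPP06]; it only shows a maximum-gap constraint, following the slide's description of tau. The function names, timestamps (in seconds), and service names are all hypothetical.

```python
def occurs_within(pattern, events, tau):
    """True if the pattern can be matched in order with every consecutive gap <= tau.

    `events` is a list of (timestamp_seconds, service) pairs sorted by timestamp.
    """

    def match(p_idx, e_idx, prev_time):
        if p_idx == len(pattern):
            return True
        for i in range(e_idx, len(events)):
            t, service = events[i]
            if prev_time is not None and t - prev_time > tau:
                break  # events are time-sorted, so later ones are even further away
            # Try this event as the match; fall through to later events if it fails.
            if service == pattern[p_idx] and match(p_idx + 1, i + 1, t):
                return True
        return False

    return match(0, 0, None)


def timed_support(pattern, logs, tau):
    """Fraction of timestamped sequences in which the pattern occurs within tau."""
    if not logs:
        return 0.0
    return sum(occurs_within(pattern, events, tau) for events in logs) / len(logs)


# Hypothetical timestamped invocation sequences.
logs = [
    [(0, "A"), (3, "B")],
    [(0, "A"), (40, "B")],
    [(0, "A"), (100, "A"), (103, "B")],
]

for tau in (5, 60):
    print(tau, timed_support(["A", "B"], logs, tau))
```

The third sequence shows why the matcher backtracks: the first `A` is too far from `B`, but the second `A` is within tau, so a purely greedy first-match search would miss the occurrence.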
    • Data Set: VRESCo
        – VRESCo is the runtime environment for Service-oriented Computing developed by VITALab@TUW
        – It collects usage data (i.e., events) in the form of an XML log file
        – The VRESCo event log file contains information about invoked services, service rebinding, service failure, etc.
        – We focus only on service invocation events
    • PrefixSpan: min_supp=25%
    • PrefixSpan: min_supp=50%
    • PrefixSpan: min_supp=66%
    • MiSTA: min_supp=32%, tau=5 sec.
    • MiSTA: min_supp=32%, tau=60 sec.
    • MiSTA: min_supp=32%, tau=300 sec.
    • Results
        – The service logs coming from the VRESCo runtime environment contain frequent patterns of services
        – Those patterns contain information about invoked services, service rebinding, service failure, etc.
        – Those patterns can be collected by considering co-occurring sequences and also by considering the time between invocations
        – Such inferred knowledge can be used to enhance SBAs, e.g., by means of novel design tools like service recommendation
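One way mined patterns might feed a service recommendation tool is sketched below in Python. The slides do not describe a recommender, so everything here is an assumption made for illustration: the `recommend_next` function, the exact-prefix matching rule, and the pattern counts.

```python
def recommend_next(patterns, recent, top_k=3):
    """Suggest services that frequently follow the recently invoked ones.

    `patterns` maps a tuple of services to its support count; `recent` lists
    the services invoked so far, in order.
    """
    recent = tuple(recent)
    scores = {}
    for pattern, count in patterns.items():
        # Keep patterns that extend `recent` by exactly one service.
        if len(pattern) == len(recent) + 1 and pattern[:-1] == recent:
            scores[pattern[-1]] = count
    # Highest support first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [service for service, _ in ranked[:top_k]]


# Hypothetical frequent patterns with support counts.
patterns = {
    ("Login",): 4,
    ("Login", "Search"): 3,
    ("Login", "Book"): 2,
    ("Login", "Search", "Pay"): 2,
}

print(recommend_next(patterns, ["Login"]))
print(recommend_next(patterns, ["Login", "Search"]))
```

Matching only on the exact prefix keeps the sketch short; a practical tool would probably also match on suffixes of the recent invocations and weight candidates by confidence rather than raw support.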
    • Conclusions
        – Event logs collected by complex software systems represent a huge source of information (knowledge)
        – Sequences of frequently co-invoked services can be found in SBA event logs using Sequential Pattern Mining (SPM)
        – Two SPM algorithms, PrefixSpan and MiSTA, were run on a real-world SBA event log (VRESCo)
        – Experimental results show that some services are often invoked together in a frequent sequence
        – Such inferred knowledge can be exploited to enhance SBAs, e.g., by means of novel design tools like service recommendation
    • References
        – [CW96] J. E. Cook and A. L. Wolf, “Discovering models of software processes from event-based data,” Technical Report CUCS-819-96, Computer Science Dept., Univ. of Colorado, 1996.
        – [AGL98] R. Agrawal, D. Gunopulos, and F. Leymann, “Mining process models from workflow logs,” in Sixth International Conference on Extending Database Technology, 1998, pp. 469–483.
        – [vdAWM04] W. van der Aalst, T. Weijters, and L. Maruster, “Workflow mining: Discovering process models from event logs,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 9, pp. 1128–1142, Sep. 2004.
        – [LOPST11] C. Lucchese, S. Orlando, R. Perego, F. Silvestri, and G. Tolomei, “Identifying task-based sessions in search engine query logs,” in WSDM ’11. ACM, 2011, pp. 277–286.
        – [PHMP01] J. Pei, J. Han, B. Mortazavi-Asl, and H. Pinto, “PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth,” in ICDE ’01. IEEE, 2001.
        – [GNPP06] F. Giannotti, M. Nanni, D. Pedreschi, and F. Pinelli, “Mining sequences with temporal annotations,” in SAC ’06. ACM, 2006, pp. 593–597.