An Ontology-Based Autonomic System for Improving Data Warehouses by Cache Allocation Management

FGWM 2009 presentation


  1. An Ontology-Based Autonomic System for Improving Data Warehouses by Cache Allocation Management
     Vlad Nicolicin-Georgescu, Henri Briand, Remi Lehn and Vincent Benatier
     Knowledge and Experience Management Workshop, FG-WM 2009, 22/09/2009
     www.sp2.fr, http://www.polytech.univ-nantes.fr/COD/
  2. Contents
     1 Introduction
     2 Problematic
     3 Knowledge Management
     4 Autonomic Computing
     5 Combining the Elements
     6 Results
     7 Conclusions and Future Directions
  3. Introduction
     Decision Support Systems
     • Computerized systems whose main goal is to analyze a series of facts and propose actions based on those facts (Business Intelligence)
     • Their core is analytical (derived) data, organized into a data warehouse (the architecture) built from data marts (the bricks) (Inmon, 2005)
     • The challenge: managing data warehouses efficiently (cost, performance and resource scaling)
  5. Problematic: Industrial
     • Enterprises’ decision support systems: by the end of the first year, up to 90% of data warehouse efforts are considered failures (Frolick and Lindsey, 2003)
     • The main causes:
         • Bad management: manual configuration, manual maintenance operations, bad scaling of system resources
         • Bad performance due to inefficient sharing of common resources between groups and conglomerates
         • Growth of the data warehouse size over time
         • Any of the data may be accessed at any time: ‘Give me what I want so I can tell you what I really want’
  6. Problematic: Industrial
     High data warehouse maintenance costs (due to the previous causes) translate into:
     • A need for more system hardware resources (normal cost)
     • A need for decision support experts to configure and maintain the data warehouses (more costly)
  7. Problematic: Industrial
     Example
     • 10 data warehouses sharing RAM
     • 1 data warehouse requires 20 GB of RAM -> 200 GB of RAM in total
         • Very costly (sometimes not a problem)
         • Architecturally impossible (stuck!)
     • How to reallocate and manage?
     • To manage them, the enterprise relies on an expert to configure and maintain how memory is allocated, based on each data warehouse’s needs: priority, usage period, changes in the architecture, etc.
     • The problem repeats recursively
     • Too hard to sustain due to cost and human limits
  8. Problematic: Scientific
     How to manage decision support systems efficiently:
     • How to formalize unstructured data from different sources (vendor readme files, forums, HTML pages, ...)
     • How to make various processes (RAM allocation between groups of data warehouses) autonomic, based on the formalized knowledge
     • Finding suitable algorithms for resource allocation and parameter configuration (cache memory) in groups of data warehouses
  9. Problematic: Scientific
     • Building knowledge bases for decision support systems: ontologies and ontology-based rules
     • Autonomic computing based on the knowledge bases, with algorithms for improving data warehouse performance
     • Combining knowledge formalization with autonomic computing for data warehouse management
  11. Knowledge Management
      • Managing data warehouses to improve their performance
      • Dividing the knowledge in the knowledge base to express a decision support system
  12. Knowledge Management: Data Warehouse Performance
      • The measure of performance: query response time for data retrieval operations
      • Analytical data, as opposed to operational data, is presented as having relaxed retrieval-time requirements (Inmon, 2005)
          • True for aggregation and calculation operations (e.g. run during the night)
          • Not so true for data retrieval tasks for report generation (daytime usage of the data warehouse)
  13. Knowledge Management: Data Warehouse Performance
      Several propositions for improving query response time:
      • (Malik et al, 2008): how to physically design databases through caches; database and architecture oriented
      • (Saharia and Babad, 2000): determining which data is most likely to be accessed so that it can be stored in caches; works well for improving a single data warehouse, and concerns the data requested rather than how to modify the data warehouse parameters
  14. Knowledge Management: Knowledge Division
      Our proposition for dividing the knowledge that represents a decision support system. Three main types:
      • Architectural
      • Configuration and performance
      • Experience and advice/best practices
  15. Knowledge Management: Knowledge Division
      Architectural information
      • What components are part of a decision support system
      • How these entities are linked and how they exchange
      • What common resources are characteristic of each entity and shared between the
  16. Knowledge Management: Knowledge Division
      Configuration and performance indicators (for Essbase multidimensional cubes)
      • For each data warehouse: index file and data file size (how much space it occupies on disk)
      • Three types of caches: index, data file and data cache
      • Query response time for data retrieval operations
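The indicators listed on this slide could be grouped into one record per data warehouse. The following is a minimal sketch, not the authors' ontology: the class name `CubeIndicators`, the field names, the units (MB, seconds) and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CubeIndicators:
    """Hypothetical record of the slide's indicators for one data warehouse."""
    name: str
    index_file_mb: float       # size of the index file on disk (unit assumed)
    data_file_mb: float        # size of the data file on disk (unit assumed)
    index_cache_mb: float      # index cache setting
    data_file_cache_mb: float  # data file cache setting
    data_cache_mb: float       # data cache setting
    avg_query_rt_s: float      # average query response time on retrievals

    def disk_footprint_mb(self) -> float:
        # Space occupied on disk: index file plus data file.
        return self.index_file_mb + self.data_file_mb

    def total_cache_mb(self) -> float:
        # Sum of the three cache types named on the slide.
        return self.index_cache_mb + self.data_file_cache_mb + self.data_cache_mb


# Illustrative values only; they do not come from the presentation.
dw1 = CubeIndicators("DW1", 120.0, 900.0, 10.0, 32.0, 3.0, 1.8)
print(dw1.disk_footprint_mb(), dw1.total_cache_mb())
```

Keeping the three caches as separate fields mirrors the slide, which treats them as distinct configuration parameters.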
  17. Knowledge Management: Knowledge Division
      Experience and best practices
      • More delicate, because the information is subjective and unstructured
      • Represents all knowledge concerning decision support system and data warehouse management (in any form)
      • Comes from several sources
      • Formalized as a rule knowledge base, such as Event Condition Rules (Huebscher et al, 2008)
  19. Autonomic Computing
      Previous propositions for representing self-managing systems:
      • Inspired by the functioning of the human body (Wang, 2007)
      • Self-healing systems, later elaborated into self-X systems (Ghosh et al., 2007)
      • Proposition made by IBM in 2001, and refined towards its currently known form (IBM, 2001)
  20. Autonomic Computing
      Autonomic computing: the ability of an IT infrastructure to adapt and change in accordance with business policies and objectives, guiding systems to be (IBM, 2001):
      • Self-configuring
      • Self-healing
      • Self-optimizing
      • Self-protecting
  21. Autonomic Computing: Autonomic Computing Manager
      • Autonomic Computing Manager: automates the self-X functions and externalizes these functions according to the behavior defined by the management interfaces (IBM, 2001)
      • The MAPE-K loop (Monitor, Analyze, Plan, Execute, over a shared Knowledge base): [figure]
  22. Autonomic Computing: Autonomic Computing Manager
      • We propose implementing the loop at each level of the decision support system architecture
      • Each entity has its own individual loop and is related to its superior entities only
      • Each entity’s manager has two ‘responsibilities’:
          • Its individual self-management
          • The management of its direct children
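One way to picture a manager with these two responsibilities is a small tree of objects, each running its own loop and then handling its direct children. This is only a sketch under assumptions: the `Manager` class, its method names, the log format and the order (self first, then children) are hypothetical, and the four phases are placeholders rather than real monitoring or planning logic.

```python
from dataclasses import dataclass, field
from typing import List

# The four phases of the loop; the knowledge base is left out of this sketch.
PHASES = ("monitor", "analyze", "plan", "execute")


@dataclass
class Manager:
    """Hypothetical autonomic manager attached to one entity."""
    name: str
    children: List["Manager"] = field(default_factory=list)

    def manage_self(self, log: List[str]) -> None:
        # Responsibility 1: the entity's own individual loop.
        for phase in PHASES:
            log.append(f"{self.name}:{phase}")

    def manage_children(self, log: List[str]) -> None:
        # Responsibility 2: only the direct children are handled here;
        # each child then takes care of its own children in turn.
        for child in self.children:
            child.run(log)

    def run(self, log: List[str]) -> None:
        self.manage_self(log)
        self.manage_children(log)


# An application seen as a group of two data warehouses.
app = Manager("App", [Manager("DW1"), Manager("DW2")])
trace: List[str] = []
app.run(trace)
print(trace[:5])
```

Because each manager only knows its direct children, adding a level above the application would not require changing the data warehouse managers.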
  23. Autonomic Computing: Autonomic Computing Manager
      Revisiting the decision support system’s schema [figure]
  24. Autonomic Computing: Self-Improvement Algorithms
      Self-improvement algorithm:
      • Specific to the individual loop of each data warehouse
      • Executed at the end of each day, when statistics on the usage of the data warehouse are gathered and its parameters can be changed
      • Tries to improve the cache allocation of a data warehouse by repeatedly decreasing the cache values up to a certain limit:
          • Step: the amount of cache decrease at each time period (CV: cache value): CV1 = CV0 - (CVmax - CV0) * step
          • Delta: the threshold at which the algorithm stops, i.e. the accepted impact of a cache modification. If (RT1 - RT0) / RT0 < delta, the new cache proposition is accepted (RT: average query response time)
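The step and delta rules can be sketched as one daily iteration. The two formulas are copied as they appear on the slide; note that, as written, the decrease is proportional to the gap between CVmax and CV0. Everything else is an assumption for illustration: the function names, the `measure_rt` callback standing in for a day of gathered statistics, and the sample numbers.

```python
from typing import Callable, Tuple


def propose_cache(cv0: float, cv_max: float, step: float) -> float:
    """Slide formula: CV1 = CV0 - (CVmax - CV0) * step."""
    return cv0 - (cv_max - cv0) * step


def accept_proposal(rt0: float, rt1: float, delta: float) -> bool:
    """Slide rule: accept when (RT1 - RT0) / RT0 < delta."""
    return (rt1 - rt0) / rt0 < delta


def self_improve(cv0: float, cv_max: float, step: float, delta: float,
                 rt0: float,
                 measure_rt: Callable[[float], float]) -> Tuple[float, float]:
    """One end-of-day iteration; returns the retained (cache value, response time).

    measure_rt is a hypothetical callback giving the average query
    response time observed with a given cache value.
    """
    cv1 = propose_cache(cv0, cv_max, step)
    rt1 = measure_rt(cv1)
    if accept_proposal(rt0, rt1, delta):
        return cv1, rt1   # small enough impact: keep the smaller cache
    return cv0, rt0       # impact too large: keep the previous setting


# Illustrative numbers only: cache 60 of max 100, step 0.25, delta 5%.
print(self_improve(60.0, 100.0, 0.25, 0.05, 2.0, lambda cv: 2.05))
```

With these sample values the proposed cache is 50.0 and the relative slowdown is 2.5%, which is below delta, so the proposal is kept.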
  25. Autonomic Computing: Self-Improvement Algorithms [figure]
  26. Autonomic Computing: Group-Improvement Algorithm
      Group improvement algorithm
      • Specific to each application (seen as a group of data warehouses)
      • Periodically reallocates caches between the data warehouses in the group, depending on their average performance
      • ‘The catch’: a small sacrifice (delta) by some data warehouses brings an important performance gain to others
      • How to distinguish between performance and non-performance data warehouses?
  27. Autonomic Computing: Group-Improvement Algorithm
      • Performance data warehouse: its average query response time is below the average response time of the group
      • Non-performance data warehouse: its average query response time is above the group average (a data warehouse exactly at the average can go in either category)
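The split between the two categories can be sketched as a comparison against the group mean. This is a minimal illustration, assuming a plain mapping from data warehouse name to average response time; the function name is hypothetical, and placing ties in the non-performance group is an arbitrary choice that the slide explicitly leaves open.

```python
from statistics import mean
from typing import List, Mapping, Tuple


def split_by_performance(avg_rt: Mapping[str, float]) -> Tuple[List[str], List[str]]:
    """Return (performance, non_performance) data warehouse names."""
    group_avg = mean(avg_rt.values())
    # Below the group average: performance data warehouses.
    performance = sorted(n for n, rt in avg_rt.items() if rt < group_avg)
    # At or above the group average: non-performance (tie rule assumed).
    non_performance = sorted(n for n, rt in avg_rt.items() if rt >= group_avg)
    return performance, non_performance


# Illustrative response times in seconds; not taken from the results.
print(split_by_performance({"DW1": 4.0, "DW2": 1.0}))
# → (['DW2'], ['DW1'])
```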
  29. Combining the Elements
      Bringing Knowledge Management, Autonomic Computing and the algorithms together:
      • Knowledge bases are formalized with OWL ontologies and ontology-based rules
      • Autonomic Computing Managers are implemented with ontology-based rules and Java programs
      • Algorithms are formalized with ontologies, rules and Java programs
  30. Combining the Elements: Knowledge Base
      • Ontology: an explicit formal specification of the terms in a domain and the relations among them (Gruber, 1992)
      • It expresses:
          • The hierarchical inclusion relations between entities (taxonomy)
          • The inter-entity concept relations, which make it much more powerful than a taxonomy
      • Used with several knowledge formalization approaches
  31. Combining the Elements: Knowledge Base
      OWL:
      • W3C recommendation for ontology representation, in an XML-based format
      • Evolved from RDF
      • It provides the main concepts of:
          • Individual: an instance of ‘something’, the actual concept itself (e.g. John, Mary, Bob)
          • Class: a group of individuals belonging to the same set and having common properties (e.g. John, Mary and Bob are Human; John and Bob are Men)
          • Property: a characteristic of an individual that distinguishes it from others and allows it to belong to a class
              • Data type property: links an individual to a literal value (John is 30 years old)
              • Object property: links an individual to other individuals (John is the friend of Mary, Mary hates Bob)
      • Sentence representation: (subject, predicate, object), e.g. (John, hasAge, 30)
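The (subject, predicate, object) representation can be illustrated with plain tuples, without any ontology library. This is a sketch, not OWL itself: `hasAge` comes from the slide, `rdf:type` is the standard RDF predicate for class membership, and `isFriendOf` and `hates` are property names invented here to encode the slide's sentences.

```python
# Each statement is one (subject, predicate, object) triple.
TRIPLES = [
    ("John", "rdf:type", "Human"),
    ("Mary", "rdf:type", "Human"),
    ("Bob", "rdf:type", "Human"),
    ("John", "rdf:type", "Man"),
    ("Bob", "rdf:type", "Man"),
    ("John", "hasAge", 30),           # data type property: literal value
    ("John", "isFriendOf", "Mary"),   # object property (name assumed)
    ("Mary", "hates", "Bob"),         # object property (name assumed)
]


def objects(subject, predicate):
    """All objects linked to a subject through a given predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def individuals_of(cls):
    """All individuals declared as members of a class, sorted by name."""
    return sorted(s for s, p, o in TRIPLES if p == "rdf:type" and o == cls)


print(individuals_of("Man"))       # → ['Bob', 'John']
print(objects("John", "hasAge"))   # → [30]
```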
  32. Combining the Elements: Knowledge Base
      • Used to formalize the first two types of information: architectural and configuration/performance
      • The ‘static’ aspect of the approach
      • An OWL representation of a data warehouse [figure]
  33. Combining the Elements: Autonomic Computing
      • The dynamic part of the knowledge management aspect
      • The rules formalize:
          • The passage between the four states of the Autonomic Computing Manager
          • How the knowledge base in the middle of the loop connects with each state
          • How the two algorithms are implemented over the loop
      • We base our approach on previous work on using autonomic computing with ontologies (Stojanovic, 2004)
  34. Combining the Elements: Autonomic Computing
      Autonomic Computing Manager loop phases applied at the levels of the decision support system [figure]
  35. Combining the Elements: Algorithms
      • Described using Jena ontology-based rules
      • Example: the data warehouse individual self-improvement algorithm [figure]
  37. Results
      Scenario:
      • Uses the Oracle Hyperion Essbase BI solution
      • An Essbase application with two data warehouses (DW1 and DW2)
      • A period of 14 days, to see how each data warehouse improves and how the application reallocates memory
      • A random series of queries (from a given pool) is run on each data warehouse each day
      • The individual self-improvement algorithm runs every day
      • The group reallocation algorithm runs every 4 days
  38. Results [figure]
  39. Results
      • At the end of day 5 we have a good response time/cache allocation ratio
      • The data warehouses improve themselves quickly (individual algorithm) and then oscillate around this point (DW2)
      • At the end of the 6th day:
          • DW2 loses 2% in response time
          • DW1 gains around 80%
      • The application has reduced its memory consumption by 60%
  41. Conclusions & Future Directions: Conclusions
      • We have presented a common problem in enterprises today: knowledge management in decision support systems
      • We have presented how data warehouses can be formalized with ontologies and ontology-based rules
      • We have seen how autonomy can be enabled by using Autonomic Computing
      • We have presented the results of a test on a real application
  42. Conclusions & Future Directions: Future Directions
      • Extending the parameters used for data warehouse performance: calculation time, aggregation time, etc.
      • Introducing Service Level Agreement (SLA) notions for defining data warehouse usage specifications
      • Extending the knowledge base so that it can be enriched autonomically
      • Introducing attenuation in the algorithms to avoid oscillation
  43. Remarks … Questions … Propositions …
  44. References
      • Mark N. Frolick and Keith Lindsey. Critical factors for data warehouse failure. Business Intelligence Journal, Vol. 8, No. 3, 2003.
      • Debanjan Ghosh, Raj Sharman, H. Raghav Rao, and Shambhu Upadhyaya. Self-healing systems: survey and synthesis. Decision Support Systems, Vol. 42, p. 2164–2185, 2007.
      • T. Gruber. What is an ontology? Academic Press Pub., 1992.
      • M.C. Huebscher and J.A. McCann. A survey on autonomic computing – degrees, models and applications. ACM Computing Surveys, Vol. 40, No. 3, 2008.
      • IBM Corporation. An architectural blueprint for autonomic computing. IBM Corporation, 2001.
      • IBM Corporation. Autonomic computing. Powering your business for success. International Journal of Computer Science and Network Security, Vol. 7, No. 10, p. 2–4, 2005.
      • W.H. Inmon. Building the data warehouse, fourth edition. Wiley Publishing, 2005.
      • S.S. Lightstone, G. Lohman, and D. Zilio. Toward autonomic computing with DB2 universal database. ACM SIGMOD Record, Vol. 31, Issue 3, 2002.
      • A. Mateen, B. Raza, and T. Hussain. Autonomic computing in SQL Server. In 7th IEEE/ACIS International Conference on Computer and Information Science, 2008.
      • L. Stojanovic, J. Schneider, A. Maedche, S. Libischer, R. Studer, Th. Lumpp, A. Abecker, G. Breiter, and J. Dinger. The role of ontologies in autonomic computing systems. IBM Systems Journal, Vol. 43, No. 3, p. 598–616, 2004.
      • V. Markl, G. M. Lohman, and V. Raman. LEO: An autonomic optimizer for DB2. IBM Systems Journal, Vol. 42, No. 1, 2003.
      • A. N. Saharia and Y.M. Babad. Enhancing data warehouse performance through query caching. The DATA BASE for Advances in Information Systems, Vol. 31, No. 3, 2000.
      • Yingxu Wang. Toward Theoretical Foundations of Autonomic Computing. Int’l Journal of Cognitive Informatics and Natural Intelligence, 1(3), 1-16, July-September 2007.