PhD Progress 14 09 2008

Progress report for PhD in year 2 (2008).

Published in: Technology, Education

  1. GOAL-ORIENTED AND ONTOLOGY-DRIVEN REQUIREMENT ANALYSIS METHOD FOR EXTRACTION-TRANSFORMATION-LOADING (ETL) PROCESSES IN DATA WAREHOUSE SYSTEM
     By: AZMAN TA’A (91161)
     Supervisors: Assoc. Prof. Dr. Norita Md. Norwawi; Dr. Mohd Syazwan Abdullah
     COLLEGE OF ART AND SCIENCES
     September 14, 2008
  2. OUTLINE
     • INTRODUCTION
     • PROBLEM STATEMENT
     • REQUIREMENT ANALYSIS FOR DW
     • REQUIREMENT ANALYSIS FOR ETL PROCESSES
     • REQUIREMENT ANALYSIS APPROACH
     • TRANSFORMATIONAL ANALYSIS
     • ONTOLOGY MODELING
     • EXAMPLE: CASE STUDY AT UUM
     • CONCLUSION
  3. INTRODUCTION
     • ETL consumes 70-80% of DW development resources
     • The success of ETL depends on data integration and transformation, which must deal with semantic reconciliation
  4. INTRODUCTION
     • Problems in the design and development of ETL processes:
         • Complexity and sheer size of the DW (Vassiliadis, 2000; Kimball & Caserta, 2004)
         • Inefficiency of data loading (Chaudhuri & Dayal, 1997; Kimball & Ross, 2002)
         • Data integration and transformation process (Rahm & Do, 2000; Halevy, 2005)
         • Generating the data transformation mechanism (Kimball & Caserta, 2004; Alexiev et al., 2005)
  5. PROBLEM STATEMENT
     • Problems in ETL processes can be classified into:
         • 1) defining and maintaining the ETL specifications
         • 2) handling semantic heterogeneity problems
     • ETL specifications are application-driven rather than data-driven, which makes them error-prone to maintain and platform-dependent
     • Semantic heterogeneity is the problem of data conflicts in integration and transformation
     • However, the design of a DW system should be based on a proper requirement engineering process (Kimball & Caserta, 2004; Simitsis, 2004; Lujan-Mora, 2005; Rizzi et al., 2006)
  6. PROBLEM STATEMENT
     • Several ETL modeling and data integration approaches have been suggested.
     • However, these efforts do not focus on resolving the heterogeneity problems at the modeling level, which relates to the user requirements.
     • User requirements do not properly guide the developer in designing the ETL processes, because user interpretation of the business requirements is unmanaged
     • The ETL designer needs a proper method for designing the ETL processes that considers data heterogeneity problems and generates ETL specifications from the early phase of DW development
  7. REQUIREMENT ANALYSIS FOR DW
     Requirement Perspective of DW
  8. REQUIREMENT ANALYSIS FOR DW
     • DW requirements aim to identify the decisional information
     • Generally, DW requirements approaches are:
         • 1) process-driven (Kimball, 1996)
         • 2) supply/data-driven (Inmon, 2002)
         • 3) demand/user-driven (Winter & Strauch, 2003)
     • DW requirements are data- or information-centric, so both the supply-driven and demand-driven approaches are relevant to adopt
     • Moreover, the demand-driven and supply-driven approaches complement each other to support complex requirements
  9. REQUIREMENT ANALYSIS FOR ETL PROCESSES
     • Transformation of informal statements of user requirements into formal expressions of ETL specifications.
     • User requirements are elicited and analyzed from the organization and decision-maker perspectives.
     • These requirements will be mapped to the available data sources through ETL processes, which should be derived from transformation analysis
     • Thus, proper and systematic transformation analysis is required in the early phase of DW development
  10. REQUIREMENT ANALYSIS FOR ETL PROCESSES
      General Requirement Analysis Model (Adapted from Prakash and Gosain (2003))
  11. REQUIREMENT ANALYSIS FOR ETL PROCESSES
      • The requirement analysis approach is centered on organizational and decisional modeling, and focuses on transformation analysis.
      • Organization modeling: identifies goals related to DW components such as facts and attributes of the organization (as-is analysis)
      • Decision modeling: identifies goals that relate the decision maker to DW components such as facts, dimensions, measures, and transformations (to-be analysis)
      • The requirement analysis method is based on the Goal-Oriented methodology
  12. REQUIREMENT ANALYSIS FOR ETL PROCESSES
      Detailed Requirement Analysis Model (Adapted from Giorgini et al. (2008))
  13. REQUIREMENT ANALYSIS APPROACH
      • The approach uses the Tropos methodology, which is based on the i* conceptual framework for software development (Yu, 1995; Bresciani et al., 2003)
      • Tropos is founded on agent-oriented software development methodology, which uses agents and related mentalistic notions in all phases of software development
      • Importantly, Tropos supports development from early requirement analysis through to implementation, which essentially explains how the intended system (i.e. the DW system) will meet organization goals.
  14. REQUIREMENT ANALYSIS APPROACH
      • The Tropos methodology consists of five main phases: early requirements, late requirements, architectural design, detailed design, and implementation.
      • It introduces the concepts of Actor, Goal, Plan, Resource, Dependency, Capability, and Belief.
      • Modeling activities are Actor modeling, Dependency modeling, Goal modeling, Plan modeling, and Capability modeling
      • It applies three basic reasoning techniques: means-end analysis, contribution analysis, and AND/OR decomposition
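AND/OR decomposition, one of the reasoning techniques listed on this slide, can be illustrated with a small goal tree. The following is only a minimal sketch and is not taken from the slides or from any Tropos tool: the `Goal` class, its fields, and the example goal names are all hypothetical. An AND goal holds when all of its subgoals hold, and an OR goal holds when at least one does.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Goal:
    """A node in a goal tree (hypothetical structure, for illustration only)."""
    name: str
    decomposition: str = "LEAF"  # one of "AND", "OR", "LEAF"
    satisfied: bool = False      # only consulted for leaf goals
    subgoals: List["Goal"] = field(default_factory=list)

    def is_satisfied(self) -> bool:
        if self.decomposition == "AND":
            # Every subgoal must be satisfied.
            return all(g.is_satisfied() for g in self.subgoals)
        if self.decomposition == "OR":
            # One satisfied alternative is enough.
            return any(g.is_satisfied() for g in self.subgoals)
        return self.satisfied


# Hypothetical goals, loosely themed on a university DW.
root = Goal("Improve academic reporting", "AND", subgoals=[
    Goal("Analyze student enrolment", "OR", subgoals=[
        Goal("Use registration records", satisfied=True),
        Goal("Use survey data", satisfied=False),
    ]),
    Goal("Monitor graduation rate", satisfied=False),
])

print(root.subgoals[0].is_satisfied())  # OR branch: one alternative holds
print(root.is_satisfied())              # AND root: one subgoal still open
```

In this sketch the OR branch is satisfied through the registration records alone, while the AND root stays unsatisfied until the graduation-rate goal is also met.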
  15. TRANSFORMATIONAL ANALYSIS
      • Transformation analysis is not supported in previous requirement analysis approaches.
      • However, the analysis should be performed from an early phase of requirement engineering to ensure that the organization goals are met.
      • In Tropos, transformation analysis deals with Plan modeling, which complements the decision-goal modeling.
      • Plan modeling will determine the set of transformations and constraint activities as required.
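One way to picture a plan that determines "a set of transformations and constraint activities" is as an ordered list of transformation functions followed by constraint checks. The sketch below is an assumption made for illustration and is not the method proposed in the slides: the function names, the record fields (`programme`, `cgpa`), and the 0.0 to 4.0 range are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


# Hypothetical transformation activities.
def normalize_programme(rec: Record) -> Record:
    return {**rec, "programme": str(rec["programme"]).upper()}


def cgpa_to_float(rec: Record) -> Record:
    return {**rec, "cgpa": float(rec["cgpa"])}


# Hypothetical constraint activity.
def cgpa_in_range(rec: Record) -> bool:
    return 0.0 <= rec["cgpa"] <= 4.0


def run_plan(
    records: List[Record],
    transformations: List[Callable[[Record], Record]],
    constraints: List[Callable[[Record], bool]],
) -> Tuple[List[Record], List[Record]]:
    """Apply each transformation in order, then split records by constraints."""
    accepted: List[Record] = []
    rejected: List[Record] = []
    for rec in records:
        for transform in transformations:
            rec = transform(rec)
        if all(check(rec) for check in constraints):
            accepted.append(rec)
        else:
            rejected.append(rec)
    return accepted, rejected


source_rows = [
    {"programme": "bsc it", "cgpa": "3.45"},
    {"programme": "mba", "cgpa": "4.70"},  # violates the range constraint
]
accepted, rejected = run_plan(
    source_rows, [normalize_programme, cgpa_to_float], [cgpa_in_range]
)
print(accepted)
print(rejected)
```

Keeping the transformations and constraints as plain lists means the plan itself is data that could be derived from an earlier modeling step rather than written into the loading code.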
  16. ONTOLOGY MODELING
      • An ontology approach is used to model two sources of information prior to the conceptual design of ETL processes.
      • First source: the glossaries of DW terms (i.e. facts, attributes, dimensions, measures, actions, constraints) produced by the requirement analysis process
      • Second source: the data sources related to the subject area or application of the DW system.
      • The tasks of ontology construction, mapping, and ETL specification construction will be carried out before and during the conceptual design of ETL processes.
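The idea of mapping a requirement glossary onto data sources to obtain an ETL specification can be sketched with plain dictionaries. This is a simplified stand-in for a real ontology (no OWL or reasoner is used), and every name in it, including the glossary terms, synonyms, tables, columns, and `build_etl_spec`, is hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical requirement glossary: DW term -> known synonyms.
glossary: Dict[str, Set[str]] = {
    "student_id": {"matric_no", "stud_id"},
    "programme": {"prog_code", "course"},
    "cgpa": {"grade_point_avg"},
}

# Hypothetical data-source description: table -> column names.
sources: Dict[str, List[str]] = {
    "registration": ["matric_no", "prog_code"],
    "exam_results": ["stud_id", "grade_point_avg"],
}


def build_etl_spec(
    glossary: Dict[str, Set[str]], sources: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Pair each source column with the glossary term it names or is a synonym of."""
    spec: List[Tuple[str, str]] = []
    for table, columns in sources.items():
        for column in columns:
            for term, synonyms in glossary.items():
                if column == term or column in synonyms:
                    spec.append((f"{table}.{column}", term))
    return spec


spec = build_etl_spec(glossary, sources)
for source_column, dw_term in spec:
    print(f"{source_column} -> {dw_term}")
```

Here two differently named columns (`matric_no` and `stud_id`) resolve to the same DW term, which is a small instance of the naming conflicts that semantic reconciliation has to handle.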
  17. EXAMPLE: CASE STUDY IN UUM
      University Goals
  18. EXAMPLE: CASE STUDY IN UUM
      Actor Diagram for UUM example
  19. EXAMPLE: CASE STUDY IN UUM
      Rationale Diagram for UUM actor from organizational perspective
  20. EXAMPLE: CASE STUDY IN UUM
      Extended Rationale Diagram for UUM actor from organizational perspective
  21. EXAMPLE: CASE STUDY IN UUM
      Rationale Diagram for AAD Director from decisional perspective
  22. EXAMPLE: CASE STUDY IN UUM
      Extended Rationale Diagram for AAD Director
  23. CONCEPTUAL MODEL OF ETL PROCESSES
  24. LOGICAL MODEL OF ETL PROCESSES
  25. CONCLUSION
      • The aim of this study is to model and design the ETL processes in a DW system, from the requirement analysis tasks to the implementation of the ETL processes.
      • A Goal-Oriented approach to analyzing user requirements and an Ontology-based approach to modeling requirement glossaries and data sources are used in the hope of resolving the problems mentioned.
      • Work is still in progress and concentrates on the UUM case study to pre-confirm the proposed model and methods. The next step is to implement the solutions in a different domain and at a bigger scale.
  26. PAPERS
      • SAS Forum Malaysia: Modeling BI in Academic Information Portal, SAS Kuala Lumpur, 5 September 2007.
      • SUGI08: Academic Business Intelligence Development Using SAS Tools, San Antonio, Texas, USA, 15-19 March 2008. Paper accepted and selected as SAS Student Ambassador.
      • Camp08: Ontology-Based Extraction-Transformation-Loading (ETL) Processes Model in Data Warehouse Environments, Kuala Lumpur, 18 March 2008. Paper submitted.
  27. THANK YOU
      Q & A
      • [email_address]
