Research Article



Research Article: A Suggested Model based on the open standards in Data Warehouse, presented at the 12th NRC.

Published in: Technology, Business

  1. 1. Advanced Databases Presentation on “A Suggested Model based on the open standards in Data Warehouse” Presented by: Shamama Tul Umber Parwaiz (0772122)
  2. 2. Contents <ul><li>Abstract </li></ul><ul><li>Introduction </li></ul><ul><li>Literature Review </li></ul><ul><li>Research Methodology </li></ul><ul><li>Conclusion </li></ul><ul><li>Future Work </li></ul><ul><li>Acknowledgement </li></ul><ul><li>References </li></ul>
  3. 3. Abstract <ul><li>Current data warehouse technology is not based on open standards. Several proprietary standards exist, but unified, agreed-upon standards for data warehousing are still lacking, which gives rise to issues such as security, interoperability and integration. In this research we present a model that describes the core layers of a data warehouse that should be based on open standards, and we discuss some of those layers in detail. </li></ul>
  4. 4. Introduction <ul><li>In recent years, data warehousing has become a very useful technology for integrating operational data sources in a way that gives decision-making capabilities to the top-level management of an organization. While designing and developing a data warehouse, IT experts have to choose an appropriate approach; unfortunately, the available approaches are not based on open standards. </li></ul>Continued on next slide …
  5. 5. Introduction Contd… <ul><li>Data warehouse technology faces many challenges related to design, security, performance, data cleaning, storage, integration, extraction, transformation, loading, data refreshing, schemas, roll-up, drill-down and interoperability. Having previously presented a framework that gives the meta-requirements for data warehouse design, we now move towards a design model for data warehouses; this model will help in designing good data warehouses. </li></ul>Continued on next slide …
  6. 6. Introduction Contd… <ul><li>This research primarily aims to introduce a data warehouse design model that is based on open standards and meets the meta-requirements presented in our previous work. Developing data warehouses based on open standards will lead to harmony and compatibility among industry products. </li></ul>Continued on next slide …
  7. 7. Introduction Contd… <ul><li>A data warehouse is not just a collection of data from several operational data sources; rather, it can be considered a defined process with three major steps: </li></ul><ul><ul><li>Extracting data from the distributed operational sources; most of the time it is extracted from legacy systems. </li></ul></ul><ul><ul><li>Transforming and aggregating the data consistently into the warehouse </li></ul></ul><ul><ul><li>Accessing the data in an efficient and flexible manner </li></ul></ul><ul><li>The main contribution of the data warehouse is its power to convert data into information that can be used for strategic decision making within organizations [4]. </li></ul>
  8. 8. Literature Review <ul><li>Lack of standards </li></ul><ul><ul><li>There is a lack of standards between industry and researchers, as they have not yet agreed on a unified standard. Moreover, no standards for modeling data warehouse security exist as yet. Data warehouse design is also not mining-aware: it generally fulfills OLAP requirements but does not address data mining requirements [1]. </li></ul></ul>Continued on next slide …
  10. 10. Literature Review Contd… <ul><li>Integration problem </li></ul><ul><ul><li>Integration issues related to data warehouses have also gained vital importance. Some organizations are settling for data marts, which are departmental subsets focused on selected subjects; e.g., a marketing data mart may include customer, product and sales information. These data marts enable faster roll-out, since they do not require enterprise-wide processing, but they lead to complex integration problems in the long run [2]. </li></ul></ul>Continued on next slide …
  11. 11. Literature Review Contd… <ul><li>Security Issues in data warehouse </li></ul><ul><ul><li>Security issues in data warehousing have also gained vital importance. Data from different systems with different security policies is integrated, and the users of the operational systems are not the same as the users of the data warehouse. Access control schemes for operational database objects (e.g., tables) cannot be mapped easily to data warehouse items such as dimensions and hierarchies; therefore the need for proper OLAP security design arises [3]. </li></ul></ul>
  12. 12. Methodology <ul><li>The ETL Filter </li></ul><ul><ul><li>We propose an ETL Filter in our design model. Its function is to admit only data with agreed-upon data types into the data warehouse; to do this, the ETL Filter has a repository of standards that is populated with those data types. </li></ul></ul>Continued on next slide …
  13. 13. Methodology Contd… <ul><li>The ETL Filter </li></ul>Continued on next slide …
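The ETL Filter idea can be illustrated with a minimal Python sketch. The slides do not specify how the repository of standards is stored or what the agreed data types are, so the `StandardsRepository` class, the `etl_filter` function and the column names below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StandardsRepository:
    """Hypothetical repository of agreed-upon data types, keyed by column name."""
    allowed_types: dict = field(default_factory=dict)

    def conforms(self, record: dict) -> bool:
        # A record passes only if it has exactly the agreed columns and
        # every value is an instance of the agreed type for its column.
        if record.keys() != self.allowed_types.keys():
            return False
        return all(
            isinstance(record[name], expected)
            for name, expected in self.allowed_types.items()
        )


def etl_filter(records, repository):
    """Split incoming records into (accepted, rejected) lists."""
    accepted, rejected = [], []
    for record in records:
        (accepted if repository.conforms(record) else rejected).append(record)
    return accepted, rejected


# Assumed example data; none of these columns come from the slides.
repo = StandardsRepository({"product": str, "city": str, "amount": float})
incoming = [
    {"product": "laptop", "city": "Lahore", "amount": 1200.0},
    {"product": "phone", "city": "Karachi", "amount": "n/a"},  # wrong type
]
accepted, rejected = etl_filter(incoming, repo)
```

Only the first record reaches the warehouse in this sketch; the second is held back because its `amount` is not of the agreed type.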
  14. 14. Methodology Contd… <ul><li>Platform Independent APIs </li></ul><ul><ul><li>Our research leads us to conclude that there is a strong need to develop APIs for accessing data from a warehouse. These APIs should be platform independent so that any programming language can be used to connect to the data warehouse. They also help avoid changes to applications when the underlying data warehouse changes, and vice versa. </li></ul></ul>Continued on next slide …
  15. 15. Methodology Contd… <ul><li>Platform Independent APIs </li></ul>Continued on next slide …
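The decoupling that such an API provides can be sketched as follows. This is only an illustration of how an application can be shielded from changes in the underlying warehouse; the slides define no concrete API, so `WarehouseAPI`, `InMemoryWarehouse` and `sales_by_city` are hypothetical names, and language independence (which would need a shared wire protocol) is not shown.

```python
from abc import ABC, abstractmethod


class WarehouseAPI(ABC):
    """Hypothetical vendor-neutral interface; applications depend only on this."""

    @abstractmethod
    def query(self, measure, dimensions):
        """Return the measure aggregated over the given dimensions."""


class InMemoryWarehouse(WarehouseAPI):
    """Stand-in backend; a real one would wrap a specific warehouse product."""

    def __init__(self, facts):
        self._facts = facts

    def query(self, measure, dimensions):
        # Sum the measure for each distinct combination of dimension values.
        totals = {}
        for fact in self._facts:
            key = tuple(fact[d] for d in dimensions)
            totals[key] = totals.get(key, 0) + fact[measure]
        return [
            dict(zip(dimensions, key), **{measure: total})
            for key, total in totals.items()
        ]


def sales_by_city(api):
    # Application code: unchanged if the backend behind `api` is swapped.
    return api.query("sales", ["city"])


facts = [
    {"city": "Lahore", "product": "laptop", "sales": 10},
    {"city": "Karachi", "product": "phone", "sales": 5},
    {"city": "Lahore", "product": "phone", "sales": 20},
]
result = sales_by_city(InMemoryWarehouse(facts))
```

Replacing `InMemoryWarehouse` with another subclass of `WarehouseAPI` would leave `sales_by_city` untouched, which is the property the slide asks for.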
  16. 16. Methodology Contd… <ul><li>Dimensional Security Management </li></ul><ul><ul><li>Our approach to security is relatively simple. We introduce a layer known as the “Dimensional Security Management Layer”, which manages dimension-level security. A data warehouse may contain several dimensions, e.g., sales, cities, profit, products. Users are only allowed to query the dimensions for which they have been permitted by the dimensional security management layer. </li></ul></ul>Continued on next slide …
  17. 17. Methodology Contd… <ul><li>Dimensional Security Management </li></ul>Continued on next slide …
  18. 18. Methodology Contd… <ul><li>Dimensional Security Management </li></ul><ul><ul><li>In the initial stages, data warehouses were used and queried only by executive management and business analysts. Nowadays the range of users with data warehouse access is increasing; the assumption that only a limited set of users will access the data warehouse is no longer appropriate, and the need for proper security and access control mechanisms is becoming more and more important. Data warehouses have become open systems; OLAP analysis in particular requires this open nature [3]. </li></ul></ul>Continued on next slide …
  19. 19. [6]
  20. 20. [6]
  21. 21. [6]
  22. 22.
  23. 23. Methodology Contd… <ul><li>Dimensional Security Management </li></ul><ul><ul><li>Table 1 is proposed to store the security information in a very simple way. The table has two parts: the first contains the header information and the second contains the security information. </li></ul></ul><ul><ul><li>The header information section contains “Attribute and Value” pairs for attributes such as the OLAP server name, version, etc. </li></ul></ul>Continued on next slide …
  24. 24. Methodology Contd… <ul><li>Dimensional Security Management </li></ul><ul><ul><li>The security information section of the table lists users in the rows and dimensions in the columns; each cell at the intersection of a row and a column contains the access rights of that user over that dimension. </li></ul></ul><ul><ul><li>If a user wants information about the sales of a particular product in different cities over a specified time period, the user must have access rights for all three dimensions, i.e., product, city and time; without access rights to any one of them, the user will not be able to view the report. </li></ul></ul>Continued on next slide …
  25. 25. Methodology Contd… <ul><li>Dimensional Security Management </li></ul><ul><ul><li>The same table can be used to define roles, which simplify access control management; a role is defined once and can be assigned to multiple users, and one user may hold multiple roles. </li></ul></ul><ul><ul><li>We now describe an algorithm that determines the access of a particular user over a dimension set. The algorithm takes the user and the dimension list as input and returns true or false, indicating whether the user has the access rights to perform an operation on the given dimension set. </li></ul></ul>Continued on next slide …
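The two-part security table described above might be represented as in the sketch below. The actual contents of Table 1 are not recoverable from the transcript, so the header values, the user names and the use of `"R"` as the right that permits reading are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SecurityTable:
    header: dict  # "Attribute and Value" pairs, e.g. OLAP server name, version
    rights: dict  # user -> {dimension -> access right}

    def right(self, user, dimension):
        # An unknown user or dimension yields an empty right (no access).
        return self.rights.get(user, {}).get(dimension, "")


# Assumed example contents; not taken from the original Table 1.
table = SecurityTable(
    header={"OLAP Server": "example-server", "Version": "1.0"},
    rights={
        "alice": {"product": "R", "city": "R", "time": "R"},
        "bob": {"product": "R", "city": "", "time": "R"},
    },
)

# The sales report from the text needs all three dimensions.
report_dimensions = ["product", "city", "time"]
can_view = {
    user: all(table.right(user, d) == "R" for d in report_dimensions)
    for user in table.rights
}
```

Here `alice` can view the report, while `bob` cannot because the `city` cell in his row grants no right.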
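One way to read the role mechanism is that a user's effective rights are the union of the rights of every role the user holds. The slides do not say how roles are stored in the table, so the two dictionaries and the `readable_dimensions` helper below are hypothetical.

```python
# Assumed role definitions: each role is defined once ...
role_rights = {
    "analyst": {"product", "time"},
    "regional_manager": {"city"},
}

# ... and can be assigned to many users; a user may hold several roles.
user_roles = {
    "alice": ["analyst", "regional_manager"],
    "bob": ["analyst"],
    "carol": ["analyst"],
}


def readable_dimensions(user):
    """Union of the dimensions granted by all of the user's roles."""
    dims = set()
    for role in user_roles.get(user, []):
        dims |= role_rights.get(role, set())
    return dims
```

Changing the `analyst` entry once changes the rights of every user who holds that role, which is the management simplification the slide refers to.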
  26. 26.
  27. 27.
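  28. 28. Flow of Algorithm <ul><li>Start </li></ul><ul><li>Input: User ‘u’, Dimensions (d1, d2, …, dn) </li></ul><ul><li>i = index of user in table </li></ul><ul><li>j = index of next dimension ‘d’ in table </li></ul><ul><li>j <= n ? If No: return ‘Success’ and End </li></ul><ul><li>If Yes: table[i][j] <> R ? If Yes: return ‘Failure’ and End; if No: continue with the next dimension </li></ul>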
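The flow above can be sketched in Python as follows. This is one reading of the flowchart rather than the authors' implementation: the row and column lists, the matrix contents and the function name `has_access` are assumptions, and ‘Success’ and ‘Failure’ are mapped to `True` and `False`.

```python
def has_access(table, users, dimensions, user, requested):
    """Return True only if `user` holds right 'R' on every requested dimension."""
    if user not in users:
        return False
    i = users.index(user)            # i = index of user in table
    for d in requested:              # loop over d1 .. dn
        if d not in dimensions:
            return False
        j = dimensions.index(d)      # j = index of next dimension in table
        if table[i][j] != "R":       # table[i][j] <> R  ->  'Failure'
            return False
    return True                      # all dimensions checked -> 'Success'


# Assumed example table: rows are users, columns are dimensions.
users = ["alice", "bob"]
dimensions = ["product", "city", "time"]
table = [
    ["R", "R", "R"],   # alice
    ["R", "",  "R"],   # bob has no right on 'city'
]
```

With this data, `has_access(table, users, dimensions, "alice", ["product", "city", "time"])` succeeds, while the same request for `bob` fails at the `city` column.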
  29. 29. Methodology Contd… <ul><li>Meta-data Definition Language (MDL) </li></ul><ul><ul><li>Our study shows that meta-data is stored in different ways in different data warehouses. Here we introduce the concept of an MDL, i.e., Meta-data Definition Language; if every data warehouse follows this language, the integration problems can be resolved. For existing data warehouse solutions we introduce an MDL translator layer that can work with the existing system without requiring many changes. </li></ul></ul>Continued on next slide …
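A translator layer of this kind could, in its simplest form, rename vendor-specific meta-data fields to a shared vocabulary. The slides do not define the MDL itself, so the field names, the two vendor formats and the `translate_to_mdl` function below are purely hypothetical.

```python
def translate_to_mdl(vendor_metadata, field_map):
    """Rename vendor-specific meta-data keys to the shared (assumed) MDL keys."""
    return {
        field_map[key]: value
        for key, value in vendor_metadata.items()
        if key in field_map
    }


# Two assumed vendor formats describing the same dimension differently.
vendor_a = {"dim_name": "city", "dim_type": "string"}
vendor_b = {"DimensionLabel": "city", "DataType": "string"}

# One mapping per vendor; the right-hand names stand in for MDL terms.
map_a = {"dim_name": "dimension", "dim_type": "type"}
map_b = {"DimensionLabel": "dimension", "DataType": "type"}

mdl_a = translate_to_mdl(vendor_a, map_a)
mdl_b = translate_to_mdl(vendor_b, map_b)
```

After translation both descriptions are identical, so a tool that understands only the shared vocabulary can integrate either warehouse without changes to the existing systems.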
  30. 30.
  31. 31. Conclusion <ul><li>In our research we have described the substantial technical challenges in developing and deploying data warehouses. While many commercial products and services exist, standards are lacking; there are still several interesting areas for research in developing open standards for data warehouses. </li></ul><ul><li>From the literature review, we conclude that some efforts have already been made regarding platform-independent design and its implementations, but far less work has been done on defining open standards for data warehouses than on defining open standards for web services. </li></ul>
  32. 32. Future Work <ul><li>We have presented a big picture for designing data warehouses based on open standards that do not exist today; however, each of these standards still needs to be explored and materialized. </li></ul>
  33. 33. ACKNOWLEDGMENT <ul><li>The authors of this paper give special thanks to their most respected instructor and supervisor for this work, Mr. Aslam Parvez, for his contributions and guidance throughout the work. </li></ul>
  34. 34. References <ul><li>[1] Stefano Rizzi, “Research in Data Warehouse Modeling and Design: Dead or Alive?” DOLAP’06, November 10, 2006, Arlington, Virginia, USA. </li></ul><ul><li>[2] Surajit Chaudhuri and Umeshwar Dayal, “An Overview of Data Warehousing and OLAP Technology” ACM Digital Library. </li></ul><ul><li>[3] Torsten Priebe, “Towards OLAP Security Design – Survey and Research Issues” DOLAP 2000, McLean, VA, USA, ACM ISBN 1-58113-323-5. </li></ul><ul><li>[4] Fabio Rilston, Jaelson Freire, “DWARF: An Approach for Requirements Definition and Management of Data Warehouse Systems” 11th IEEE International Requirements Engineering Conference, 1090-705X/03, 2003. </li></ul><ul><li>[5] Emilio Soler, Juan Trujillo, “A framework for developing secure data warehouses based on MDA and QVT” 2nd International Conference on Availability, Reliability and Security (ARES 07), 0-7695-2775-2/07, 2007 IEEE. </li></ul>
  35. 35. References <ul><li>[6] Roger Fang, Sama Tuladhar, “Teaching Data Warehousing & Data Mining in a Graduate Program of Information Technology” Mid-South Conference, JCSC 21, 5 (May 2006). </li></ul>
  36. 36. ? ? ? Thanks …