Metadata in Business Intelligence
 

This presentation is part of my work for the course 'Heterogeneous and Distributed Information Systems' at TU Berlin within the IT4BI (Information Technology for Business Intelligence) master programme.

Usage Rights

CC Attribution License

    Presentation Transcript

    • Metadata in Business Intelligence. Jose Luis Lopez Pino, Database Systems and Information Management, Technische Universität Berlin. January 28, 2014, v1.2
    • Table of Contents
      • 1 Metadata: What is Metadata?; Metadata for Information Systems
      • 2 Business Intelligence: What is Business Intelligence?; Business Intelligence in a Nutshell; The Dimensional Fact Model; Data Warehousing
      • 3 Metadata in BI: Motivation; Classification; The Four Commandments of BI Metadata
      • 4 Examples: ROLAP and Metadata; Oracle Administration Tool
      • 5 Research: Metadata and Interoperability; Platform-Independent Models; Metadata in Multiversion DWH
      • 6 Big Data: Examples; Some Thoughts about Metadata and Hadoop
      • 7 Conclusions: 10 Reasons why Metadata matters in BI; Final Conclusions
    • Metadata Business Intelligence Metadata in BI Examples Research Big Data Conclusions Metadata Jose Luis Lopez Pino 3
    • What is Metadata?
      • “Metadata is a set of data that describes and gives information about other data.” (Oxford Dictionary)
      • “Metadata is explicitly managed data describing other data or system elements to support their documentation, reusability and interoperation.” [1]
      • [1] Susanne Busse, Ralf-Detlef Kutsche, Ulf Leser, and Herbert Weber. Federated Information Systems: Concepts, Terminology and Architectures. Citeseer, 1999.
    • Metadata for Information Systems
      • Technical metadata: describes the technical access mechanisms of components.
      • Logical metadata: relates to the schemas and their logical relationships.
      • Metamodels: support the interoperability of schemas in different data models.
      • Semantic metadata: helps to describe the semantics of concepts.
      • Quality-related metadata: describes source-specific properties of information systems regarding their quality.
      • Infrastructure metadata: helps users to find relevant data.
      • User-related metadata: describes the responsibilities and preferences of the users.
    • Business Intelligence
    • What is Business Intelligence?
      • Processing and organizing data in order to extract information, and using this information to make business decisions.
      • “Business intelligence (BI) is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.” (Gartner)
    • Why Data Analysis?
    • Business Intelligence in a Nutshell I
      • OLTP: information system oriented to small, interactive operations.
      • ETL: process that consists of extracting, transforming and loading data.
      • Data warehouse: central repository of data used for reporting and analysis.
      • Datamart: contains a subset of the information in a data warehouse, personalized for a single business view.
      • OLAP: technique to analyse multi-dimensional data.
      • ROLAP: OLAP analysis performed on a relational database.
      • MDX: query language for multidimensional data.
      • Data mining: discovering patterns in data.
    • Business Intelligence in a Nutshell II
      • Data visualization: representation of data to make it more meaningful and/or attractive.
      • Decision support: tools that facilitate making a decision based on data.
      • Data-driven business: companies led by a strategy based on data.
    • The Dimensional Fact Model I
      • Fact: an event that is relevant to the decision-making process.
      • Measure: a numerical attribute of the fact.
      • Dimensions: categorize the data into a finite number of slots.
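The fact, measure and dimension definitions above can be made concrete with a very small star schema. The sketch below is only an illustration: the table names (`fact_expense`, `dim_period`), the columns and the sample rows are invented for this example and do not come from the presentation. It uses Python's built-in `sqlite3` module, so the aggregation runs on a relational engine, which is the basic idea behind ROLAP.

```python
import sqlite3


def build_star_schema(conn):
    """Create one dimension table and one fact table (all names are hypothetical)."""
    conn.executescript(
        """
        CREATE TABLE dim_period (
            period_id INTEGER PRIMARY KEY,
            year      INTEGER,
            month     INTEGER
        );
        CREATE TABLE fact_expense (
            period_id INTEGER REFERENCES dim_period(period_id),
            employee  TEXT,
            amount    REAL  -- the measure: a numerical attribute of the fact
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_period VALUES (?, ?, ?)",
        [(1, 2011, 12), (2, 2012, 1), (3, 2012, 2)],
    )
    # Each row is one fact: an expense event relevant to decision making.
    conn.executemany(
        "INSERT INTO fact_expense VALUES (?, ?, ?)",
        [(1, "ana", 100.0), (2, "ana", 50.0), (2, "bob", 25.0), (3, "bob", 75.0)],
    )


def total_by_year(conn):
    """Aggregate the measure along the 'year' level of the period dimension."""
    return conn.execute(
        """
        SELECT p.year, SUM(f.amount)
        FROM fact_expense AS f
        JOIN dim_period AS p ON p.period_id = f.period_id
        GROUP BY p.year
        ORDER BY p.year
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    print(total_by_year(connection))
```

Grouping by `month` instead of `year` would slice the same facts at a finer level of the dimension, without touching the fact table.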
    • The Dimensional Fact Model II
    • Cube
    • Data Warehousing (figure: copyright 2013 Toon Calders, http://goo.gl/ds8nZc)
    • Metadata Management in Data Warehousing (figure: copyright 2014 LINGARO, http://goo.gl/Wfxsni)
    • Metadata in BI
    • Motivation: Quotes I
      • “Metadata is a vital element of the data warehouse.” (William Inmon [2])
      • “Metadata is the DNA of the data warehouse.” (Ralph Kimball [3])
      • “Metadata is analogous to the data warehouse encyclopedia.” (Ralph Kimball [3])
      • [2] William H. Inmon. Metadata in the Data Warehouse. Morgan Kaufmann, 2000.
      • [3] Ralph Kimball. The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses. Wiley, 1998.
    • Motivation: Quotes II
      • “The fact that metadata drives the warehouse is the literal truth. If you think you won’t use metadata, you are mistaken.” (Ralph Kimball [4])
      • “In the scope of data warehousing, meta-data plays an essential role because it specifies source, values, usage and features of data warehouse data and defines how data can be changed and processed at every architecture layer.” (Matteo Golfarelli, Stefano Rizzi [5])
      • [4] Ralph Kimball. The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses. Wiley, 1998.
      • [5] M. Golfarelli and S. Rizzi. Data Warehouse Design: Modern Principles and Methodologies. McGraw-Hill, 2009.
    • Metadata is everywhere!
      • Meaning of the objects.
      • User profiles.
      • Security permissions.
      • Usage statistics.
      • Logical model.
      • Relation between physical and logical objects.
      • DBMS metadata: tables, indexes, FKs, PKs, etc.
      • Reporting / data analysis objects.
      • Transformations of the data.
      • Data sources and data targets.
      • Query logs.
      • ETL logs.
      • Materialized information.
    • Classification
      1. Technical metadata: describes the physical objects that make up the data warehouse. Tables, fields, indexes, sources, targets, transformations, etc.
      2. Business metadata: describes the contents of the data warehouse in a way that is accessible for conducting day-to-day business [6]. Facts, dimensions, logical relationships, etc.
      3. Process metadata: describes operations executed on the warehouse and their results. Results of the ETL process, query logging, etc.
      • [6] William H. Inmon, Bonnie O’Neil, and Lowell Fryman. Business Metadata: Capturing Enterprise Knowledge. Morgan Kaufmann, 2010.
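One way to picture the three classes is as a tag on every entry of a metadata catalogue. The sketch below is a toy illustration only: `Kind`, `MetadataEntry`, `group_by_kind` and the sample entries are hypothetical names made up for this example, not part of any BI product.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three classes of BI metadata named on the slide."""
    TECHNICAL = "technical"
    BUSINESS = "business"
    PROCESS = "process"


@dataclass(frozen=True)
class MetadataEntry:
    name: str
    kind: Kind
    description: str


def group_by_kind(entries):
    """Return {Kind: [entry names]} so each audience can find its own metadata."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.kind].append(entry.name)
    return dict(grouped)


# Hypothetical catalogue entries, one or two per class.
CATALOGUE = [
    MetadataEntry("idx_sales_date", Kind.TECHNICAL, "index on a physical table"),
    MetadataEntry("Total Expenses", Kind.BUSINESS, "measure shown to analysts"),
    MetadataEntry("etl_run_log", Kind.PROCESS, "result of a nightly load"),
    MetadataEntry("Year", Kind.BUSINESS, "level of the period dimension"),
]

if __name__ == "__main__":
    for kind, names in group_by_kind(CATALOGUE).items():
        print(kind.value, names)
```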
    • The Four Commandments of BI Metadata
      • A data warehouse’s likelihood of success is greatly increased by following Ralph Kimball’s advice [7]:
      1. Be aware of what metadata you keep.
      2. Centralize it where possible.
      3. Track your metadata.
      4. Keep it up to date.
      • [7] Ralph Kimball. The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses. Wiley, 1998.
    • Examples
    • ROLAP and Metadata (figure: PostgreSQL’s ROLAP server translates an MDX query into SQL)
    • ROLAP and Metadata
        SELECT
          Expenses."Expenses per day" saw_0,
          Expenses."Days with expenses" saw_1,
          Expenses."Total Expenses" saw_2,
          Period."Year" saw_3
        FROM "HR - Travel Expenses"
        ORDER BY saw_3
      Figure: MDX Query
    • ROLAP and Metadata
        select
          sum(case when T1757.ZD_NUM = 0 then 0
              else (T1757.ZMDTACE_NAC_IM + T1757.ZMDTACO_NAC_IM + T1757.ZD_NAC_IM
                    + T1757.ZCOMD_NAC_IM + T1757.ZCOMDDIC_IM + T1757.ZMDTACE_EXT_IM
                    + T1757.ZMDTACO_EXT_IM + T1757.ZD_EXT_IM + T1757.ZCOMD_EXT_IM)
                   / nullif(T1757.ZD_NUM, 0) end) as c1,
          sum(T1757.ZD_NUM) as c2,
          sum(T1757.ZCLV_032 + T1757.ZCLV_132) as c3,
          T623.YEAR as c4
        from SYSADM.PS_ZOBI_CALENDA_VW T623, SYSADM.PS_ZOBI_DS_TBL T1757
        where (T623.MONTH_OF_YEAR = T1757.MONTH_OF_YEAR
               and T1757.ZID_COL = 'T'
               and T623.MONTH_OF_YEAR <= 201206
               and T623.YEAR between 2012 - 2 and 2012)
        group by T623.YEAR
        order by c4
      Figure: SQL Query
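The two queries on these slides show a ROLAP server using metadata to turn business names into physical SQL. The following sketch imitates that step with a plain dictionary. It is a simplified assumption about how such a mapping could work, not the actual algorithm of any ROLAP server. The physical identifiers are copied from the SQL slide, assuming underscores where the transcript shows spaces; `LOGICAL_TO_PHYSICAL`, `GROUPING_COLUMNS` and `translate` are hypothetical names.

```python
# Hypothetical business-to-physical mapping: (subject area table, column) -> SQL expression.
LOGICAL_TO_PHYSICAL = {
    ("Expenses", "Days with expenses"): "sum(T1757.ZD_NUM)",
    ("Expenses", "Total Expenses"): "sum(T1757.ZCLV_032 + T1757.ZCLV_132)",
    ("Period", "Year"): "T623.YEAR",
}

# Dimension attributes are grouped on rather than aggregated.
GROUPING_COLUMNS = {("Period", "Year")}

# Physical sources and the join between them, also part of the metadata.
PHYSICAL_FROM = "SYSADM.PS_ZOBI_CALENDA_VW T623, SYSADM.PS_ZOBI_DS_TBL T1757"
PHYSICAL_JOIN = "T623.MONTH_OF_YEAR = T1757.MONTH_OF_YEAR"


def translate(columns):
    """Build a physical SQL string for a list of logical (table, column) pairs."""
    select_parts = []
    group_parts = []
    for position, column in enumerate(columns, start=1):
        expression = LOGICAL_TO_PHYSICAL[column]  # KeyError for unknown columns
        select_parts.append(f"{expression} as c{position}")
        if column in GROUPING_COLUMNS:
            group_parts.append(expression)
    sql = (
        f"select {', '.join(select_parts)} "
        f"from {PHYSICAL_FROM} where {PHYSICAL_JOIN}"
    )
    if group_parts:
        sql += f" group by {', '.join(group_parts)}"
    return sql


if __name__ == "__main__":
    print(translate([("Expenses", "Total Expenses"), ("Period", "Year")]))
```

The analyst only ever names `Expenses."Total Expenses"` and `Period."Year"`. The cryptic physical table and column names stay inside the mapping, so they can change without affecting the logical query.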
    • Oracle Administration Tool (figure: the physical layer stores the technical metadata, while the other two layers store the business metadata)
    • Advantages
      • Abstraction: data analysts do not need to know the complex data sources involved in the system. They only worry about the business question, not about how to answer it.
      • Portability: changes to the physical model do not affect the logical model.
      • Security: defining a strong security policy allows the administrators to restrict users’ access to information that they must not know about.
      • Customization: the information is adapted to the user.
      • Source: Azriel Marla and Bob Ertl. Oracle Fusion Middleware Metadata Repository Builder’s Guide for Oracle Business Intelligence Enterprise Edition, 11g Release 1 (11.1.1), 2011.
    • Research
    • Metadata and Interoperability
      • The BI environment is composed of a wide variety of tools.
      • Complex bridges are crucial to integrate metadata among them.
      • It is necessary to define a standard to facilitate interoperability and integration.
      • Some attempts: Open Information Model (OIM) by the Meta Data Coalition; Common Warehouse Metamodel (CWM) by OMG. OIM was integrated into CWM.
      • Suggestion: use domain ontologies to establish semantic mappings between different data marts.
      • Source: Stefano Rizzi, Alberto Abelló, Jens Lechtenbörger, and Juan Trujillo. Research in data warehouse modeling and design: dead or alive? In Proceedings of the 9th ACM International Workshop on Data Warehousing and OLAP, pages 3–10. ACM, 2006.
    • How Standards Proliferate (figure: XKCD, http://xkcd.com/927/)
    • OIM vs. CWM
      • Both are metadata standards for data warehousing.
      • OIM’s scope is wider, not only for metadata.
      • Good for technical metadata, not for business metadata.
      • OIM is limited to relational data.
      • Using CWM, metadata exchange between tools that use the XMI standard is automatic.
      • Source: Thomas Vetterli, Anca Vaduva, and Martin Staudt. Metadata standards for data warehousing: Open Information Model vs. Common Warehouse Metadata. ACM SIGMOD Record, 29(3):68–75, 2000.
    • Platform-Independent Models
      • The problem: you have to provide OLAP metadata to bridge the gap between the conceptual and the logical model. This metadata depends on the platform.
      • The solution: define an OLAP algebra that provides semantics in multidimensional models. It derives the logical design automatically, for any platform.
      • Model Driven Architecture: derive the metadata from the conceptual model.
      • Source: Jesús Pardillo, Jose-Norberto Mazón, and Juan Trujillo. Bridging the semantic gap in OLAP models: platform-independent queries. In Proceedings of the ACM 11th International Workshop on Data Warehousing and OLAP, pages 89–96. ACM, 2008.
      • Source: Jesús Pardillo, Jose-Norberto Mazón, and Juan Trujillo. Towards the automatic generation of analytical end-user tools metadata for data warehouses. In Sharing Data, Information and Knowledge, pages 203–206. Springer, 2008.
    • Metadata in Multiversion DWH
      • Multiversion DWH: keeps track of the changes in the schema and the data.
      • Metadata becomes more complex and more useful in these systems.
      • Proposal: use a metamodel to manage different versions of the DWH, and use a metamodel to detect changes in the external data sources.
      • Source: Robert Wrembel and Bartosz Bebel. Metadata management in a multiversion data warehouse. In On the Move to Meaningful Internet Systems 2005: CoopIS, DOA, and ODBASE, pages 1347–1364. Springer, 2005.
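To give a feel for what detecting changes between versions can mean in practice, the sketch below compares two schema descriptions and lists the differences. It is not the metamodel proposed by Wrembel and Bebel, only a simple stand-in: schemas are assumed to be plain dictionaries of the form `{table: {column: type}}`, and `diff_schema` is a hypothetical helper.

```python
def diff_schema(old, new):
    """Compare two schema versions given as {table: {column: type}}.

    Returns a list of (change_kind, object_name) tuples in a stable order.
    """
    changes = []
    for table in sorted(old.keys() - new.keys()):
        changes.append(("drop_table", table))
    for table in sorted(new.keys() - old.keys()):
        changes.append(("add_table", table))
    for table in sorted(old.keys() & new.keys()):
        old_cols, new_cols = old[table], new[table]
        for col in sorted(old_cols.keys() - new_cols.keys()):
            changes.append(("drop_column", f"{table}.{col}"))
        for col in sorted(new_cols.keys() - old_cols.keys()):
            changes.append(("add_column", f"{table}.{col}"))
        for col in sorted(old_cols.keys() & new_cols.keys()):
            if old_cols[col] != new_cols[col]:
                changes.append(("change_type", f"{table}.{col}"))
    return changes


# Two hypothetical versions of the same warehouse schema.
VERSION_1 = {"sales": {"id": "int", "amount": "real"}}
VERSION_2 = {
    "sales": {"id": "int", "amount": "text", "region": "text"},
    "stores": {"id": "int"},
}

if __name__ == "__main__":
    for change in diff_schema(VERSION_1, VERSION_2):
        print(change)
```

Storing each version's description together with the computed differences is one simple way to keep the kind of version history the slide refers to.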
    • Big Data
    • Examples: HDFS
      • The NameNode stores all the metadata in a single point.
      • It keeps all the metadata in memory.
      • This can be problematic when storing a vast number of small files [14].
      • [14] Grant Mackey, Saba Sehrish, and Jun Wang. Improving metadata management for small files in HDFS. In IEEE International Conference on Cluster Computing and Workshops (CLUSTER ’09), pages 1–4. IEEE, 2009.
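A back-of-the-envelope calculation shows why many small files put pressure on the NameNode. The sketch below rests on two assumptions that are not in the presentation: a commonly quoted rule of thumb of roughly 150 bytes of NameNode memory per namespace object (file or block), and a 128 MiB block size. Both vary between Hadoop versions and configurations, so the results are rough estimates.

```python
BYTES_PER_OBJECT = 150           # assumed rule of thumb, not a measured value
BLOCK_SIZE = 128 * 1024 * 1024   # assumed HDFS block size (128 MiB)


def namenode_objects(file_sizes, block_size=BLOCK_SIZE):
    """Count namespace objects: one per file plus one per block it occupies."""
    # Ceiling division: a file uses a whole block entry even if it fills only part of it.
    blocks = sum(-(-size // block_size) for size in file_sizes)
    return len(file_sizes) + blocks


def estimated_heap_bytes(file_sizes, block_size=BLOCK_SIZE,
                         bytes_per_object=BYTES_PER_OBJECT):
    """Rough NameNode memory estimate for the given files."""
    return namenode_objects(file_sizes, block_size) * bytes_per_object


if __name__ == "__main__":
    one_big_file = [1024 ** 3]               # 1 GiB stored as a single file
    many_small_files = [1024 ** 2] * 1024    # the same 1 GiB as 1024 files of 1 MiB
    print(estimated_heap_bytes(one_big_file))
    print(estimated_heap_bytes(many_small_files))
```

Under these assumptions the same gigabyte of data costs the NameNode 9 objects when stored as one file and 2048 objects when stored as 1 MiB files, which is the effect the slide warns about.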
    • Examples: Query Planner (figure: Apache Drill architecture, http://goo.gl/icZctF)
    • Examples: Table and Storage Management Layer (figure: HCatalog, http://goo.gl/7E1xLc)
    • Examples: Authorization to Data and Metadata (figure: Apache Sentry, http://goo.gl/zAsIyk)
    • Some Thoughts about Metadata and Hadoop
      • Technical metadata is necessary.
      • Hadoop is rapidly becoming a mature platform, and hence metadata will become more relevant in the coming years.
      • Metadata seems to be a perfect fit for the heterogeneous Hadoop ecosystem.
    • Conclusions
    • 10 Reasons why Metadata matters in BI
      1. It’s everywhere!
      2. It meets the disparate needs of the data warehouse’s technical, administrative, and business user groups.
      3. It contains information at least as valuable as regular data.
      4. It is used to describe the semantics of concepts.
      5. It facilitates the extraction, transformation and load process.
      6. It improves data security.
      7. It hides implementation details.
      8. We can customize how the user sees the data.
      9. It helps interoperability among systems.
      10. It allows us to design portable solutions.
    • Final Conclusions
      1. Metadata matters.
      2. Metadata is everywhere. You can’t get out of Dodge.
      3. Research is alive.
      4. Metadata management is less painful when using the right tools.
      5. Big data challenges are eased by metadata.