BI Knowledge Sharing
Session 2
BI (Business Intelligence) Knowledge Sharing Session 2

Published in: Business, Technology
  1. BI Knowledge Sharing Session 2
  2. 2-Session Knowledge Sharing Outline
     Session 1
     • What is Business Intelligence
     • What is Dimension?
     • What is Measure?
     • Types of Dimension
       o Degenerate Dimension
       o Role-Playing Dimension
       o Slowly Changing Dimension
         ▪ Type 1
         ▪ Type 2
         ▪ Type 3
     • Database Structure
       o Tables
       o Columns
       o Data Types
       o Constraints
       o Keys
     Session 2
     • Data Model
       o Relational Data Model
       o Dimensional Data Model
         ▪ Star Schema
         ▪ Snowflake Schema
     • Database Language
       o SQL
       o DDL, DML, DCL
     • Types of Join
       o INNER, (FULL/LEFT/RIGHT) OUTER, CROSS
       o Equi-join, Non Equi-join
     • Data Modeling
       o Entity Relationship
       o Cardinality
       o Granularity
       o Optionality
     • Best Practice on Data Model Design for BI
       o ODS (Operational Data Store)
       o DW (Data Warehouse)
       o STG (Staging Zone)
       o CT (Control Table)
  3. Previous Session Recap
  4. How to Get Started with Deploying BI?
  5. BI Deployment
     • Collect Business Requirements / Needs / Drivers
     • Confirm BI Project Scope
     • Turn the Requirements into a Functional Specification
     • Determine Hardware Specification (i.e. CPU, RAM, Hard Disk)
     • Decide DR Strategy
     • Commit Resources (e.g. Sponsor Funding, User Engagement, Hardware Availability)
     • Select BI Tool (e.g. IBM Cognos)
     • Select BI Consultant
     • Post-Implementation Arrangement (User Training and Ongoing Maintenance)
  6. Data Model
     • Conversion from Business Model
     • Each Data Model has a Specific Purpose
       o For Example: Generic Use or Departmental Use
     • It shows the interrelationship between Tables
     • Each Table should have a Specific Business Meaning
       o For Example: Sales Figures, Customer Information
     • Methodology to construct the data
       o Relational Data Modeling
       o Dimensional Data Modeling
  7. Business Model
     • Operational Systems aim to keep Business Processes running smoothly
     • An Operational Database is used to store data from Operational Systems
     • Multiple Business Processes = Business Model
  8. Relational Data Model
     • MUST reside in a Relational Database
     • A Relational Data Model comprises tables, columns and relationships
     • Transactional-based
     • Detailed Level of Transactional Data
     • SQL is used for querying
  9. Dimensional Data Model
     • CAN reside in a Relational Database
     • A Dimensional Data Model comprises Cubes, Fact Tables and Dimension Tables
     • Analytical-based
     • Summary Level of bulk Transactional Data
     • MDX is used for a Dimensional Data Source, while SQL can be used for OLAP Over Relational Data Sources
     • Two major kinds of schemas are used
       o Star Schema
       o Snowflake Schema
  10. Star Schema
      • Is a Relational Database Schema for representing Multidimensional Data
      • Every Dimension Table must have a Primary Key
      • All Levels of a Dimension are stored in the same table
      • Consists of a Central Fact Table that is surrounded by multiple Dimension Tables
      • Stores all attributes of a Dimension in one denormalized ("flattened") table
  11. Snowflake Schema
      • Extension of Star Schema
      • Each Dimension Table is normalized into Multiple Lookup Tables, each representing a level in the Dimensional Hierarchy
      • Consists of a Central Fact Table that is surrounded by sets of Dimension Tables, where the Parent Table of each set is connected to the Fact Table through its Primary Key
  12. Star VS Snowflake
      • Star Schema
        o Fewer joins required
        o --> Higher Performance
      • Snowflake Schema
        o Redundancy is reduced
        o --> Data Optimization
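As a rough illustration of the trade-off above, the sketch below builds the same product dimension twice in an in-memory SQLite database: once flattened (star) and once with the category level moved to its own lookup table (snowflake). All table names, column names and data are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: one denormalized dimension table; the category name repeats per product.
CREATE TABLE dim_product_star (
    product_id    INTEGER PRIMARY KEY,
    product_name  TEXT NOT NULL,
    category_name TEXT NOT NULL
);
-- Snowflake: the category level is moved to its own lookup table.
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
CREATE TABLE dim_product_snow (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    category_id  INTEGER NOT NULL REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER NOT NULL,
    amount     REAL NOT NULL
);
INSERT INTO dim_product_star VALUES (1, 'Player A', 'Audio'), (2, 'Player B', 'Audio');
INSERT INTO dim_category VALUES (10, 'Audio');
INSERT INTO dim_product_snow VALUES (1, 'Player A', 10), (2, 'Player B', 10);
INSERT INTO fact_sales VALUES (1, 100.0), (2, 50.0), (1, 25.0);
""")

# Star: one join from the fact table to the flattened dimension.
star_sql = """
SELECT d.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_star d ON d.product_id = f.product_id
GROUP BY d.category_name
"""
# Snowflake: an extra join is needed to reach the category level.
snow_sql = """
SELECT c.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_snow p ON p.product_id = f.product_id
JOIN dim_category c ON c.category_id = p.category_id
GROUP BY c.category_name
"""
star_result = conn.execute(star_sql).fetchall()
snow_result = conn.execute(snow_sql).fetchall()
```

Both queries return the same total per category. The star version reaches it with one join, while the snowflake version stores 'Audio' only once at the cost of a second join.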
  13. Father of Data Warehousing
      • Bill Inmon is known as the Father of Data Warehousing
      • He defined a model to support a Single Version of Truth and championed the concept for more than a decade
  14. Father of Dimensional Modeling / Father of Business Intelligence
      • In most cases, Ralph Kimball recommends Star Schemas as the better solution. Although redundancy is reduced in a normalized snowflake, more joins are required.
      • Kimball usually advises that Data Warehouses MUST be designed to be Understandable and Fast
  15. Difference in Philosophy between THEM
      • Inmon’s philosophy recommends starting with a large Centralized Enterprise-Wide Data Warehouse, followed by several satellite databases that serve the analytical needs of departments (later known as Data Marts). Hence, his approach received the “Top Down” title
      • Kimball’s philosophy recommends starting with several Data Marts that serve the analytical needs of departments, followed by “virtually” integrating these Data Marts for consistency through an Information Bus. Hence, his approach received the “Bottom Up” title
  16. Exercise
  17. Question 1
      What is the difference between Data Warehouse and Data Mart in your mind right now?
  18. Answer to Question 1
      Data Warehouse
      • Enterprise-wide
      • Can easily be aligned with Corporate Strategy
      • Only ONE
      Data Mart
      • By Department/Subject
      • Can easily assist Business Strategy Formulation and monitor its results
      • Can be more than ONE
  19. What is SQL
      • Communication Language between YOU and the Database
      • Stands for Structured Query Language
      • SQL is a standardized query language for requesting information from a relational database
  20. DDL
      • DDL - Data Definition Language
      • Define the Database Structure or Schema
        o For Example:
          ▪ CREATE
          ▪ ALTER
          ▪ DROP
          ▪ TRUNCATE
  21. DML
      • DML - Data Manipulation Language
      • Retrieve and Manipulate data
        o For Example:
          ▪ SELECT
          ▪ INSERT
          ▪ UPDATE
          ▪ DELETE
          ▪ MERGE
  22. DCL
      • DCL - Data Control Language
      • Control the Security and Permissions of the objects or parts of the database(s)
        o For Example:
          ▪ GRANT
          ▪ DENY
          ▪ REVOKE
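A minimal sketch of the DDL and DML categories above, run against an in-memory SQLite database from Python. The `customer` table and its rows are hypothetical. SQLite is used only because it needs no server; it has no user accounts, so DCL statements such as GRANT and REVOKE cannot be shown here, and it also lacks TRUNCATE and MERGE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define and change the structure.
cur.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute("ALTER TABLE customer ADD COLUMN city TEXT")

# DML: add, change, remove and read data.
cur.execute("INSERT INTO customer (id, name, city) VALUES (1, 'Chan', 'Hong Kong')")
cur.execute("INSERT INTO customer (id, name, city) VALUES (2, 'Lee', NULL)")
cur.execute("UPDATE customer SET city = 'London' WHERE id = 2")
cur.execute("DELETE FROM customer WHERE id = 1")
rows = cur.execute("SELECT id, name, city FROM customer").fetchall()

# DDL again: remove the table definition itself.
cur.execute("DROP TABLE customer")
remaining_tables = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```

After the DML statements only the updated row for 'Lee' is left, and after DROP the table no longer exists in the catalog.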
  23. Types of Join
      • INNER JOIN (With Condition)
        o Returns only the rows that have a match in BOTH tables
      • OUTER JOIN (With Condition)
        o LEFT JOIN - Returns all rows from the left table, and the matched rows from the right table
        o RIGHT JOIN - Returns all rows from the right table, and the matched rows from the left table
        o FULL JOIN - Returns all rows from BOTH tables, matched where possible; rows without a match are still returned
      • CROSS JOIN (Without Condition)
        o Returns every combination of a row from the first table with a row from the second table (No. of Resulting Rows = No. of Rows of 1st Table * No. of Rows of 2nd Table)
  24. Join Condition
      • Equi-join
        o Join condition containing an equality operator
          ▪ =
      • Non Equi-join
        o Join condition not containing an equality operator
          ▪ e.g. >, <, >=, <=, BETWEEN
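To make the two kinds of join condition concrete, here is a small sketch using an in-memory SQLite database. The `product`, `brand` and `price_band` tables and their contents are invented for the example and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (name TEXT, price REAL, brand_id INTEGER);
CREATE TABLE brand (brand_id INTEGER, brand_name TEXT);
CREATE TABLE price_band (band TEXT, low REAL, high REAL);
INSERT INTO product VALUES ('P1', 30.0, 1), ('P2', 150.0, 2), ('P3', 260.0, 1);
INSERT INTO brand VALUES (1, 'BrandX'), (2, 'BrandY');
INSERT INTO price_band VALUES ('Low', 0, 99.99), ('Mid', 100, 199.99), ('High', 200, 999.99);
""")

# Equi-join: the condition uses the equality operator (=).
equi = conn.execute("""
    SELECT p.name, b.brand_name
    FROM product p JOIN brand b ON p.brand_id = b.brand_id
    ORDER BY p.name
""").fetchall()

# Non equi-join: the condition uses a range operator (BETWEEN).
non_equi = conn.execute("""
    SELECT p.name, pb.band
    FROM product p JOIN price_band pb ON p.price BETWEEN pb.low AND pb.high
    ORDER BY p.name
""").fetchall()
```

The equi-join looks up each product's brand by key, while the non equi-join places each product into the price band whose range contains its price.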
  25. Exercise
  26. Background Information
      Sample Tables: Customer table, City table
      Note: in the Customer table (predetermined "left table"), the customer "Wong" has not been assigned to any city, and no customer is assigned to the city "Washington".
  27. Question 1
      • ALL the records of the Customer table must be retained, even for a customer who has NO city assigned. Which JOIN type should be used?
  28. Answer to Question 1
      • LEFT JOIN keeps all the records of the left table (the Customer table), even though no city is assigned to "Wong".
      Left Joined Table
  29. Question 2
      • ALL the records of both the Customer table and the City table are desired in one single table without duplication. Which JOIN type should be used?
  30. Answer to Question 2
      • FULL JOIN shows ALL the records of both the left and right tables, even those that have no matching record in the other table.
      Full Joined Table
  31. Question 3
      • Only Customer Records that have an Assigned City and City Records that have Assigned Customers are desired. Which JOIN type should be used?
  32. Answer to Question 3
      • INNER JOIN shows only the matching records, i.e. those that satisfy the join predicate.
      Inner Joined Table
  33. Question 4
      • What will happen when applying CROSS JOIN to the Customer table and the City table? How many records will appear in the joined table?
      Customer table, City table
  34. Answer to Question 4
      • CROSS JOIN applies NO filter conditions, so it returns all 24 records (6 records in the Customer table * 4 records in the City table), i.e. the Cartesian product of the two tables.
      Cross Joined Table
  35. Question 5
      • ALL the records of the City table must be retained, even for a city that has NO customers assigned. Which JOIN type should be used?
  36. Answer to Question 5
      • RIGHT JOIN keeps all the records of the right table (the City table), even though no customers are assigned to "Washington".
      Right Joined Table
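The five answers above can be checked with a short sketch. The slide tables are images, so the data here is an assumption: only "Wong", "Washington" and the table sizes (6 customers, 4 cities) come from the exercise, while every other name and assignment is made up. RIGHT JOIN and FULL JOIN are emulated with LEFT JOIN because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (city_id INTEGER PRIMARY KEY, city_name TEXT);
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT, city_id INTEGER);
INSERT INTO city VALUES (1, 'Hong Kong'), (2, 'London'), (3, 'Tokyo'), (4, 'Washington');
INSERT INTO customer VALUES
    (1, 'Chan', 1), (2, 'Lee', 1), (3, 'Cheung', 2),
    (4, 'Lam', 3), (5, 'Ho', 2), (6, 'Wong', NULL);
""")

def row_count(sql):
    """Return the number of rows a query produces."""
    return len(conn.execute(sql).fetchall())

cols = "SELECT cu.cust_name, ci.city_name "
inner_rows = row_count(cols + "FROM customer cu INNER JOIN city ci ON cu.city_id = ci.city_id")
left_rows = row_count(cols + "FROM customer cu LEFT JOIN city ci ON cu.city_id = ci.city_id")
# RIGHT JOIN, emulated by swapping the tables in a LEFT JOIN.
right_rows = row_count(cols + "FROM city ci LEFT JOIN customer cu ON cu.city_id = ci.city_id")
# FULL JOIN, emulated as LEFT JOIN plus the cities that have no customer.
full_rows = row_count(
    cols + "FROM customer cu LEFT JOIN city ci ON cu.city_id = ci.city_id "
    "UNION ALL "
    "SELECT NULL, ci.city_name FROM city ci "
    "WHERE NOT EXISTS (SELECT 1 FROM customer cu WHERE cu.city_id = ci.city_id)"
)
cross_rows = row_count(cols + "FROM customer cu CROSS JOIN city ci")
```

With this sample data the inner join drops both "Wong" and "Washington", the left join keeps "Wong", the right join keeps "Washington", the full join keeps both, and the cross join pairs every customer with every city.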
  37. Data Modeling
      • Essential Elements of Data Modeling
        o Entity-relationship - association between the tables
        o Cardinality - data occurrences of the relation
          ▪ one to one
          ▪ one to many
          ▪ many to many
        o Granularity - refers to the level of detail stored in a table
        o Optionality - properties of data fields (mandatory or optional)
  38. Data Modeling Step-by-Step
      • Step 1: Collect Business Requirements and Implement Business Process Mapping
      • Step 2: Identify the Grain
      • Step 3: Identify the Dimensions
      • Step 4: Identify the Measures
      • Step 5: Implement the Model Design
      • Step 6: Verify the Model
      • Step 7: Deploy the Model
  39. Data Modeling Tools
      • A tool that makes it easy for a Data Architect or Data Modeler to build the Data Model on their own computer
      • Can apply the Physical Data Model directly to the Destination Database via ODBC, JDBC or by DDL Statement Generation
  40. Best Practice on Data Model Design for BI
      • The Database Server Instance is divided into 4 Physical/Logical Partitions
        o ODS - Operational Data Store
        o DW - Data Warehouse
        o STG - Staging Zone
        o CT - Control Table
      • In the Reporting Layer of BI Tools like IBM Cognos
        o Database Layer
          ▪ Physical - Directly Imported Tables from DB
          ▪ Logical - SQL, View or Stored Procedure
          ▪ Security - Optional. Define Security
        o Business Description Mapping Layer - Add Business Description
        o Dimensional Layer - DMR (Dimensionally Modeled Relational) or OOR (OLAP Over Relational)
        o Presentation Layer - Group by various Subjects, Departments or Specific Purposes
  41. ODS
      • Contains a Snapshot of the operational systems
      • Integrates data from different data sources
      • Data is loaded from operational sources periodically
      • Historical Data of the operational systems can be kept in ODS
      • It is an interim area in front of the DW
  42. DW
      • Designed in Star Schema or Snowflake Schema
      • All the data are extracted from ODS
      • Data are transformed according to business requirements
      • Consists of Dimension Tables and Fact Tables
  43. STG
      • Stores data from sources other than the operational systems (e.g. Excel, CSV)
      • Storage Area between ODS and DW
  44. CT
      • Stores Variables or Parameters that can be used across the whole Data Warehouse
      • For Example:
        o Selected Date
        o Is Full Load
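One way a Control Table might drive a load job is sketched below, using the two example parameters from the slide. The table layout (`ct_parameter`, `ods_sales`), the parameter names and the 'Y'/'N' flag convention are assumptions made for illustration, not a prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ct_parameter (param_name TEXT PRIMARY KEY, param_value TEXT);
INSERT INTO ct_parameter VALUES ('SELECTED_DATE', '2013-06-30'), ('IS_FULL_LOAD', 'N');
CREATE TABLE ods_sales (sale_date TEXT, amount REAL);
INSERT INTO ods_sales VALUES ('2013-06-29', 10.0), ('2013-06-30', 20.0), ('2013-06-30', 5.0);
""")

def get_param(name):
    """Read one parameter value from the control table."""
    row = conn.execute(
        "SELECT param_value FROM ct_parameter WHERE param_name = ?", (name,)
    ).fetchone()
    return row[0]

def rows_to_load():
    """Pick the ODS rows for this run based on the control table."""
    if get_param("IS_FULL_LOAD") == "Y":
        return conn.execute("SELECT sale_date, amount FROM ods_sales").fetchall()
    return conn.execute(
        "SELECT sale_date, amount FROM ods_sales WHERE sale_date = ?",
        (get_param("SELECTED_DATE"),),
    ).fetchall()

# Incremental run: only the selected date is picked up.
incremental = rows_to_load()

# Switching the flag in the control table changes the behavior of the same job.
conn.execute("UPDATE ct_parameter SET param_value = 'Y' WHERE param_name = 'IS_FULL_LOAD'")
full = rows_to_load()
```

Keeping such switches in a table rather than in the job code means every load job in the warehouse can read the same values.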
  45. Business Model to Data Model
  46. Business Model to Data Model (Cont’d)
      Transaction Detail
      Store ID  Trans. Date  Trans Ref.  Product No.   Product Name     Price
      3013      2007-11-27   09390       088590917667  IPOD CL 80GB     259.83
      3013      2007-11-27   09390       060538892509  PROTECTION PLAN   48.84
      3013      2007-11-27   09390       088590918750  IPOD NANO 4GB    154.83
      3013      2007-11-27   09390       060538892509  PROTECTION PLAN   48.84
      3013      2007-11-27   09390       060958513348  PHILIPS 1GB LK    39.88
      3013      2007-11-27   09390       060538892466  PROTECTION PLAN   29.84

      Transaction Master
      Store ID  Trans. Date  Trans Ref.  Subtotal  GST    PST    Total
      3013      2007-11-27   09390       582.06    34.92  46.56  663.54
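The two tables above hold the same receipt at different grains: one row per line item versus one row per transaction. As a quick consistency check, the sketch below sums the detail prices and compares them with the master row, using only the figures shown on the slide. `Decimal` is used so that the currency arithmetic is exact.

```python
from decimal import Decimal

# Line-item prices from the Transaction Detail table.
detail_prices = ["259.83", "48.84", "154.83", "48.84", "39.88", "29.84"]
# Header values from the Transaction Master table.
master = {"subtotal": "582.06", "gst": "34.92", "pst": "46.56", "total": "663.54"}

# Rolling the detail grain up to the transaction grain gives the subtotal.
detail_sum = sum(Decimal(p) for p in detail_prices)

# Subtotal plus both taxes gives the transaction total.
computed_total = (
    Decimal(master["subtotal"]) + Decimal(master["gst"]) + Decimal(master["pst"])
)
```

The detail rows add up to the master Subtotal, and Subtotal + GST + PST equals the master Total, so the coarser table can be derived from the finer one but not the other way round.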
  47. Physical Data Model
      Shop Dimension
        PK/FK  Shop ID                VARCHAR(4)    NOT NULL
               Shop Name              VARCHAR(50)   NOT NULL
      Date Dimension
        PK/FK  Date                   DATE          NOT NULL
               Year                   VARCHAR(4)    NOT NULL
               Month                  VARCHAR(2)    NOT NULL
               Day                    VARCHAR(2)    NOT NULL
      Product Dimension
        PK/FK  Product No.            VARCHAR(12)   NOT NULL
               Product Name           VARCHAR(50)   NOT NULL
      Transaction Master Fact
        PK/FK  Transaction Reference  VARCHAR(4)    NOT NULL
        FK     Transaction Date       DATE          NOT NULL
        FK     Store ID               VARCHAR(4)    NOT NULL
               Subtotal               NUMBER(18,2)  NULL
               GST                    NUMBER(18,2)  NULL
               PST                    NUMBER(18,2)  NULL
               Total                  NUMBER(18,2)  NULL
      Transaction Detail Fact
        FK     Transaction Date       DATE          NOT NULL
        FK     Transaction Reference  VARCHAR(4)    NOT NULL
        FK     Store ID               VARCHAR(4)    NOT NULL
        FK     Product No.            VARCHAR(12)   NOT NULL
               Price                  NUMBER(18,2)  NULL
      (Diagram: the relationship lines between the tables carry 1..1 and 1..n cardinality labels.)
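A hedged sketch of how this physical model could be expressed as DDL, here in SQLite via Python. The snake_case table and column names, the choice of which columns reference which tables, and the shop name are assumptions read off the slide, not a confirmed design. Two deliberate deviations: NUMERIC replaces the Oracle-style NUMBER type, and the transaction reference is widened to VARCHAR(5) because the sample value 09390 has five characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE shop_dim (
    shop_id   VARCHAR(4)  NOT NULL PRIMARY KEY,
    shop_name VARCHAR(50) NOT NULL
);
CREATE TABLE date_dim (
    calendar_date DATE       NOT NULL PRIMARY KEY,
    year          VARCHAR(4) NOT NULL,
    month         VARCHAR(2) NOT NULL,
    day           VARCHAR(2) NOT NULL
);
CREATE TABLE product_dim (
    product_no   VARCHAR(12) NOT NULL PRIMARY KEY,
    product_name VARCHAR(50) NOT NULL
);
CREATE TABLE transaction_master_fact (
    transaction_reference VARCHAR(5) NOT NULL PRIMARY KEY,
    transaction_date DATE       NOT NULL REFERENCES date_dim(calendar_date),
    store_id         VARCHAR(4) NOT NULL REFERENCES shop_dim(shop_id),
    subtotal NUMERIC(18,2), gst NUMERIC(18,2), pst NUMERIC(18,2), total NUMERIC(18,2)
);
CREATE TABLE transaction_detail_fact (
    transaction_date DATE NOT NULL REFERENCES date_dim(calendar_date),
    transaction_reference VARCHAR(5) NOT NULL
        REFERENCES transaction_master_fact(transaction_reference),
    store_id   VARCHAR(4)  NOT NULL REFERENCES shop_dim(shop_id),
    product_no VARCHAR(12) NOT NULL REFERENCES product_dim(product_no),
    price NUMERIC(18,2)
);
""")

conn.execute("INSERT INTO shop_dim VALUES ('3013', 'Sample Shop')")  # shop name is made up
conn.execute("INSERT INTO date_dim VALUES ('2007-11-27', '2007', '11', '27')")
conn.execute(
    "INSERT INTO transaction_master_fact VALUES "
    "('09390', '2007-11-27', '3013', 582.06, 34.92, 46.56, 663.54)"
)

# A fact row pointing at an unknown shop is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO transaction_master_fact VALUES "
        "('09391', '2007-11-27', '9999', 0, 0, 0, 0)"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

master_count = conn.execute("SELECT COUNT(*) FROM transaction_master_fact").fetchone()[0]
```

The dimension rows must exist before the fact rows that reference them, which is why the shop and date are inserted first.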
  48. Exercise
  49. Exercise - Physical Data Modeling
      • Please prepare a physical data model from the given receipt or invoice.
  50. Question 1
  51. Answer to Question 1
  52. Answer to Question 1
      (Diagram: the relationship lines between the tables carry 1..1 and 1..n cardinality labels.)
  53. MDX
      • Multidimensional Expressions
      • Get the Intersection Point between Column and Row
      • Achieve Time Period Analysis easily (e.g. YTD, Period-to-Period Analysis)
      SQL
      • SELECT SUM([Sales Revenue]) SALES_REVENUE FROM SALES_TABLE WHERE Year='2013' AND Country='Hong Kong'
      MDX
      • SELECT tuple([Sales Revenue],[2013],[Hong Kong]) ON ROWS FROM SALES_CUBE
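The contrast between the two queries can be imitated in plain Python. The sketch below runs the SQL statement from the slide against a few made-up rows in SQLite, then builds a small dictionary keyed by (measure, year, country) as a loose stand-in for a cube cell lookup. It is an analogy for the "intersection point" idea only, not an MDX engine, and the sample figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SALES_TABLE (Year TEXT, Country TEXT, [Sales Revenue] REAL);
INSERT INTO SALES_TABLE VALUES
    ('2013', 'Hong Kong', 100.0),
    ('2013', 'Hong Kong', 50.0),
    ('2013', 'Japan', 70.0),
    ('2012', 'Hong Kong', 30.0);
""")

# Relational view: filter the detail rows, then aggregate.
sql_value = conn.execute(
    "SELECT SUM([Sales Revenue]) SALES_REVENUE FROM SALES_TABLE "
    "WHERE Year = '2013' AND Country = 'Hong Kong'"
).fetchone()[0]

# Cube-like view: pre-aggregate once, then address a cell by its coordinates.
cube = {}
for year, country, revenue in conn.execute(
    "SELECT Year, Country, [Sales Revenue] FROM SALES_TABLE"
):
    key = ("Sales Revenue", year, country)
    cube[key] = cube.get(key, 0.0) + revenue

cube_value = cube[("Sales Revenue", "2013", "Hong Kong")]
```

Both approaches yield the same number. The difference is that the tuple names a position in an already aggregated structure, whereas the SQL statement describes how to compute the value from detail rows.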
  54. MDX Functions (Extracts)
      Previous Month / Last Year Same Period
      • parallelPeriod ( level [ , integer_expression [ , member ] ] )
      Previous Year / Previous Month
      • lastPeriods ( integer_expression , member )
      YTD / MTD
      • periodsToDate ( level , member )
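For readers new to these functions, the following Python sketch mimics what each one selects on a simple Month level. It is only an approximation under stated assumptions: the time dimension is a flat list of (year, month) pairs, the enclosing level is always Year, and the helper names are hypothetical. Real MDX engines work on full hierarchies and their edge-case behavior may differ.

```python
# Members of a hypothetical Month level, in calendar order, as (year, month).
months = [(y, m) for y in (2012, 2013) for m in range(1, 13)]

def last_periods(n, member):
    """The n members ending at `member` (rough analog of lastPeriods)."""
    i = months.index(member)
    return months[max(0, i - n + 1): i + 1]

def periods_to_date_year(member):
    """Months from the start of the member's year up to it (YTD analog)."""
    return [m for m in months if m[0] == member[0] and m <= member]

def parallel_period_year(n, member):
    """Same month, n years earlier (analog of parallelPeriod at Year level)."""
    candidate = (member[0] - n, member[1])
    return candidate if candidate in months else None

ytd = periods_to_date_year((2013, 3))
last3 = last_periods(3, (2013, 2))
same_month_last_year = parallel_period_year(1, (2013, 6))
```

Here `ytd` covers January to March 2013, `last3` spans December 2012 to February 2013 (it crosses the year boundary), and `same_month_last_year` is June 2012.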
  55. Do you understand the MDX below?
  56. Popular BI Tools in the Market
      Business Intelligence Tool                                     Vendor
      IBM Cognos BI                                                  IBM
      MicroStrategy                                                  MicroStrategy
      Pentaho BI suite (open source)                                 Pentaho
      JasperSoft (open source)                                       JasperSoft
      WebFOCUS                                                       Information Builders
      Microsoft Business Intelligence (Excel + SSRS + SSAS + MOSS)   Microsoft
      QlikView                                                       QlikTech
      SAS Enterprise BI Server                                       SAS Institute
      Tableau Software                                               Tableau Software
      Oracle Enterprise BI Server (OBIEE)                            Oracle
      Oracle Hyperion                                                Oracle
      BusinessObjects Enterprise                                     SAP
      SAP NetWeaver BI (Powered by HANA)                             SAP
  57. Remember to choose the Best Business Partner instead of Software Vendors
