2. 2-Session Knowledge Sharing Outline
Session 1
• What is Business Intelligence
• What is Dimension?
• What is Measure?
• Types of Dimension
o Degenerate Dimension
o Role-Playing Dimension
o Slowly Changing Dimension
Type 1
Type 2
Type 3
• Database Structure
o Tables
o Columns
o Data Types
o Constraints
o Keys
Session 2
• Data Model
o Relational Data Model
o Dimensional Data Model
Star Schema
Snowflake Schema
• Database Language
o SQL
o DDL, DML, DCL
• Types of Join
o INNER, (FULL/LEFT/RIGHT) OUTER, CROSS
o Equi-join, Non Equi-join
• Data modeling
o Entity Relationship
o Cardinality
o Granularity
o Optionality
• Best Practice on Data Model Design for BI
o ODS (Operational Data Store)
o DW (Data Warehouse)
o STG (Staging Zone)
o CT (Control Table)
3. What is Business Intelligence
• In general, BI can be the Tools, Processes and
Methodology for using Information
• Highly used in Performance Management and
Strategic Management
• Data Mart and Data Warehouse are NOT a MUST-have
for BI Capability, but they are nice to have
• Most importantly, BI describes how you make use of
Information and the way to process Data
4. Performance Management
• To describe how an organization leverages its resources to
achieve short, medium and long-term
strategic goals
Suggested Management Tools
• Balanced Scorecard - Performance
Achievement
• SWOT Analysis - Internal Performance
of an Organization + PEST Analysis &
Five Forces Model for OT
• BCG Matrix - Two-Dimensional
Analysis of the Performance of SBUs
(Strategic Business Units)
8. Strategic Management
Everything in Performance Management
Internal Environment
• Data Mining - Pattern Recognition via Statistics and Machine
Learning
• Predictive Analysis - Uses Patterns of Interest and Underlying
Relationships between Data Sets to forecast future performance
External Environment
• Social Media Analytics - Collect Customer Feedback from various
Social Media sites for Voice of Customer (VOC), also known as
Critical to Customers (CTC)
9. Strategic Management
• Strategy Formulation
• Corporate Strategy (Mission Statement)
• Business / Competitive Strategy
• Strategy Implementation
• DMAIC for Improvement
• DMADV for New Achievement
• Commitment of Resources
• Good Project Management (PRINCE2 / PMP)
• Benefits Review
11. Social Media Analytics
• Opinions and Experiences are expressed on Websites, Forums,
Blogs and SNS Medium (Social Networking Services)
• Social Media Analytics gives you THE POWER TO
KNOW your customers' opinions so you can create better
communications, identify issues and spot trends sooner than
competitors
12. Social Media Analytics
• Two main approaches can be used to perform Sentiment
Analysis or Text Mining:
o Knowledge-based Approach, which uses linguistic models
to classify sentiments
o Learning-based Approach, which uses machine learning
techniques to classify text. The concept of sentiment
analysis opens a great number of possibilities and
opportunities for introducing BI strategies to analyze the
enormous amount of data flowing through the Web
13. Social Media Analytics
• Evaluate sentiment and monitor changes over time. SMA automatically extracts
sentiments in real time or over a period of time with a unique combination of statistical
modeling and rule-based natural language processing techniques
• Identify feedback sources to define new targets. By actively monitoring internal
collections (such as call centers and the Web) combined with social networking sites (like
Twitter and Facebook), SMA shows where you're being discussed and what is being said.
It automatically extracts feedback, filtering out the most important ideas and concepts
• Continuously improve customer experience and competitive position. SMA searches for
and evaluates internal and external contents about your organization and competitors,
identifying positive, negative, neutral and "no sentiment" texts – quantifying perceptions
in the market
• Promote ongoing, integrated analysis environment. With ongoing evaluations, you can
refine models and adjust classifications to reflect emerging topics and new terms
relevant to your customers, organization or industry
15. Quality Function Deployment
(QFD)
• Also known as House of Quality
• QFD relates the CTQ to the CTC
• QFD is a methodology by
which Voice of Customer
(VOC) is converted to design
parameters (or CTQ) for
developing a new process
or product
21. Market Penetration Strategy
• Increase Market Share
• Secure Dominance of Growth Markets
• Make the Market Unattractive to Competitors, supported by
Pricing Strategy
• Increase Customer Loyalty
22. Market Development Strategy
• Find new Geographical Markets
• Create new Product Dimensions or Packaging
• Establish new Distribution Channels
• Determine new Pricing Strategies
• Redefine new Market Segments
23. Product Development Strategy
• Determine Critical to Quality (CTQ)
• Create Product Values to Customers
• Do more R&D and Innovation
• Always be first to Market
24. Diversification Strategy
• Assess the Risk and confirm the Risk Value
• Make Provision for Risk Occurrence
• Determine what is to be achieved from this Strategy
• Get the Right Balance between Risk and Reward
25. Risk Probability & Impact
• Risk Value = Probability of Occurrence x Impact of Event
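The formula above can be sketched in a few lines of Python. This is a minimal illustration only; the risk names, probabilities and impact amounts are hypothetical and do not come from the slides.

```python
def risk_value(probability: float, impact: float) -> float:
    """Risk Value = Probability of Occurrence x Impact of Event."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * impact


# Hypothetical risks: (name, probability of occurrence, impact in dollars)
risks = [
    ("Supplier delay", 0.25, 200_000),
    ("Price war", 0.125, 900_000),
    ("Product recall", 0.5, 60_000),
]

# Rank the risks from highest to lowest Risk Value
ranked = sorted(risks, key=lambda r: risk_value(r[1], r[2]), reverse=True)
for name, probability, impact in ranked:
    print(name, risk_value(probability, impact))
```

Ranking by Risk Value is one simple way to decide which risks need a provision first.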
27. DMAIC
• Define - Define the Goals and Objectives (Set KPI and Confirm the Scope)
• Measure - Measure As-Is Process Performance
• Analyze - Analyze and Determine the Drivers and Root Causes (Determine Action Plans)
• Improve - Improve the Process Performance (Execution of Plan)
• Control - Monitor and Control the Process Performance
28. DMADV
• Define - Define the Goals and Objectives (Set KPI and Confirm the Scope)
• Measure - Measure As-Is Process Performance
• Analyze - Analyze and Determine the Drivers and Root Causes (Determine Action Plans)
• Design / Trial Run
o Design the Process to meet the Goals and Objectives
o Sometimes we may trial-run the Process and collect the results in the next Stage
• Verify
o Verify the Process and Simulate the Results
o Collect the Trial Run Results and Compare them with the Goals and Objectives
30. Orange Inc.
Background Information
• Is a Hong Kong Brand, founded in 2010 and headquartered in Hong
Kong
• Researches, develops and sells own-branded Notebooks, Phones
and Tablets
• Has three Retail Shops in Hong Kong
• Has one Factory in Mainland China
• Has one Distributor in Hong Kong
(i.e. Wideway Electronic)
31. Orange Inc.
• We are now in August 2013
• Orange found that the Average Product Life Cycle of
Consumer Electronics ranges from 6 months to 18
months
• Will hold its first one-week Clearance Sales event this
coming December
• Select Poor Sales Performance Products for Closeout
32. Poor Sales Performance Products
• For Stocked Products only
• Bottom 20 Sales Contribution
• From the 7th Month to the 18th Month after Product
Launch
• The First Round of Selection covers Products that have 18 sales
periods
• If there are fewer than 20 products that have 18 sales periods,
then select products that have only 17 sales periods, and so on
• Keep the selection going until all 20 products are selected as the
Bottom 20
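The selection rounds above can be sketched as follows. This is one possible reading of the rule, not a definitive implementation: it assumes each product record carries the hypothetical fields `stocked`, `periods` (number of sales periods since launch) and `contribution` (its Sales Contribution), and that within a round the products with the lowest contribution are picked first.

```python
def select_bottom(products, target=20, max_periods=18, min_periods=7):
    """Pick the poorest performers, starting with products that have
    `max_periods` sales periods and stepping down one period per round."""
    selected = []
    for periods in range(max_periods, min_periods - 1, -1):
        # Stocked products only, with exactly this many sales periods
        tier = [p for p in products if p["stocked"] and p["periods"] == periods]
        tier.sort(key=lambda p: p["contribution"])  # lowest contribution first
        selected.extend(tier[: target - len(selected)])
        if len(selected) >= target:
            break
    return selected


# Hypothetical data, using a Bottom 3 instead of a Bottom 20 to keep it short
products = [
    {"name": "A", "stocked": True, "periods": 18, "contribution": 0.05},
    {"name": "B", "stocked": True, "periods": 18, "contribution": 0.02},
    {"name": "C", "stocked": False, "periods": 18, "contribution": 0.01},
    {"name": "D", "stocked": True, "periods": 17, "contribution": 0.04},
    {"name": "E", "stocked": True, "periods": 17, "contribution": 0.03},
    {"name": "F", "stocked": True, "periods": 6, "contribution": 0.001},
]
picked = select_bottom(products, target=3)
print([p["name"] for p in picked])
```

Product C is skipped because it is not stocked, and Product F because it has fewer than 7 sales periods.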
33. Poor Sales Performance Products
Launch Month
Product A Oct-11
Product B Jan-12
Product C Jan-13
35. Poor Sales Performance Products
• Sales Contribution of Product A in 1st Month
Total Sales of Product A in 1st Month / Total Sales of all Products in 1st
Month
• Sales Contribution of Product A in 7th Month
Total Sales of Product A in 1st to 7th Month / Total Sales of all Products in
1st to 7th Month
• Sales Contribution Performance of Product A (7th to 18th Month)
• Take the Average of each of these Sales Contributions
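A minimal Python sketch of the calculation above. It assumes two aligned lists of monthly figures, one for the product and one for all products, both indexed from the product's launch month; how the months of different products are aligned is not stated in the slides, so that alignment and the function names are assumptions.

```python
def cumulative_contribution(product_sales, total_sales, month):
    """Sales Contribution through `month` (1-based): cumulative sales of the
    product divided by cumulative sales of all products over the same months."""
    return sum(product_sales[:month]) / sum(total_sales[:month])


def contribution_performance(product_sales, total_sales, first=7, last=18):
    """Average of the Sales Contributions from the `first` to the `last` month."""
    last = min(last, len(product_sales))
    if last < first:
        raise ValueError("not enough sales periods to evaluate")
    values = [
        cumulative_contribution(product_sales, total_sales, m)
        for m in range(first, last + 1)
    ]
    return sum(values) / len(values)


# Hypothetical figures: the product sells 10 per month out of 100 in total
product_a = [10] * 18
all_products = [100] * 18
print(contribution_performance(product_a, all_products))
```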
36. Question
• During the Preparation of Clearance Sales, how can we
make use of the Information we manipulated?
• Hence, what are the suggested strategies coming next?
38. Suggested Solution
on As-Is Performance Analysis
• First of all, use the concept of the BCG Matrix to classify all
Products by the following SBU Criteria
• By Sales Contribution, By Growth Rate, By Sales Margin
• Conduct SWOT Analysis supported by Weighted Score
• Current Ratio
• Quick Ratio
• Sell-through Rate
• Average Sales Margin
39. What is Data Warehouse
• Consolidate data from multiple sources into a single database
so a single query engine can be used to present data in a
Single Version of Truth environment
• Maintain data history, even if the source transaction systems
do not
• Improve Data Quality, by providing consistent codes and
descriptions, flagging or even fixing bad data
• Present the organization's information consistently
• Restructure the data so that it makes sense to the business
users
40. Reporting vs Business Intelligence
• While Reports can provide users with critical information
necessary for decision making, they are often lengthy and difficult
to understand, due to the amount of detail they contain
• The rapid explosion of data in business today can leave
organizations struggling to cope with the huge volume of
transactional data
• Reports only tell you WHAT happened; what they don’t tell you is
WHY it happened
• Reports DON'T meet the needs of companies wanting to remain
competitive, agile, and alert to new opportunities and ways to
improve.
41. Why you need Business Intelligence
• Organizations need to transform transactions into an even
more powerful asset – Business Intelligence and Insight.
Looking at the same reports over and over does not
promote, advance and encourage a culture of proactive
management and continuous improvement
• Users need to be empowered to look beyond standard
reporting, and be able to proactively manipulate data to
uncover the valuable insights that may be hidden within.
42. What is Dimension
• Dimensions are built according to the Business
Perspective of the Organization
o For Example, Walmart: Store, Product, Warehouse
• Dimensions are a Common Way of Analysing Data
• Dimensions usually store Textual or Descriptive
Data
• In general, Dimension contains Attributes
43. What is Measure
• Is Discrete Data or Continuous Data
• Evaluate and Project (Forecast) the Performance of
Dimensions
• Can be Aggregated
44. What is Fact
• A combination of Measures which describe the
Conformed (Common) Dimensions
• A record in a Fact Table describes something that
• Has already happened
• Has already been determined
• Has already been decided
• Has already been planned
53. Question 4
• At a telephone manufacturer, what is the Percentage of Units
Sold through each of the Sales Channels (Retail Outlet,
Order Desk and Salesman), by Model, last week?
54. Answer to Question 4
• Dimensions
Model
Retail Outlet (Sales Channel)
Order Desk (Sales Channel)
Salesman (Sales Channel)
Date
• Measures
Percentage of Units Sold by Retail Outlet by Model
Percentage of Units Sold by Order Desk by Model
Percentage of Units Sold by Salesman by Model
55. Question 5
• The District Manager would like to know the
Growth Rate of Units Sold and Revenue between
two weeks, by Store and by Product.
57. Question 6
• Susan ordered a 12-inch birthday cake from the Special
Collection with 18 candles, for delivery to Jenny's
home at 5pm today.
58. Answer to Question 6
• Dimensions
Susan (Customer Dimension - "Sales to")
Jenny's home (Customer Dimension - "Ship to")
12-inch Cake (Cake Dimension)
Candle
Special Collection (Cake Dimension)
Date (5pm today)
• Measures
No of Cakes
No of Candles
59. Question 7
Richard decided to rent an apartment and visited the real estate agency.
Richard: I'd like to rent an apartment around this neighbourhood, and my ideal size is about 800 feet with no more than $20,000 a month.
Real Estate Agent: We have lots of apartments fitting your requirements, and there is one of 770 feet, charging $19,000 a month. Let me show you.
Richard: It looks good! How many bathrooms are there in this apartment?
Real Estate Agent: There are two bathrooms to make it more convenient.
Richard: That's great, and when can we start to move in?
Real Estate Agent: It is available for renting from 1st September.
Richard: That satisfies me perfectly. When can we sign the contract and pay the deposit?
Real Estate Agent: Let's make it tomorrow 9am, is that OK?
Richard: Sure, see you tomorrow.
60. Answer to Question 7
• Dimensions
Richard
Real Estate Agent
800-feet Apartment (Apartment Size)
770-feet Apartment (Apartment Size)
Bathroom
Date (1st September & 9am Tomorrow)
$20,000 a month
$19,000 a month
• Measures
No of Bathrooms
61. Degenerate Dimension
• A dimension that has NO designated attributes and
does not need to be treated as a separate
dimension, because all the interesting attributes
have been placed in other business perspectives
(dimensions)
• Usually, it is a Transactional-based number which
resides in the fact table, e.g. Order Number
63. Question 1
There is a set of invoice numbers like this: 1xxx, 2xxx, 3xxx. The
first digit indicates the type of order:
1-> "Sales by Phone"
2-> "Sales by Fax"
3-> "Sales by Internet"
Is the Invoice Number a Degenerate Dimension?
64. Answer to Question 1
• No
• The Prefix of the Invoice Number has a Business
Meaning and could possibly be a Business Perspective
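To illustrate why the prefix carries a Business Perspective, the short Python sketch below splits a hypothetical invoice number into the order type encoded in its first digit and the remaining number. The function name and the sample invoice numbers are assumptions for illustration.

```python
# Mapping taken from the question: first digit -> type of order
ORDER_TYPES = {
    "1": "Sales by Phone",
    "2": "Sales by Fax",
    "3": "Sales by Internet",
}


def split_invoice(invoice_number: str):
    """Return (order type, remaining digits) for an invoice number."""
    prefix, rest = invoice_number[0], invoice_number[1:]
    return ORDER_TYPES[prefix], rest


for invoice in ("1001", "2045", "3310"):  # hypothetical invoice numbers
    print(invoice, split_invoice(invoice))
```

Because the order type can be analysed on its own (for example, sales by channel), it behaves like a dimension attribute rather than a plain transaction number.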
65. Role-Playing Dimension
• A single dimension which is expressed differently in
a fact table
• It has multiple relationships between itself and
another table, and it is commonly seen in Date
Dimension and Customer Dimension
• Apply a role-playing dimension to avoid creating two
or more separate tables; instead, create views from a
single table
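The view-based approach can be sketched with Python's built-in sqlite3 module. All table, view and column names here (dim_date, fact_order, dim_shipping_date, dim_receiving_date) and the sample order are hypothetical; the point is only that one physical Date table plays two roles through two views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One physical Date Dimension table
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT);

-- The fact table references the same dimension twice, in different roles
CREATE TABLE fact_order (
    order_no TEXT,
    shipping_date_key INTEGER REFERENCES dim_date(date_key),
    receiving_date_key INTEGER REFERENCES dim_date(date_key)
);

-- One view per role, instead of two separate tables
CREATE VIEW dim_shipping_date AS
    SELECT date_key AS shipping_date_key, full_date AS shipping_date
    FROM dim_date;
CREATE VIEW dim_receiving_date AS
    SELECT date_key AS receiving_date_key, full_date AS receiving_date
    FROM dim_date;

INSERT INTO dim_date VALUES (20130801, '2013-08-01'), (20130805, '2013-08-05');
INSERT INTO fact_order VALUES ('SO-1', 20130801, 20130805);
""")

row = conn.execute("""
    SELECT f.order_no, s.shipping_date, r.receiving_date
    FROM fact_order f
    JOIN dim_shipping_date s ON s.shipping_date_key = f.shipping_date_key
    JOIN dim_receiving_date r ON r.receiving_date_key = f.receiving_date_key
""").fetchone()
print(row)
```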
67. Question 1
• The package was sent from Hong Kong (June 15th)
to Peter’s Home in New York (June 20th) by Speedy
Logistic Company. Please identify the role-playing
dimensions in this scenario.
68. Answer to Question 1
Geographic Dimension & Date Dimension
are Role-playing Dimensions.
• Shipping Date: Year, Quarter, Month, Week
• Receiving Date: Year, Quarter, Month, Week
• Ship From: Country, State, City
• Ship To: Country, State, City
• Order Fact: Date Key, Geographic Key, Product Key, Shipper Key, …
o Shipping Date Key (Date Key)
o Receiving Date Key (Date Key)
o Ship From Key (Geographic Key)
o Ship To Key (Geographic Key)
69. Slowly Changing Dimension
• Dimensions whose Attributes change over Time are called
Slowly Changing Dimensions (SCD)
• SCD is the approach to keep track of the changes of
attributes over Time
• Fundamentally, we have Type 1, Type 2 and Type 3
70. Slowly Changing Dimensions
• Dimensions change slowly (comparatively) over a
period of time, rather than on a regular, time-based
schedule.
• There are 3 approaches to handling such changes:
o Type-1: Overwrite the existing value
o Type-2: Track Changes by Row
o Type-3: Track Changes by Column
71. SCD Type 1
• Overwrite the old values
• No historical state, value or snapshot of attributes is kept
Before Change
Customer ID  Customer Name  Gender  TOP 30  Focus Customer
C001         Mary Wong      F       N       N
C002         Peter Lam      M       Y       N
C003         Sally Chan     F       N       Y
After Change
Customer ID  Customer Name  Gender  TOP 30  Focus Customer
C001         Mary Wong      F       N       N
C002         Johnny Lam     M       Y       N
C003         Sally Chan     F       N       Y
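A minimal Python sketch of a Type 1 change, using the customer rows from this slide. The dictionary layout and the function name are assumptions for illustration; the point is that the old value is simply overwritten.

```python
# Customer dimension keyed by Customer ID (values taken from the slide)
customers = {
    "C001": {"name": "Mary Wong", "gender": "F", "top_30": "N", "focus": "N"},
    "C002": {"name": "Peter Lam", "gender": "M", "top_30": "Y", "focus": "N"},
    "C003": {"name": "Sally Chan", "gender": "F", "top_30": "N", "focus": "Y"},
}


def scd_type1_update(table, key, **changes):
    """Type 1: overwrite the attributes in place; no history is kept."""
    table[key].update(changes)


scd_type1_update(customers, "C002", name="Johnny Lam")
print(customers["C002"])
```

After the update there is no trace of "Peter Lam", which is exactly the trade-off of Type 1.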
72. SCD Type 2
• Create additional record with new attributes
• Records must contain Effective Date
• A Full History of attribute changes is stored
Before Change
Customer ID  Customer Name  Gender  TOP 30  Focus Customer  Effective Start  Effective End
C001         Mary Wong      F       N       N               1/1/2001         12/31/2099
C002         Peter Lam      M       Y       N               1/1/2001         12/31/2099
C003         Sally Chan     F       N       Y               1/1/2001         12/31/2099
After Change
Customer ID  Customer Name  Gender  TOP 30  Focus Customer  Effective Start  Effective End
C001         Mary Wong      F       N       N               1/1/2001         12/31/2099
C002         Peter Lam      M       Y       N               1/1/2001         4/22/2013
C002         Peter Lam      M       Y       Y               4/23/2013        12/31/2099
C003         Sally Chan     F       N       Y               1/1/2001         12/31/2099
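The Type 2 change on this slide can be sketched in Python as follows. The row layout and function name are assumptions; the sketch closes the current row the day before the change and appends a new row that is open until the high date 12/31/2099 used on the slide.

```python
from datetime import date, timedelta

HIGH_DATE = date(2099, 12, 31)  # open-ended Effective End, as on the slide


def scd_type2_update(rows, customer_id, change_date, **changes):
    """Type 2: close the current row and add a new row with the changes."""
    for row in rows:
        if row["customer_id"] == customer_id and row["effective_end"] == HIGH_DATE:
            new_row = {
                **row,
                **changes,
                "effective_start": change_date,
                "effective_end": HIGH_DATE,
            }
            row["effective_end"] = change_date - timedelta(days=1)
            rows.append(new_row)
            return new_row
    raise KeyError(customer_id)


rows = [
    {"customer_id": "C002", "name": "Peter Lam", "focus": "N",
     "effective_start": date(2001, 1, 1), "effective_end": HIGH_DATE},
]
scd_type2_update(rows, "C002", date(2013, 4, 23), focus="Y")
for row in rows:
    print(row)
```

The current record is always the one whose Effective End equals the high date.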
73. SCD Type 3
• Changes of Attributes are handled in another Column(s)
o Which Attributes need to be tracked must be determined
o The Number of Changes to keep must be determined
Before Change
Customer ID  Customer Name  Gender  Salesman ID  TOP 30  Focus Customer
C001         Mary Wong      F       S001         N       N
C002         Peter Lam      M       S001         Y       N
C003         Sally Chan     F       S002         N       Y
After Change
Customer ID  Customer Name  Gender  Salesman ID 1  Salesman ID 2  Salesman ID 3  TOP 30  Focus Customer
C001         Mary Wong      F       S001           S002           S003           N       N
C002         Peter Lam      M       S001           S001           S001           Y       N
C003         Sally Chan     F       S002           S002           S002           N       Y
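A minimal Python sketch of a Type 3 change with three tracking columns. It assumes that Salesman ID 3 holds the current value, that older values shift towards Salesman ID 1, and that all three columns start with the same value; the slide does not state this explicitly, but the assumption reproduces the C001 row shown above.

```python
COLUMNS = ("salesman_id_1", "salesman_id_2", "salesman_id_3")


def scd_type3_update(row, new_value, columns=COLUMNS):
    """Type 3: shift older values one column back and store the new value
    in the last column. The oldest value is dropped once the columns are full."""
    for older, newer in zip(columns, columns[1:]):
        row[older] = row[newer]
    row[columns[-1]] = new_value


# C001 starts with salesman S001 in every tracking column
c001 = {"salesman_id_1": "S001", "salesman_id_2": "S001", "salesman_id_3": "S001"}
scd_type3_update(c001, "S002")
scd_type3_update(c001, "S003")
print(c001)
```

Unlike Type 2, the number of changes that can be kept is fixed by the number of columns.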
75. Question 1
• There is one customer record like this:
After confirmation with Peter, it turned out that the gender was
wrongly recorded.
Ask: Which Slowly Changing Dimension technique
should be chosen, and how should it be applied?
76. Answer to Question 1
Before Change
After Change
Tip: Because the gender was wrongly recorded, there is no
reason to keep a history of this kind of information, so in
general we select Type-1 to handle this change.
77. Question 2
• Elaine first joined on 1st Jan 2009 and was
transferred to the "Dress" department on 15th Jul 2013.
Ask: How do we record this change using SCD Type 2, and
what are the benefits of using SCD Type 2?
78. Answer to Question 2
Before Change
After Change
SCD Type 2 gives a full picture of her Servicing
Department history, and thus makes it easy for
Management to keep track of her Workload Performance
79. Question 3
• Below is an example that records a change of supplier for one
material using SCD Type 2. Please convert the SCD Type 2 style below
into SCD Type 3 style. The Requirement is to keep ONE set of
current Supplier Code and TWO sets of Supplier Code History
What about further changes of supplier from "500110" to "500113",
and from "500113" to "500107"?
80. Answer to Question 3
Using Type-3, we add one column (or more columns) to
keep history of the change, like this:
81. Question 4
• Company X has just closed its Branch #5; all the
employees will be transferred to Branch #7, and all
Properties in Branch #5 will be sold off.
Ask: Please use Type-1 to handle the changes.
83. What is Database
• A Database is a Collection of Data which is organized
in a way that can be easily accessed, managed and
updated
• A Data Warehouse, by contrast, stores data in a more
meaningful, business-oriented way
• E.g. Excel, Access, SQL Server, Oracle,
DB2, etc.
84. Database Structure
• From an IT point of view, a Database comprises Tables
• Each Table can have various Data Fields with
corresponding Data Types
• Each Table can have Key(s) as a Unique Identifier and
Foreign Key(s) (Mapping Keys)
• Each Table stores records on a Row basis
• Each Data Field can have different Constraints
85. Main Data Fields
In General Database    In EXCEL
• VARCHAR              • TEXT
• INTEGER              • NUMBER
• NUMERIC              • NUMBER
• DATE                 • DATE
86. Main Constraints
In General Database    In EXCEL
• CHECK                • Data > Data Validation
• NULL / NOT NULL      • Data > Data Validation
• UNIQUE               • POWERPIVOT
• PRIMARY KEY          • POWERPIVOT
• FOREIGN KEY          • POWERPIVOT
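The database-side constraints listed above can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the store and sale tables, their columns and the sample values are hypothetical, and SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
CREATE TABLE store (
    store_id   INTEGER PRIMARY KEY,             -- PRIMARY KEY
    store_code VARCHAR(10) NOT NULL UNIQUE      -- NOT NULL, UNIQUE
);
CREATE TABLE sale (
    sale_id   INTEGER PRIMARY KEY,
    store_id  INTEGER NOT NULL REFERENCES store(store_id),  -- FOREIGN KEY
    units     NUMERIC CHECK (units > 0),                    -- CHECK
    sale_date DATE
);
INSERT INTO store VALUES (1, 'HK01');
""")


def try_insert(sql):
    """Run one INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    try_insert("INSERT INTO sale VALUES (1, 1, 5, '2013-08-01')"),   # valid row
    try_insert("INSERT INTO sale VALUES (2, 99, 5, '2013-08-01')"),  # unknown store
    try_insert("INSERT INTO sale VALUES (3, 1, 0, '2013-08-01')"),   # units not > 0
    try_insert("INSERT INTO store VALUES (2, 'HK01')"),              # duplicate code
]
for result in results:
    print(result)
```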
87. Key Definition
• Candidate Key - A column that can potentially uniquely identify the
corresponding row
• Primary Key - The Candidate Key that has been assigned as the PK, i.e. the
unique identifier of the corresponding row
• Alternative Key - Any Candidate Key that has not been assigned
as the PK
• Composite Key - A combination of columns that can uniquely
identify the corresponding row
• Foreign Key - The Primary Key of another table. Consists of one or
more columns. We can also interpret it as a Mapping Key between
Tables
88. Forms of Candidate Key
• Natural Key - Has real meaning to the
corresponding row and can act as a Unique
Identifier
• Surrogate Key - System-generated Key that can
uniquely identify the corresponding row but carries
no business meaning
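As a rough illustration of the difference, the Python sketch below assigns a system-generated Surrogate Key to each Natural Key the first time it is seen. The function name and the Customer IDs used as natural keys are assumptions for illustration; real databases usually generate surrogate keys with an identity or sequence column.

```python
from itertools import count

_next_key = count(1)          # system-generated running number
surrogate_by_natural = {}     # natural key -> surrogate key


def surrogate_key(natural_key):
    """Return the surrogate key for a natural key, creating it if needed."""
    if natural_key not in surrogate_by_natural:
        surrogate_by_natural[natural_key] = next(_next_key)
    return surrogate_by_natural[natural_key]


# Hypothetical natural keys (Customer IDs); a repeated key gets the same number
keys = [surrogate_key(k) for k in ("C001", "C002", "C001")]
print(keys)
```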
89. Dimension Key
• It is a Foreign Key (Mapping Key) to a Dimension
• Appears in Fact Tables
• References the Primary Key of the corresponding Dimension Table
90. Thank you for YOUR TIME
Remember to choose Professional Consultants
instead of
Software Vendors