-
1
Data Warehousing
Case study: creating a
data warehouse for a supermarket
Ch Anwar ul Hassan (Lecturer)
Department of Computer Science and Software
Engineering
Capital University of Sciences & Technology, Islamabad
Pakistan
anwarchaudary@gmail.com
-
2
Case Study: Introduction
Introduction: About 15 years ago, information technology gave the world
a gift: e-commerce. Since then, businesses small and large have used it
to improve their outreach, customer count, sales, profit, and every other
aspect. But this was not sufficient. As data grew from MB to GB to PB,
these businesses felt a need to store this data efficiently and to use it for
improving various aspects of the business. One such domain is retail, where
customers and products are the key aspects. Which product is needed by what
type of customer, and when, are the key questions of the retail business. If they
are answered well, they can take a retail business to new heights. A data
warehouse plays an important role in answering these questions. It helps to analyze
the key aspects that improve the sales of retail stores. To know what a customer
(by customer location) buys and in which season, we need to look
over the whole data. So first we need to collect all the historical data in
one place in a standard format. This is done by preparing a data
warehouse. Many software products help with this, such as Teradata, Netezza,
Oracle, and Hadoop. Once the warehouse is prepared, we can use this
dataset in many ways to answer endless queries. In this project I have
simulated the preparation of a real-time data warehouse and the answering of
business queries.
-
3
Data Sources:
• Data is the basic requirement of any data warehouse. Data
for this data warehouse is collected from three different
datasets. The first is from a global supermarket store,
from which I took data for five stores in different
locations in the USA for the year 2012. The second is the revenue
collected by each store in each month. The third
specifies which month falls in which season in the USA.
• The three datasets are easily joined together, as all of
them share the same month_id, which is
used in the SQL lookup query to populate data in the fact table.
-
4
Data Sources:
• Data source 1: This dataset was fetched from www.Kaggle.com. Kaggle
is a repository of thousands of datasets. This dataset contains supermarket
data from across the globe. The data used in this data warehouse covers five
US states: New York, New Jersey, New Hampshire, Utah, and
Texas. Link to the dataset:
https://kaggle2.blob.core.windows.net/datasets/1048/1903/global_superstore_20
16.xlsx.zip?sv=201 5-12-
11&sr=b&sig=V6MbJAh5QVwQC8wLLiPrsC8dKochxZ354VLclEnFuWM%3D&s
e=201704-07T08%3A21%3A15Z&sp=r
• Data source 2: This is a dummy dataset generated with
Mockaroo. It contains the revenue of each month for each state.
https://www.mockaroo.com/
• Data source 3: This is the unstructured dataset, which I scraped from the site:
https://www.englishclub.com/vocabulary/time-months-of-year.htm
This data was loaded into Excel, where it looks as shown in Fig.3.1; it was then
cleaned and made structured as shown in Fig.3. This dataset holds the seasons of the USA.
-
5
Have you opened the link?
• Please take a screenshot of the data sources
and share it in the chat.
-
6
Have you opened the link?
• Arrange the data.
-
7
Data Warehouse Design and
Architecture:
Kimball’s approach is used to build this data warehouse. The goal is
to analyze retail stores in different US states: how much revenue is
generated, and how much product is sold in which month
and in which season.
Design tools for this data warehouse:
• SQL Server Management Studio
• SQL Server Integration Services
• SQL Server Analysis Services
-
8
Data Warehouse Design and
Architecture:
I have followed Kimball’s architecture, which consists of the
following steps:
Identifying the business process: We need to
define the main business processes, such as acquiring customers,
acquiring products, and the sales process. We also need to
understand at what level sales data is summarized: whether it
is at daily, weekly, or monthly level. This step helps in
determining the entities and their relationships as per the business
requirements. Later on, these entities become the dimensions
of the business. The most important entities are Customer,
Product, Location, and Time.
-
9
Data Warehouse Design and
Architecture:
• Defining the grain: The grain is the depth at which we need to
store the data for these dimensions. It defines the granularity of
the system. In this project we store sales of each
product at month level.
• Defining the dimensions: Once the entities and the grain are
decided, we can decide the dimensions. This dataset contains
five dimensions.
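To make the idea of grain concrete, here is a minimal sketch that rolls transaction-level rows up to the month level chosen in this project. The rows, product names, and the helper `roll_up_to_month` are hypothetical and only illustrate the aggregation; they are not taken from the project's dataset.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (order_date, product, quantity).
orders = [
    ("2012-01-05", "Stapler", 2),
    ("2012-01-20", "Stapler", 3),
    ("2012-02-11", "Stapler", 1),
    ("2012-02-14", "Binder", 4),
]


def roll_up_to_month(rows):
    """Aggregate quantities to one row per (month, product), the chosen grain."""
    totals = defaultdict(int)
    for order_date, product, quantity in rows:
        month = order_date[:7]  # "YYYY-MM" prefix of an ISO date string
        totals[(month, product)] += quantity
    return dict(totals)


monthly = roll_up_to_month(orders)
for (month, product), quantity in sorted(monthly.items()):
    print(month, product, quantity)
```

Once data is stored at month grain, day-level questions can no longer be answered from the warehouse, which is why the grain has to be decided before the dimensions and facts.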
-
10
Data Warehouse Design and
Architecture:
• Deciding the facts of the data warehouse: The fact table
defines the measurable data we are going to store for the
dimensions. It is the pivot of the star schema: it contains all
the primary keys of the dimensions and the measurable
quantities that are used to carry out business queries.
The fact data is designed so that it helps
in identifying who our regular customers are, how to
improve the retail business given that product sales vary
by season, how much revenue is generated in
which state, and, last but not least, which is the
highest-selling product.
-
11
Advantage of Kimball’s Model:
The Kimball model takes a slightly different approach to building a data
warehouse: it follows a bottom-up approach, which helps in
merging small datasets.
• Performance of the Kimball model is better
• More focus is on dimensions, which play an important role in
analysis
• The focus of this approach is on the process of building the DW
• Creating the data warehouse is less time-consuming
-
12
Overview of building data warehouse to
carry out Business intelligence queries:-
• ETL is done in an SSIS (SQL Server Integration Services) package:
the three datasets are in Excel sheets, which are
extracted into the staging tables.
• From the staging tables, data is populated into the dimension
tables.
• With the help of the lookup tool (join), data is populated
into the fact table.
• The cube is deployed in SSAS (SQL Server Analysis Services).
• Business queries are carried out in Power BI.
-
13
Overview of building data warehouse to
carry out Business intelligence queries:-
-
14
Star Schema
Star Schema: A star schema looks like a star in which the fact table acts as the
pivot, residing in the center, while multiple dimensions are attached to
the fact table in a star-like form through foreign keys. A simple
star schema usually has one fact table and multiple dimensions, but a
complex star schema can consist of more than one fact table. Generally,
fact tables are in 3NF.
Fact Table: A fact table consists of two types of columns:
(i) Measure columns
(ii) Foreign key columns
Measure columns consist of numeric values that can be measured or
counted, while foreign key columns are the columns that act as primary
keys in the dimension tables. Measure columns can be used with or
without aggregation when analyzing a business query.
-
15
Star Schema
• Dimension Table: A dimension table consists of textual and
descriptive values. Each dimension table has its own
primary key, a unique value that identifies the other column
values in the row. The foreign key columns
in the fact table are nothing but the primary key (surrogate key)
columns of the dimension tables.
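The relationship between fact and dimension tables can be sketched in a few lines. The example below is an illustration only: it uses Python's built-in sqlite3 module instead of SQL Server, and the table and column names (`dim_product`, `dim_month`, `fact_sales`) are hypothetical. It shows a fact table whose foreign key columns point at the primary keys of two dimensions, plus one measure column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL
);
CREATE TABLE dim_month (
    month_key  INTEGER PRIMARY KEY,
    month_name TEXT NOT NULL
);
CREATE TABLE fact_sales (
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    month_key   INTEGER NOT NULL REFERENCES dim_month(month_key),
    quantity    INTEGER NOT NULL  -- measure column
);
""")

conn.execute("INSERT INTO dim_product VALUES (1, 'Stapler')")
conn.execute("INSERT INTO dim_month VALUES (1, 'January')")
conn.execute("INSERT INTO fact_sales VALUES (1, 1, 5)")


def try_insert_orphan(connection):
    """Return True if a fact row with an unknown product key is rejected."""
    try:
        connection.execute("INSERT INTO fact_sales VALUES (99, 1, 2)")
    except sqlite3.IntegrityError:
        return True
    return False


orphan_rejected = try_insert_orphan(conn)
print("orphan fact row rejected:", orphan_rejected)
```

Because every fact row must match a row in each dimension, the dimensions have to be loaded before the fact table.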
-
16
Advantage of Star Schema
A star schema has various merits which prove its efficiency as
well as its suitability for building a data warehouse.
• Easy to build an ETL process for
• Complexity is low, as queries follow direct table relationships
• Reduces the effort of normalizing, as data in
dimension tables is stored denormalized
• It is very efficient for carrying out metric analysis
• Each dimension table is directly connected to the fact table
• Navigating the data is fast because of the direct connection between
fact and dimension tables.
-
17
Design of Data Warehouse: Dim_Table:
For this retail data warehouse, five dimensions and one fact table have been
created.
Dim_Customer: The customer dimension consists of customer name, customer id, and
customer key. Customer_Key is the primary key of this dimension. It is generated
when I create the dimension, by declaring the column as [Customer_Key] INT
IDENTITY(1,1) PRIMARY KEY. Now the question is why I generated this when I already had
customer_id. A primary key must be unique: none of its values may be
repeated. But when a customer appears more than once, their id also repeats, and that
would not keep the column unique. So, to remove this redundancy, Customer_Key is
auto-generated as the primary key of this dimension. Customer_name contains the
name of the customer, and customer_id contains the id of the customer. With this
dimension we can analyze which customers are our regulars.
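The role of the auto-generated key can be sketched as follows. This is only an illustration under stated assumptions: it uses sqlite3, where `INTEGER PRIMARY KEY AUTOINCREMENT` plays the role that `IDENTITY(1,1)` plays in SQL Server, and the staging rows and customer names are made up. Loading distinct customers is one common way to fill such a dimension, not necessarily the exact query used in this project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_stage (customer_id TEXT, customer_name TEXT, product_name TEXT);
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    customer_id   TEXT NOT NULL,
    customer_name TEXT NOT NULL
);
""")

# Hypothetical staging rows: customer C-100 appears twice, so customer_id repeats.
stage_rows = [
    ("C-100", "Customer A", "Stapler"),
    ("C-100", "Customer A", "Binder"),
    ("C-200", "Customer B", "Stapler"),
]
conn.executemany("INSERT INTO main_stage VALUES (?, ?, ?)", stage_rows)

# The surrogate key is assigned by the database, one per inserted row.
conn.execute("""
    INSERT INTO dim_customer (customer_id, customer_name)
    SELECT DISTINCT customer_id, customer_name
    FROM main_stage
    ORDER BY customer_id
""")

customers = conn.execute(
    "SELECT customer_key, customer_id FROM dim_customer ORDER BY customer_key"
).fetchall()
for customer_key, customer_id in customers:
    print(customer_key, customer_id)
```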
-
18
Design of Data Warehouse: Dim_Table:
Dim_Product: The product dimension has product_key as its
primary key. Product_id contains the id of each product.
Product_name contains the name of the product sold. With the help
of this dimension we can analyze which is the highest-selling
product and which customer buys what product.
-
19
Design of Data Warehouse: Dim_Table:
Dim_Location: The location dimension has Location_Key as its
primary key. State_id is the id of the state. State_name contains
the name of the state where the store is located. Region name contains the
region of the country. This dimension is helpful in analyzing
which state or region has the highest number of customers and which
state has the highest sales. It also helps in analyzing the
revenue earned in each state or region.
-
20
Design of Data Warehouse: Dim_Table:
Dim_Source: This dimension is built from the unstructured
dataset. It has Season_key as its primary key. Se_month_id
is the id of a particular month. This dimension helps in
analyzing which month shows the highest sales and which
product sells best in each season.
-
21
Design of Data Warehouse: Dim_Table:
Dim_Month: This dimension has Month_Key as its primary
key. S_month_id contains the id of a particular month.
Month_name contains the name of the month. This dimension can be used
to analyze the highest sales in a state by month, or
which product sold the most in a month.
-
22
Design of Data Warehouse: Fact_Table:
For our retail superstore we have created one fact table, which
is connected to each dimension table through a foreign key
relationship. It has three measure columns:
• product_quantity: the quantity of product sold.
• total_sale: the sale amount per customer
visit.
• revenue: the amount of revenue generated in
the store per month.
-
23
Star Schema of Project:
The dimension tables and the fact table are connected together in a
star schema as shown in Fig 12.
-
24
Extract Transform Load(ETL) process
To build a data warehouse, the
important steps are extracting the data, then
transforming it in the staging area,
and lastly loading it into the destination. This is
known as the ETL process. To carry out the ETL
process, the SSIS toolbox is used. In the ETL
process, data from the external source is
extracted into the staging database. The next
step is the transformation stage.
The loading stage is the end of the ETL
process, in which data is loaded into the fact table.
At the end of the ETL process, data is populated
in the fact table as well as in the dimension tables, as
shown in Fig.6.
-
25
Extraction:
Data is extracted from the external source in this phase. For this
project, Excel sheets are the external source; in general it can
be any database or OLTP server. This extraction loads the
data into the staging database, which is the OLE DB
destination as shown in Fig 14.
-
26
Extraction:
All the data is extracted from these Excel
files into the database. We can also see that the data arriving in the staging phase
is stored in the database as:
dbo.Main_Stage
dbo.season_stage
dbo.state_stage etc. as shown in Fig.
-
27
Extraction:
A TRUNCATE query is run in the staging phase so that
running the package multiple times does not create duplicate data, as shown in
Fig 16.
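Why the truncate step matters can be shown with a small sketch. It assumes sqlite3 in place of SQL Server; SQLite has no TRUNCATE TABLE statement, so an unqualified DELETE stands in for it here. The staging table name follows dbo.season_stage from the slides, while the columns, the rows, and the helper `load_stage` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE season_stage (month_id INTEGER, season TEXT)")

# Hypothetical source rows, standing in for the Excel sheet.
source_rows = [(1, "Winter"), (4, "Spring"), (7, "Summer"), (10, "Autumn")]


def load_stage(connection, rows):
    """Empty the staging table, then reload it from the source rows."""
    # Stand-in for TRUNCATE TABLE, which SQLite does not provide.
    connection.execute("DELETE FROM season_stage")
    connection.executemany("INSERT INTO season_stage VALUES (?, ?)", rows)
    connection.commit()


load_stage(conn, source_rows)
load_stage(conn, source_rows)  # a second run must not duplicate rows

row_count = conn.execute("SELECT COUNT(*) FROM season_stage").fetchone()[0]
print("rows in staging after two runs:", row_count)
```

Without the delete at the start of `load_stage`, the second run would leave eight rows in staging instead of four.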
-
28
Transformation:
After the data is extracted from Excel into the staging database,
the next step is transformation. For transformation I
have used the lookup tool (join) and SQL queries, as shown in
Fig.19.2, for loading the data from the dimension tables.
-
29
Transformation:
We have five dimension tables and
one fact table in our database:
• dbo.Dim_Customer
• dbo.Dim_Location
• dbo.Dim_Month
• dbo.Dim_Product
• dbo.dim_Source
• dbo.Retail_Fact
The dimensions are shown in
Fig.17. Dimensions are one of the
important factors in analyzing data.
Column mappings must not be mismatched, as a mismatch
will terminate the ETL flow.
-
31
Loading:
After populating the dimension tables, the next step is to populate the fact
table. The fact table contains all the primary keys of the dimension
tables and some measures, which are used for analysis
with some aggregation rule. The lookup tool (joins) is
used to populate the dimension keys and the measures in the fact
table.
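The lookup step can be sketched as a join between the staging table and the dimension tables. This is a hedged illustration, not the project's actual SSIS package: it uses sqlite3, and the column names, key values, and rows are assumptions made for the example. Each staging row carries business ids; the joins replace them with the surrogate keys stored in the dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_stage   (customer_id TEXT, product_id TEXT, month_id INTEGER, quantity INTEGER);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_id TEXT);
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, product_id  TEXT);
CREATE TABLE dim_month    (month_key    INTEGER PRIMARY KEY, month_id    INTEGER);
CREATE TABLE retail_fact  (customer_key INTEGER, product_key INTEGER,
                           month_key INTEGER, product_quantity INTEGER);

-- Made-up dimension and staging rows, only to exercise the lookup.
INSERT INTO dim_customer VALUES (1, 'C-100'), (2, 'C-200');
INSERT INTO dim_product  VALUES (10, 'P-1'), (11, 'P-2');
INSERT INTO dim_month    VALUES (100, 1), (101, 2);
INSERT INTO main_stage   VALUES ('C-100', 'P-1', 1, 3), ('C-200', 'P-2', 2, 5);
""")

# Look up each surrogate key by joining on the business id, then load the fact table.
conn.execute("""
    INSERT INTO retail_fact (customer_key, product_key, month_key, product_quantity)
    SELECT c.customer_key, p.product_key, m.month_key, s.quantity
    FROM main_stage AS s
    JOIN dim_customer AS c ON c.customer_id = s.customer_id
    JOIN dim_product  AS p ON p.product_id  = s.product_id
    JOIN dim_month    AS m ON m.month_id    = s.month_id
""")

fact_rows = conn.execute(
    "SELECT customer_key, product_key, month_key, product_quantity "
    "FROM retail_fact ORDER BY customer_key"
).fetchall()
for row in fact_rows:
    print(row)
```

An inner join silently drops staging rows that have no matching dimension row, so in practice the row counts of staging and fact are worth comparing after the load.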
-
32
Loading:
-
33
Deploying the CUBE:
This phase builds a multidimensional representation of the
data with the help of a cube in SSAS, which is then used to
analyze the data on the basis of the measures present
in the fact table and the descriptive, textual data present in the
dimension tables.
Here, the project cube is successfully deployed, as shown in
Fig.20 & Fig.21. After the cube is deployed, the analysis
and reporting phase starts, in which business intelligence queries are
carried out.
-
34
Deploying the CUBE:
-
35
Business Analytics
Tool used for business queries: Power BI
Power BI is used to carry out the analysis of this
data warehouse. For the analysis, the cube is imported into
Power BI. Business queries are then carried out using the
descriptive, textual, and measurable data.
-
36
Business Analytics
The following business queries can be analyzed with the help of
our database:
Case Study:1
• Do the seasons (summer, spring, winter, autumn) in 3
different regions of the USA affect the retail store business in
terms of revenue collection?
Case Study:2
• Sales generated in different states on the basis of seasons
Case Study:3
• Analytical targeting of customers
Case Study:4
• Seasons affecting the revenue of states
-
37
Case Study:1
Do the seasons (summer, spring, winter, autumn) in 3
different regions of the USA affect the retail store business
in terms of revenue collection?
This query touches all three datasets. To answer it
we take revenue, season name, and region
name. The graph below shows how much revenue is generated
in which region and in which season.
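Outside Power BI, the same question can be expressed as a grouped query over the star schema. The sketch below is only an illustration: it uses sqlite3, the column names and the region assignments are assumptions, and the revenue figures are made up to exercise the query. They are not the project's results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, state_name TEXT, region_name TEXT);
CREATE TABLE dim_source   (season_key   INTEGER PRIMARY KEY, season_name TEXT);
CREATE TABLE retail_fact  (location_key INTEGER, season_key INTEGER, revenue REAL);

-- Made-up rows, only to exercise the query.
INSERT INTO dim_location VALUES (1, 'New York', 'East'), (2, 'New Jersey', 'East'), (3, 'Utah', 'West');
INSERT INTO dim_source   VALUES (1, 'Summer'), (2, 'Winter');
INSERT INTO retail_fact  VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 1, 4.0), (1, 2, 3.0), (3, 2, 2.0);
""")

# Total revenue per region and season: the measure is summed, the dimensions group it.
revenue_by_region_season = conn.execute("""
    SELECT l.region_name, s.season_name, SUM(f.revenue) AS total_revenue
    FROM retail_fact AS f
    JOIN dim_location AS l ON l.location_key = f.location_key
    JOIN dim_source   AS s ON s.season_key   = f.season_key
    GROUP BY l.region_name, s.season_name
    ORDER BY l.region_name, s.season_name
""").fetchall()

for region, season, total in revenue_by_region_season:
    print(region, season, total)
```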
-
38
Case Study:1
-
39
Case Study:1 - Analysis
Analysis:
From the clustered bar chart we can see
that the highest revenue is generated in the summer season, followed
by autumn and then winter, while spring is responsible for the least
revenue in each region of the USA. The graph also shows that in all
seasons the store earns most of its revenue from the Eastern US,
and the Western region comes last. This graph gives a quick
insight to the marketing and sales teams: they need to work on the
Western region to increase sales, and find the reason for spring
being so slow.
-
40
Case Study:2
Sales generated in different states on the basis of seasons
This query draws on all three datasets. To answer it,
total sale, state, and season are used. Below,
the pie chart in Fig.23 represents the sales of different states in
different seasons.
-
41
Case Study:2
-
42
Case Study: 2 - Analysis
Analysis:
This pie chart is used to analyze the store's sales in different
states in different seasons. Fig.23 shows that in the summer season
sales are highest in Texas, followed by New York.
The pie chart also shows that New York has the highest sales in the autumn
season, followed by Texas. So New York and Texas are the
biggest buyers in any season, while the rest of the states are slow in
all seasons. State therefore seems to be a very important factor in terms
of sales. We need to understand the needs of the Western US
states that our store is not able to cater to. Either we need to
change the products or increase offers, or maybe the store
manager is not very efficient. Season and state are very
important factors in the US. A product that is suitable for New
York in winter might not be suitable for Utah at the same
time. This kind of variation needs to be considered when planning store
products.
-
43
Case Study:3
Analytical Targeting of customers
To answer the above query we need to check which
customer buys the maximum number of products in which
season. Product quantity, customer name, and season are
used for targeting specific customers.
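One way to express this check in code is to total the product quantity per customer and season, then keep the top customer in each season. The sketch below assumes sqlite3, hypothetical column names, and made-up customers and quantities; the helper `top_customer_per_season` is introduced here for illustration and does not come from the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE dim_source   (season_key   INTEGER PRIMARY KEY, season_name TEXT);
CREATE TABLE retail_fact  (customer_key INTEGER, season_key INTEGER, product_quantity INTEGER);

-- Made-up rows, only to exercise the query.
INSERT INTO dim_customer VALUES (1, 'Customer A'), (2, 'Customer B');
INSERT INTO dim_source   VALUES (1, 'Summer'), (2, 'Winter');
INSERT INTO retail_fact  VALUES (1, 1, 3), (1, 1, 4), (2, 1, 5), (2, 2, 2), (1, 2, 1);
""")

# Total quantity bought by each customer in each season.
totals = conn.execute("""
    SELECT s.season_name, c.customer_name, SUM(f.product_quantity) AS total_quantity
    FROM retail_fact AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    JOIN dim_source   AS s ON s.season_key   = f.season_key
    GROUP BY s.season_name, c.customer_name
""").fetchall()


def top_customer_per_season(rows):
    """Keep, for each season, the customer with the largest total quantity."""
    best = {}
    for season, customer, quantity in rows:
        if season not in best or quantity > best[season][1]:
            best[season] = (customer, quantity)
    return best


top = top_customer_per_season(totals)
for season in sorted(top):
    print(season, top[season])
```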
-
44
Case Study:3
-
45
Case Study: 3 - Analysis
Analysis:
The donut chart in Fig.24 represents the customers who buy the
maximum number of products in the four different seasons. The figure
shows which customer bought what quantity of product in
which season. From a business point of view, we can
target these specific customers and provide some more offers to
improve our sales.
-
46
Case Study:4
Seasons affecting the revenue of States
This query also touches all three datasets. To analyze
it we used season, revenue, and state to check the
amount of revenue generated by each state in every
season.
-
47
Case Study:4
-
48
Case Study: 4 - Analysis
Analysis:
Fig.25 above shows how much
revenue is collected in each state in each season. New York
generated the highest amount of revenue in every
season, while New Hampshire generated the least. From a
business perspective, the revenue generated by New York and Texas
is significantly high.
-
49
Conclusion:
This data warehouse can help show how we can target
specific customers in each region of the country. New York
and Texas have the highest sales and the highest revenue generation,
while New Hampshire is significantly lower than each of the
other states, so sales need to be improved in New Hampshire, Utah, and
New Jersey. Seasons also play an important role in the retail
business, as sales in the summer season are the highest of all.
With the help of this data warehouse we can also examine
which product is sold in which month, so we can give some
extra offers on that particular product.

More Related Content

What's hot

Designing the business process dimensional model
Designing the business process dimensional modelDesigning the business process dimensional model
Designing the business process dimensional modelGersiton Pila Challco
 
Slowly changing dimension
Slowly changing dimension Slowly changing dimension
Slowly changing dimension Sunita Sahu
 
Data warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-designData warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-designSarita Kataria
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3Malik Alig
 
Dimensional Modelling Session 2
Dimensional Modelling Session 2Dimensional Modelling Session 2
Dimensional Modelling Session 2akitda
 
Data Warehouse Project Report
Data Warehouse Project Report Data Warehouse Project Report
Data Warehouse Project Report Tom Donoghue
 
Data warehousing and business intelligence project report
Data warehousing and business intelligence project reportData warehousing and business intelligence project report
Data warehousing and business intelligence project reportsonalighai
 
introduction to datawarehouse
introduction to datawarehouseintroduction to datawarehouse
introduction to datawarehousekiran14360
 
An introduction to data warehousing
An introduction to data warehousingAn introduction to data warehousing
An introduction to data warehousingShahed Khalili
 
Business Intelligence Data Warehouse System
Business Intelligence Data Warehouse SystemBusiness Intelligence Data Warehouse System
Business Intelligence Data Warehouse SystemKiran kumar
 
04 Dimensional Analysis - v6
04 Dimensional Analysis - v604 Dimensional Analysis - v6
04 Dimensional Analysis - v6Prithwis Mukerjee
 
Business Intelligence and Multidimensional Database
Business Intelligence and Multidimensional DatabaseBusiness Intelligence and Multidimensional Database
Business Intelligence and Multidimensional DatabaseRussel Chowdhury
 
Business Intelligence: Data Warehouses
Business Intelligence: Data WarehousesBusiness Intelligence: Data Warehouses
Business Intelligence: Data WarehousesMichael Lamont
 
Benefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topperBenefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topperBeing Topper
 

What's hot (20)

Chapter 2 - Retail Sales
Chapter 2 - Retail Sales Chapter 2 - Retail Sales
Chapter 2 - Retail Sales
 
Designing the business process dimensional model
Designing the business process dimensional modelDesigning the business process dimensional model
Designing the business process dimensional model
 
Slowly changing dimension
Slowly changing dimension Slowly changing dimension
Slowly changing dimension
 
Data warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-designData warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-design
 
Retail Data Warehouse
Retail Data WarehouseRetail Data Warehouse
Retail Data Warehouse
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3
 
ITReady DW Day2
ITReady DW Day2ITReady DW Day2
ITReady DW Day2
 
Dimensional Modelling Session 2
Dimensional Modelling Session 2Dimensional Modelling Session 2
Dimensional Modelling Session 2
 
Data Warehouse Project Report
Data Warehouse Project Report Data Warehouse Project Report
Data Warehouse Project Report
 
Data warehousing and business intelligence project report
Data warehousing and business intelligence project reportData warehousing and business intelligence project report
Data warehousing and business intelligence project report
 
introduction to datawarehouse
introduction to datawarehouseintroduction to datawarehouse
introduction to datawarehouse
 
An introduction to data warehousing
An introduction to data warehousingAn introduction to data warehousing
An introduction to data warehousing
 
Business Intelligence Data Warehouse System
Business Intelligence Data Warehouse SystemBusiness Intelligence Data Warehouse System
Business Intelligence Data Warehouse System
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
04 Dimensional Analysis - v6
04 Dimensional Analysis - v604 Dimensional Analysis - v6
04 Dimensional Analysis - v6
 
Business Intelligence and Multidimensional Database
Business Intelligence and Multidimensional DatabaseBusiness Intelligence and Multidimensional Database
Business Intelligence and Multidimensional Database
 
Business Intelligence: Data Warehouses
Business Intelligence: Data WarehousesBusiness Intelligence: Data Warehouses
Business Intelligence: Data Warehouses
 
Benefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topperBenefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topper
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 

Similar to Intro to Data warehousing lecture 15

CoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core OperationsCoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core OperationsDataBench
 
CoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core OperationsCoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core Operationst_ivanov
 
INFORMATICA EASY LEARNING ONLINE TRAINING
INFORMATICA EASY LEARNING ONLINE TRAININGINFORMATICA EASY LEARNING ONLINE TRAINING
INFORMATICA EASY LEARNING ONLINE TRAININGZaranTech LLC
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptxjainyshah20
 
Business intelligence an Overview
Business intelligence an OverviewBusiness intelligence an Overview
Business intelligence an OverviewZahra Mansoori
 
Datawarehousing with MySQL
Datawarehousing with MySQLDatawarehousing with MySQL
Datawarehousing with MySQLHarshit Parekh
 
Basics+of+Datawarehousing
Basics+of+DatawarehousingBasics+of+Datawarehousing
Basics+of+Datawarehousingtheextraaedge
 
Brazilian Ecommerce OLIST Market Data Analysis
Brazilian Ecommerce OLIST Market Data AnalysisBrazilian Ecommerce OLIST Market Data Analysis
Brazilian Ecommerce OLIST Market Data Analysiskaushikdey53
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxhajon27910
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional ModellingAshish Chandwani
 
SALES_FORECASTING of sparkflows.pdf
SALES_FORECASTING of sparkflows.pdfSALES_FORECASTING of sparkflows.pdf
SALES_FORECASTING of sparkflows.pdfSparkflows
 
Building a financial data warehouse: A lesson in empathy
Building a financial data warehouse: A lesson in empathyBuilding a financial data warehouse: A lesson in empathy
Building a financial data warehouse: A lesson in empathySolmaz Shahalizadeh
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNabclearnn
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1guest9529cb
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olapSalah Amean
 
IS 2 Long Report Pardeep kumar 1271107
IS 2  Long Report Pardeep kumar  1271107IS 2  Long Report Pardeep kumar  1271107
IS 2 Long Report Pardeep kumar 1271107TouchPoint
 
LECTURE 7.ppt.pdf
LECTURE 7.ppt.pdfLECTURE 7.ppt.pdf
LECTURE 7.ppt.pdfcikajen791
 

Similar to Intro to Data warehousing lecture 15 (20)

Dwbi Project
Dwbi ProjectDwbi Project
Dwbi Project
 
CoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core OperationsCoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core Operations
 
CoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core OperationsCoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core Operations
 
INFORMATICA EASY LEARNING ONLINE TRAINING
INFORMATICA EASY LEARNING ONLINE TRAININGINFORMATICA EASY LEARNING ONLINE TRAINING
INFORMATICA EASY LEARNING ONLINE TRAINING
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptx
 
Date Analysis .pdf
Date Analysis .pdfDate Analysis .pdf
Date Analysis .pdf
 
Business intelligence an Overview
Business intelligence an OverviewBusiness intelligence an Overview
Business intelligence an Overview
 
Datawarehousing with MySQL
Datawarehousing with MySQLDatawarehousing with MySQL
Datawarehousing with MySQL
 
Basics+of+Datawarehousing
Basics+of+DatawarehousingBasics+of+Datawarehousing
Basics+of+Datawarehousing
 
Brazilian Ecommerce OLIST Market Data Analysis
Brazilian Ecommerce OLIST Market Data AnalysisBrazilian Ecommerce OLIST Market Data Analysis
Brazilian Ecommerce OLIST Market Data Analysis
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptx
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional Modelling
 
SALES_FORECASTING of sparkflows.pdf
SALES_FORECASTING of sparkflows.pdfSALES_FORECASTING of sparkflows.pdf
SALES_FORECASTING of sparkflows.pdf
 
Building a financial data warehouse: A lesson in empathy
Building a financial data warehouse: A lesson in empathyBuilding a financial data warehouse: A lesson in empathy
Building a financial data warehouse: A lesson in empathy
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARN
 
Afdal
AfdalAfdal
Afdal
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
 
IS 2 Long Report Pardeep kumar 1271107
IS 2  Long Report Pardeep kumar  1271107IS 2  Long Report Pardeep kumar  1271107
IS 2 Long Report Pardeep kumar 1271107
 
LECTURE 7.ppt.pdf
LECTURE 7.ppt.pdfLECTURE 7.ppt.pdf
LECTURE 7.ppt.pdf
 

More from AnwarrChaudary

Intro to Data warehousing lecture 20
Intro to Data warehousing   lecture 20Intro to Data warehousing   lecture 20
Intro to Data warehousing lecture 20AnwarrChaudary
 
Intro to Data warehousing lecture 19
Intro to Data warehousing   lecture 19Intro to Data warehousing   lecture 19
Intro to Data warehousing lecture 19AnwarrChaudary
 
Intro to Data warehousing lecture 18
Intro to Data warehousing   lecture 18Intro to Data warehousing   lecture 18
Intro to Data warehousing lecture 18AnwarrChaudary
 
Intro to Data warehousing lecture 17
Intro to Data warehousing   lecture 17Intro to Data warehousing   lecture 17
Intro to Data warehousing lecture 17AnwarrChaudary
 
Intro to Data warehousing lecture 16
Intro to Data warehousing lecture 15

  • 1. - 1 Data Warehousing Case study creating a Data warehouse for super market Ch Anwar ul Hassan (Lecturer) Department of Computer Science and Software Engineering Capital University of Sciences & Technology, Islamabad Pakistan anwarchaudary@gmail.com
  • 2. - 2 Case Study - Introduction: Fifteen years ago, information technology gave the world e-commerce. Since then, businesses large and small have used it to improve their outreach, customer count, sales, and profit. But this was not sufficient. As data grew from MB to GB to PB, these businesses felt a need to store the data efficiently and to use it to improve various aspects of the business. One such domain is retail, where customers and products are the key aspects. Which product is needed by what type of customer, and when, are the key questions of a retail business. If they are answered well, they can take a retail business to new heights. A data warehouse plays an important role in answering these questions: it helps analyze the key aspects that improve the sales of retail stores. To know what a customer (by customer location) buys and in which season, we need to look over the whole data. So we first need to collect all the historical data in one place in a standard format. This is done by preparing a data warehouse. Many software products help with this, such as Teradata, Netezza, Oracle, and Hadoop. Once the warehouse is prepared, we can use this dataset in many ways to answer endless queries. In this project I have simulated real-time data warehouse preparation and the answering of business queries.
  • 3. - 3 Data Sources: • Data is the basic requirement of any data warehouse. The data for this data warehouse is collected from three different datasets. The first is from a global supermarket store, from which I took the data of five stores in different locations in the USA for the year 2012. The second is the revenue collected by each store in each month. The third indicates which month falls in which season in the USA. • The three datasets are easily combined because each of them carries the same month_id, which is used in the SQL lookup query to populate the fact table.
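The month_id lookup described above can be sketched as a three-way join. This is a minimal illustration only: it uses Python's built-in sqlite3 module in place of SQL Server, and the table names, column names other than month_id, and sample values are assumptions made up for the example, not the deck's actual data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the SQL Server staging area.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales   (month_id INTEGER, state TEXT, total_sale REAL);
CREATE TABLE revenue (month_id INTEGER, state TEXT, revenue REAL);
CREATE TABLE season  (month_id INTEGER, month_name TEXT, season_name TEXT);

-- Hypothetical sample rows, one per dataset and month.
INSERT INTO sales   VALUES (1, 'Texas', 120.0), (7, 'Texas', 300.0);
INSERT INTO revenue VALUES (1, 'Texas', 1000.0), (7, 'Texas', 2500.0);
INSERT INTO season  VALUES (1, 'January', 'Winter'), (7, 'July', 'Summer');
""")

# month_id is the shared column that ties the three datasets together.
rows = conn.execute("""
SELECT s.month_id, se.season_name, s.state, s.total_sale, r.revenue
FROM sales s
JOIN revenue r ON r.month_id = s.month_id AND r.state = s.state
JOIN season se ON se.month_id = s.month_id
ORDER BY s.month_id
""").fetchall()

for row in rows:
    print(row)
```

Each output row combines a sale, the revenue for the same state and month, and the season that month falls in, which is the shape a fact-table load needs.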
  • 4. - 4 Data Sources: • Data Source 1 - This dataset was fetched from www.Kaggle.com. Kaggle is a repository of thousands of datasets. This dataset contains supermarket data from around the globe. The data used in this data warehouse covers five states of the USA: New York, New Jersey, New Hampshire, Utah, and Texas. Link to the dataset: https://kaggle2.blob.core.windows.net/datasets/1048/1903/global_superstore_20 16.xlsx.zip?sv=201 5-12- 11&sr=b&sig=V6MbJAh5QVwQC8wLLiPrsC8dKochxZ354VLclEnFuWM%3D&s e=201704-07T08%3A21%3A15Z&sp=r • Data Source 2 - This is a dummy dataset generated with Mockaroo. It contains the revenue of each month for each state. https://www.mockaroo.com/ • Data Source 3 - This is the unstructured dataset, which I scraped from the site: https://www.englishclub.com/vocabulary/time-months-of-year.htm The data was loaded into Excel, where it looks as shown in Fig.3.1; it was then cleaned and made structured as shown in Fig.3. This dataset holds the seasons of the USA.
  • 5. - 5 Have you opened the link? • Please take a screenshot of the data sources and share it in the chat.
  • 6. - 6 Have you opened the link? • Arrange the data.
  • 7. - 7 Data Warehouse Design and Architecture: To analyze the retail stores in different states of the USA, for example how much revenue is generated and how much product is sold in which month and season, Kimball’s approach is used to build this data warehouse. Design tools for this data warehouse: • SQL Server Management Studio • SQL Server Integration Services • SQL Server Analysis Services
  • 8. - 8 Data Warehouse Design and Architecture: I have followed Kimball’s architecture, which consists of the following steps: Identification of the business process: We need to define the main business processes, such as acquiring customers, acquiring products, and then the sales process. We also need to understand the level at which sales data is summarized: daily, weekly, or monthly. This step helps determine the entities and their relationships according to the business requirements. These entities later become the dimensions of the business. The most important entities are Customer, Product, Location, and Time.
  • 9. - 9 Data Warehouse Design and Architecture: • Defining the grain: The grain is the depth at which we store data for these dimensions. It defines the granularity of the system. In this project we store product sales at the month level. • Defining the dimensions: Once the entities and the grain are decided, we can decide the dimensions. This data warehouse contains five dimensions.
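To make the idea of a month-level grain concrete, the sketch below rolls individual transactions up to one row per (year, month, product). It is only an illustration under assumptions: the transaction layout, the function name roll_up_to_month, and the sample values are hypothetical and do not come from the project's dataset.

```python
from collections import defaultdict
from datetime import date

# Hypothetical day-level transactions: (sale date, product id, quantity sold).
transactions = [
    (date(2012, 1, 5), "P1", 2),
    (date(2012, 1, 20), "P1", 3),
    (date(2012, 2, 1), "P1", 1),
    (date(2012, 1, 9), "P2", 4),
]


def roll_up_to_month(rows):
    """Aggregate day-level rows to the month grain: one total per (year, month, product)."""
    totals = defaultdict(int)
    for day, product_id, quantity in rows:
        totals[(day.year, day.month, product_id)] += quantity
    return dict(totals)


monthly = roll_up_to_month(transactions)
for key in sorted(monthly):
    print(key, monthly[key])
```

Once data is stored at this grain, questions about individual days can no longer be answered from the warehouse, which is why the grain has to be chosen before the dimensions and facts.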
  • 10. - 10 Data Warehouse Design and Architecture: • Deciding the facts of the data warehouse: The fact table defines the measurable data we store for the dimensions. It is the pivot of the star schema and contains the primary keys of all the dimensions together with the measurable quantities used to answer business queries. The fact data is designed to help identify who our regular customers are, how to improve the retail business given that product sales vary by season, how much revenue is generated in which state and, last but not least, which product sells the most.
  • 11. - 11 Advantages of Kimball’s Model: The Kimball model takes a slightly different approach to building a data warehouse: it follows a bottom-up approach, which helps in merging small datasets. • Performance of the Kimball model is better • More focus is on dimensions, which play an important role in analysis • The focus of this approach is on the process of building the DW • Creating the data warehouse is less time-consuming
  • 12. - 12 Overview of building data warehouse to carry out Business intelligence queries:- • ETL is done in an SSIS (SQL Server Integration Services) package: the three datasets are in Excel sheets, which are extracted into staging tables. • From the staging tables, data is populated into the dimension tables. • With the help of the Lookup tool (join), data is populated into the fact table. • The cube is deployed in SSAS (SQL Server Analysis Services). • Business queries are carried out in Power BI.
  • 13. - 13 Overview of building data warehouse to carry out Business intelligence queries:-
  • 14. - 14 Star Schema: A star schema looks like a star: the fact table acts as the pivot at the center, while multiple dimensions are attached to it in a star-like form through foreign keys. A simple star schema usually has one fact table and multiple dimensions, but a complex star schema can contain more than one fact table. Generally, fact tables are in 3NF. Fact Table: A fact table consists of two types of column: (i) measure columns and (ii) foreign key columns. Measure columns hold numeric values that can be measured or counted, while foreign key columns hold the values that act as primary keys in the dimension tables. Measure columns can be used with or without aggregation to analyze business queries.
  • 15. - 15 Star Schema • Dimension Table: A dimension table consists of textual and descriptive values. Each dimension table has its own primary key, a unique value that identifies the other column values of its row. The surrogate key columns that appear as foreign key columns in the fact table are simply the primary key columns of the dimension tables.
  • 16. - 16 Advantages of Star Schema: The star schema has various merits that demonstrate its efficiency and suitability for building a data warehouse. • It is easy to build an ETL process for it • Complexity is low because queried tables are directly related • It reduces the headache of normalizing, as data in dimension tables is stored in normal form • It is very efficient for metric analysis • Each dimension table is directly connected to the fact table • Navigating the data is fast because of the way the fact and dimension tables are connected.
  • 17. - 17 Design of Data Warehouse: Dim_Table: For this retail data warehouse, five dimensions and one fact table have been created. Dim_Customer: The customer dimension consists of customer name, customer id, and customer key. Customer_key is the primary key of this dimension. It is generated when I create the dimension, using the column definition [Customer_Key] INT IDENTITY(1,1) PRIMARY KEY. Why generate it when I already had customer_id? A primary key must be unique, with no repeated values, but when a customer is repeated in the data their id repeats as well, so customer_id cannot make the column unique. To avoid this, Customer_key is auto-generated as the primary key of the dimension. Customer_name contains the name of the customer and customer_id contains the id of the customer. With this dimension we can analyze who our regular customers are.
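The auto-generated surrogate key can be tried out with a small sketch. SQLite's INTEGER PRIMARY KEY AUTOINCREMENT is used here as a stand-in for SQL Server's IDENTITY(1,1); the staging table layout, the use of SELECT DISTINCT to load the dimension, and the sample customers are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical staging rows: the same customer_id repeats once per purchase.
CREATE TABLE main_stage (customer_id TEXT, customer_name TEXT, product_id TEXT);
INSERT INTO main_stage VALUES
    ('C-100', 'Alice', 'P1'),
    ('C-100', 'Alice', 'P2'),
    ('C-200', 'Bob',   'P1');

-- customer_key is generated by the database, not taken from the source data.
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id   TEXT,
    customer_name TEXT
);

INSERT INTO dim_customer (customer_id, customer_name)
SELECT DISTINCT customer_id, customer_name FROM main_stage;
""")

dim_rows = conn.execute(
    "SELECT customer_key, customer_id, customer_name FROM dim_customer ORDER BY customer_key"
).fetchall()
for row in dim_rows:
    print(row)
```

The repeated customer_id in staging would violate a primary key constraint if used directly, while the generated customer_key is unique by construction.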
  • 18. - 18 Design of Data Warehouse: Dim_Table: Dim_Product: The product dimension has product_key as its primary key. Product_id contains the id of the product. Product_name contains the name of the product sold. With the help of this dimension we can analyze which product sells the most and which customer buys which product.
  • 19. - 19 Design of Data Warehouse: Dim_Table: Dim_Location: The location dimension has Location_Key as its primary key. State_id is the id of the state. State_name contains the name of the state where the store is located. Region name contains the region of the country. This dimension is helpful in analyzing which state or region has the highest number of customers and which state has the highest sales. It also helps in analyzing the revenue earned in each state or region.
  • 20. - 20 Design of Data Warehouse: Dim_Table: Dim_Source: This dimension is built from the unstructured dataset. It has Season_key as its primary key. Se_month_id is the id of a particular month. This dimension helps in analyzing which month shows the highest sales and which product sells the most in each season.
  • 21. - 21 Design of Data Warehouse: Dim_Table: Dim_Month: This dimension has Month_Key as its primary key. S_month_id contains the id of a particular month. Month_name contains the name of the month. This dimension can be used to analyze the highest sales in a state by month, or which product sold the most in a month.
  • 22. - 22 Design of Data Warehouse: Fact_Table: For our retail superstore we have created one fact table, which is connected to each dimension table through a foreign key relationship. It has three measure columns. • product_quantity - the quantity of product sold. • total_sale - the sale amount per customer visit. • revenue - the revenue generated in the store per month.
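The five dimensions and the fact table described in these slides can be sketched as a star schema. The table names and most column names follow the slides; the Season_name and Region_name columns, the data types, and the use of SQLite instead of SQL Server are assumptions for the sake of a runnable example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dim_Customer (Customer_Key INTEGER PRIMARY KEY, Customer_id TEXT, Customer_name TEXT);
CREATE TABLE Dim_Product  (Product_Key  INTEGER PRIMARY KEY, Product_id TEXT, Product_name TEXT);
CREATE TABLE Dim_Location (Location_Key INTEGER PRIMARY KEY, State_id TEXT, State_name TEXT, Region_name TEXT);
CREATE TABLE Dim_Source   (Season_Key   INTEGER PRIMARY KEY, Se_month_id INTEGER, Season_name TEXT);
CREATE TABLE Dim_Month    (Month_Key    INTEGER PRIMARY KEY, S_month_id INTEGER, Month_name TEXT);

-- The fact table sits at the center: one foreign key per dimension plus the measures.
CREATE TABLE Retail_Fact (
    Customer_Key     INTEGER REFERENCES Dim_Customer(Customer_Key),
    Product_Key      INTEGER REFERENCES Dim_Product(Product_Key),
    Location_Key     INTEGER REFERENCES Dim_Location(Location_Key),
    Season_Key       INTEGER REFERENCES Dim_Source(Season_Key),
    Month_Key        INTEGER REFERENCES Dim_Month(Month_Key),
    product_quantity INTEGER,
    total_sale       REAL,
    revenue          REAL
);
""")

# List the dimension tables that the fact table points to.
referenced = sorted(row[2] for row in conn.execute("PRAGMA foreign_key_list(Retail_Fact)"))
print(referenced)
```

Every dimension is one join away from the fact table, which is the property that gives the star schema its name.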
  • 23. - 23 Star Schema of Project: The dimension tables and the fact table are connected together in a star schema, as shown in Fig 12.
  • 24. - 24 Extract Transform Load (ETL) process: To build a data warehouse, data is first extracted, then transformed in the staging area, and finally loaded into the destination area. This is known as the ETL process. The SSIS toolbox is used to carry out the ETL process. Data from the external source is extracted into the staging database. The next step is the transformation stage. The loading stage ends the ETL process, with data loaded into the fact table. At the end of the ETL process, data is populated in the fact table as well as in the dimension tables, as shown in Fig.6.
  • 25. - 25 Extraction: In this phase data is extracted from the external source. For this project the external sources are Excel sheets, but they could be any database or OLTP server. The extraction loads the data into the staging database, which is the OLE DB destination, as shown in Fig 14.
  • 26. - 26 Extraction: All the data is extracted from these Excel files into the database. The data that arrives in the staging phase is stored in the database as dbo.Main_Stage, dbo.season_stage, dbo.state_stage, etc., as shown in Fig.
  • 27. - 27 Extraction: A TRUNCATE query is written in the staging phase so that running the package multiple times does not generate duplicate data, as shown in Fig 16.
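The effect of truncating before each load can be shown with a short sketch. SQLite has no TRUNCATE TABLE statement, so DELETE FROM is used as a stand-in; the function name reload_stage, the staging columns, and the sample rows are hypothetical.

```python
import sqlite3


def reload_stage(conn, rows):
    """Empty the staging table, then load the freshly extracted rows."""
    with conn:  # clear and load inside one transaction
        conn.execute("DELETE FROM main_stage")  # SQLite stand-in for TRUNCATE TABLE
        conn.executemany(
            "INSERT INTO main_stage (customer_id, product_id, quantity) VALUES (?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_stage (customer_id TEXT, product_id TEXT, quantity INTEGER)")

extracted = [("C-100", "P1", 2), ("C-200", "P2", 5)]

# Running the load twice simulates re-running the ETL package.
reload_stage(conn, extracted)
reload_stage(conn, extracted)

row_count = conn.execute("SELECT COUNT(*) FROM main_stage").fetchone()[0]
print(row_count)
```

Without the clearing step, the second run would append the same rows again and double every measure downstream.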
  • 28. - 28 Transformation: After the data is extracted from Excel into the staging database, the next step is transformation. For transformation I have used the Lookup tool (join) and SQL queries, as shown in Fig.19.2, to load the data from the dimension tables.
  • 29. - 29 Transformation: We have five dimension tables and one fact table in our database. • dbo.Dim_Customer • dbo.Dim_Location • dbo.Dim_Month • dbo.Dim_Product • dbo.dim_Source • dbo.Retail_Fact These dimensions are shown in Fig.17. Dimensions are one of the important factors in analyzing data. Mappings must not be mismatched, as a mismatch will terminate the ETL flow.
  • 30. - 30 Transformation: Dimensions are one of the important factors in analyzing data. Mappings must not be mismatched, as a mismatch will terminate the ETL flow.
  • 31. - 31 Loading: After populating the dimension tables, the next step is to populate the fact table. The fact table contains the primary keys of all the dimension tables and some measures, which are used for analysis with some aggregation rule. The Lookup tool (joins) is used to populate the dimension keys and measures in the fact table.
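The lookup step can be pictured as replacing each business id in a staging row with the matching dimension key. The sketch below imitates that idea in plain Python; it is not the SSIS Lookup component itself, and the function name lookup_keys, the row layout, and the handling of unmatched rows are assumptions.

```python
def lookup_keys(stage_rows, customer_keys, product_keys):
    """Resolve business ids to dimension keys; rows with no match are set aside."""
    fact_rows, rejected = [], []
    for row in stage_rows:
        customer_key = customer_keys.get(row["customer_id"])
        product_key = product_keys.get(row["product_id"])
        if customer_key is None or product_key is None:
            rejected.append(row)  # no matching dimension row for this id
            continue
        fact_rows.append(
            {
                "customer_key": customer_key,
                "product_key": product_key,
                "product_quantity": row["quantity"],
            }
        )
    return fact_rows, rejected


# Hypothetical dimension contents: business id -> surrogate key.
customer_keys = {"C-100": 1, "C-200": 2}
product_keys = {"P1": 10, "P2": 11}

stage_rows = [
    {"customer_id": "C-100", "product_id": "P1", "quantity": 2},
    {"customer_id": "C-999", "product_id": "P1", "quantity": 1},  # unknown customer
]

fact_rows, rejected = lookup_keys(stage_rows, customer_keys, product_keys)
print(fact_rows)
print(rejected)
```

Setting unmatched rows aside rather than dropping them silently makes it easier to see when a dimension was loaded incompletely.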
  • 33. - 33 Deploying the CUBE: In this phase the data is given a multidimensional representation with the help of a cube in SSAS, which is then used to analyze the data on the basis of the measures in the fact table and the descriptive, textual data in the dimension tables. Here, the project cube is successfully deployed, as shown in Fig.20 & Fig.21. After the cube is deployed, the analysis and reporting phase starts, in which the business intelligence queries are carried out.
  • 35. - 35 Business Analytics Tool Used for Business Query: Power BI. Power BI is used to analyze this data warehouse. For the analysis, the cube is imported into Power BI. Business queries have been carried out with the help of the descriptive, textual, and measurable quantities.
  • 36. - 36 Business Analytics: The following business queries can be analyzed with the help of our database. Case Study 1: • Do the seasons (summer, spring, winter, autumn) in 3 different regions of the USA affect the retail store business in terms of revenue collection? Case Study 2: • Sales generated in different states on the basis of seasons Case Study 3: • Analytical targeting of customers Case Study 4: • Seasons affecting the revenue of states
  • 37. - 37 Case Study 1: Do the seasons (summer, spring, winter, autumn) in 3 different regions of the USA affect the retail store business in terms of revenue collection? This query touches all three datasets. To answer it we take revenue, season name, and region name. The graph below shows how much revenue is generated in which region and in which season.
  • 39. - 39 Case Study 1 - Analysis: From the clustered bar chart we can see that the highest revenue is generated in the summer season, followed by autumn and then winter, while spring accounts for the least revenue in each region of the USA. The graph also shows that in all seasons the store earns most of its revenue from the Eastern US, and the Western region comes last. This graph gives the marketing and sales teams a quick insight: they need to work on the Western region to increase sales and find out why spring is so slow.
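The revenue-by-region-and-season question in Case Study 1 corresponds to a grouped sum over the fact table joined to the location and season dimensions. The sketch below shows that query shape in SQLite; the simplified table layouts and all the numbers are made up for illustration and are not the project's results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, state_name TEXT, region_name TEXT);
CREATE TABLE dim_source   (season_key   INTEGER PRIMARY KEY, season_name TEXT);
CREATE TABLE retail_fact  (location_key INTEGER, season_key INTEGER, revenue REAL);

-- Hypothetical rows, only to exercise the query.
INSERT INTO dim_location VALUES (1, 'New York', 'East'), (2, 'Utah', 'West');
INSERT INTO dim_source   VALUES (1, 'Summer'), (2, 'Spring');
INSERT INTO retail_fact  VALUES (1, 1, 500.0), (1, 1, 250.0), (1, 2, 100.0),
                                (2, 1, 80.0),  (2, 2, 20.0);
""")

# Sum the revenue measure for every (region, season) combination.
revenue_by_region_season = conn.execute("""
SELECT l.region_name, s.season_name, SUM(f.revenue) AS revenue
FROM retail_fact f
JOIN dim_location l ON l.location_key = f.location_key
JOIN dim_source   s ON s.season_key   = f.season_key
GROUP BY l.region_name, s.season_name
ORDER BY l.region_name, s.season_name
""").fetchall()

for row in revenue_by_region_season:
    print(row)
```

A clustered bar chart of this result set, with one cluster per region and one bar per season, gives the kind of view the analysis above describes.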
  • 40. - 40 Case Study 2: Sales generated in different states on the basis of seasons. This query draws on all three datasets. To answer it, total sale, state, and season are used. The pie chart below (Fig.23) represents the sales of different states in different seasons.
  • 42. - 42 Case Study 2 - Analysis: This pie chart is used to analyze the store's sales in different states in different seasons. Fig.23 shows that sales in Texas are the highest in the summer season, followed by New York. The pie chart also shows that New York has the highest sales in the autumn season, followed by Texas. So New York and Texas are the biggest buyers in any season, while the rest of the states are slow in all seasons. State therefore seems to be a very important factor in sales. We need to understand the needs of the Western US states that our store is not able to cater to: either we need to change the products or add some offers, or perhaps the store manager is not very efficient. Season and state are both very important factors in the US. A product that is suitable for New York in winter might not be suitable for Utah at the same time. This kind of variation needs to be considered when planning store products.
  • 43. - 43 Case Study 3: Analytical targeting of customers. To answer this query we need to check which customer buys the maximum number of products in which season. Product quantity, customer name, and season are used to target specific customers.
  • 45. - 45 Case Study 3 - Analysis: The donut chart (Fig.24) represents the customers who buy the maximum number of products in the four seasons. The figure shows which customer bought what quantity of product in which season. From a business point of view, we can target these specific customers and provide additional offers to improve our sales.
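Finding the customer who buys the most in each season amounts to summing quantity per (season, customer) and keeping the largest total per season. A minimal sketch of that logic follows; the function name top_customer_per_season, the input layout, and the sample purchases are hypothetical.

```python
from collections import defaultdict


def top_customer_per_season(purchases):
    """Return {season: (customer, total quantity)} for the biggest buyer in each season."""
    totals = defaultdict(int)
    for season, customer, quantity in purchases:
        totals[(season, customer)] += quantity

    best = {}
    for (season, customer), quantity in totals.items():
        # Keep the customer with the largest total seen so far for this season.
        if season not in best or quantity > best[season][1]:
            best[season] = (customer, quantity)
    return best


# Hypothetical purchases: (season, customer name, product quantity).
purchases = [
    ("Summer", "Alice", 3),
    ("Summer", "Bob", 5),
    ("Summer", "Alice", 4),
    ("Winter", "Bob", 1),
]

top_buyers = top_customer_per_season(purchases)
print(top_buyers)
```

If two customers tie within a season, this sketch keeps whichever was encountered first; a real report would need an explicit tie-breaking rule.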
  • 46. - 46 Case Study 4: Seasons affecting the revenue of states. This query also touches all three datasets. To analyze it we used season, revenue, and state to check the amount of revenue generated by each state in every season.
  • 48. - 48 Case Study 4 - Analysis: The graphical representation above (Fig.25) shows how much revenue is collected in each state in each season. New York generated the highest revenue in each season, while New Hampshire generated the least. From a business perspective, the revenue generated by New York and Texas is significantly high.
  • 49. - 49 Conclusion: This data warehouse can help show how we can target specific customers in each region of the country. New York and Texas have the highest sales and the highest revenue, while New Hampshire has significantly less than each of the other states, so we need to improve sales in New Hampshire, Utah, and New Jersey. Seasons also play an important role in retail business, as sales in the summer season are the highest of all. With the help of this data warehouse we can also examine which product is sold in which month, so that we can give some extra offers on that particular product.