Data Warehouse
On
Retail Store
By: Siddharth Chaudhary
X16137001
MSc in Data Analytics
National College of Ireland
Table of Contents
Introduction
Data Sources
  Data source 1
  Data Source 2
  Data Source 3
Data Warehouse Design and Architecture
Design of Data Warehouse
  Dim_Customer
  Dim_Product
  Dim_Location
  Dim_Source
  Dim_Month
  Fact_Table
  Star Schema of Project
Extract Transform Load (ETL) process
  Extraction
  Transformation
  Loading
  Deploying the CUBE
Business Analytics
  Case Study:1
    Analysis
  Case Study:2
    Analysis
  Case Study:3
    Analysis
  Case Study:4
    Analysis
Conclusion
Introduction:
About 15 years ago, information technology gave the world a gift: e-commerce. Since then, businesses
large and small have used it to improve their outreach, customer count, sales, profit and every other
possible aspect. But this was not sufficient. As data grew from MB to GB to PB, these businesses felt
a need to store this data efficiently and to use it to improve various aspects of the business.
One such domain is retail, where customers and products are the key aspects. Which product is needed
by what type of customer, and when, are the key questions of the retail business. If they are answered
well, they can take a retail business to new heights. A data warehouse plays an important role in
answering these questions. It helps to analyze the key aspects that improve the sales of retail stores.
To know what customers buy and in which season, we need to look at the whole data. So first we need
to collect the whole historical data in one place in a standard format. This is done by preparing a
data warehouse. Many software products help with this, such as Teradata, Netezza, Oracle and Hadoop.
Once the warehouse is prepared, we can use this dataset in many ways to answer endless queries. In
this project I have simulated real-time data warehouse preparation and the answering of business
queries.
Data Sources:
Data is the basic requirement of any data warehouse. The data for this data warehouse is collected
from three different datasets. The first is from a global supermarket store, from which I took the data
of five stores in different locations of the USA for the year 2012. The second is the revenue collected
by each store in each month. The third indicates which month falls in which season in the USA.
The three datasets are easily joined, as each of them carries the same month_id, which is used in the
SQL lookup query to populate the fact table, as shown in fig.12
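As a minimal sketch of how a shared month_id lets the three sources be joined, the example below uses Python's built-in sqlite3 module in place of SQL Server. The table names, every column other than month_id, and all the values are assumptions made up for illustration; joining revenue on state as well as month is also an assumption.

```python
import sqlite3

# Hypothetical miniature versions of the three sources; only month_id
# is taken from the report, everything else is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales   (month_id INTEGER, state TEXT, total_sale REAL);
    CREATE TABLE revenue (month_id INTEGER, state TEXT, revenue REAL);
    CREATE TABLE season  (month_id INTEGER, month_name TEXT, season_name TEXT);

    INSERT INTO sales   VALUES (1, 'Texas', 120.0), (6, 'Texas', 300.0), (6, 'Utah', 80.0);
    INSERT INTO revenue VALUES (1, 'Texas', 1000.0), (6, 'Texas', 2500.0), (6, 'Utah', 700.0);
    INSERT INTO season  VALUES (1, 'January', 'Winter'), (6, 'June', 'Summer');
""")

# month_id is the common column that ties the three datasets together.
rows = conn.execute("""
    SELECT s.state, se.month_name, se.season_name, s.total_sale, r.revenue
    FROM sales AS s
    JOIN revenue AS r ON r.month_id = s.month_id AND r.state = s.state
    JOIN season AS se ON se.month_id = s.month_id
    ORDER BY s.month_id, s.state
""").fetchall()

for row in rows:
    print(row)
```

Each output row combines a sale with the revenue and the season of the same month, which is the shape the fact table lookup needs.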
Data source 1-
This dataset was fetched from www.Kaggle.com . Kaggle is a repository of thousands of datasets.
This dataset contains supermarket data from across the globe. The data used in this data warehouse
covers five states of the USA: New York, New Jersey, New Hampshire, Utah and Texas.
Link of the dataset:
https://kaggle2.blob.core.windows.net/datasets/1048/1903/global_superstore_2016.xlsx.zip?sv=2015-12-11&sr=b&sig=V6MbJAh5QVwQC8wLLiPrsC8dKochxZ354VLclEnFuWM%3D&se=2017-04-07T08%3A21%3A15Z&sp=r
Data Source 2-
This is a dummy dataset generated with Mockaroo. It contains the revenue of each month for each
state.
Data Source 3-
This is an unstructured dataset which I scraped from the site:
https://www.englishclub.com/vocabulary/time-months-of-year.htm
The data was loaded into Excel, where it looks as shown in Fig.3.1. It was then cleaned and made
structured, as shown in Fig.3. This dataset holds the seasons of the USA.
Fig.1
Fig.2
Data Warehouse Design and Architecture:
Kimball’s approach is used to build this Data Warehouse, in order to analyze the retail stores in
different states of the USA: how much revenue is generated, and how much product is sold in which
month and in which season.
Design Tools for this Data Warehouse:-
● SQL Server Management Studio
● SQL Server Integration Services
● SQL Server Analysis Services
I have followed Kimball’s architecture, which consists of the following steps :-
• Identification of the Business Process:- We need to define the main processes of the business,
such as acquiring customers, acquiring products, and then the sales process. We also need to
understand at what level the sales data is summarized: daily, weekly or monthly. This step
helps in determining the entities and their relationships as per the business requirements.
Later on, these entities become the dimensions of the business. The most important entities are
Customer, Product, Location and Time.
• Defining the Grain:- The grain is the depth at which we store the data for these dimensions.
It defines the granularity of the system. In this project we store the sales of each product
at month level.
• Defining the Dimensions :- Once the entities and the grain are decided, we can decide the
dimensions. This dataset contains five dimensions -
Dimension Name    Primary Key     Example
Customer          Customer_Key    Sam
Product           Product_Key     Jeans
Location          Location_Key    Chicago
Season            season_Key      Summer
Month             Month_Key       June
Table -1
These dimensions contain descriptive and textual data.
• Deciding the Facts of the Data Warehouse:- The fact table defines the measurable data we are
going to store for the dimensions. It is the pivot of the star schema, containing the primary
keys of all the dimensions and the measurable quantities used to carry out business queries.
The fact data is designed so that it helps in identifying our regular customers, how to
improve the retail business given that product sales vary by season, how much revenue is
generated in each state and, last but not least, which product sells the most.
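To make the month-level grain concrete, the following is a small sketch of rolling individual order rows up to one row per product per month. It uses Python's sqlite3 module rather than SQL Server, and the orders table, its columns and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical order-level rows, finer than the chosen grain.
    CREATE TABLE orders (order_date TEXT, product TEXT, quantity INTEGER);
    INSERT INTO orders VALUES
        ('2012-06-03', 'Jeans', 2),
        ('2012-06-21', 'Jeans', 1),
        ('2012-07-02', 'Jeans', 4);
""")

# Collapse the rows to the month grain: one row per (month, product).
monthly = conn.execute("""
    SELECT CAST(strftime('%m', order_date) AS INTEGER) AS month_id,
           product,
           SUM(quantity) AS product_quantity
    FROM orders
    GROUP BY month_id, product
    ORDER BY month_id
""").fetchall()

print(monthly)
```

Once the data is stored at this grain, questions below month level (for example, sales per day) can no longer be answered from the warehouse, which is why the grain has to be decided before the dimensions and facts.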
Advantages of Kimball’s Model: The Kimball model takes a slightly different approach to building
a data warehouse, as it follows a bottom-up approach, which helps in merging small datasets.
• Performance of the Kimball model is better
• More focus is on the dimensions, which play an important role in analysis
• The focus of this approach is on the process of building the DW
• Creating the Data Warehouse is less time-consuming
Overview of building the data warehouse to carry out business intelligence queries:-
ETL is done in an SSIS package. The three datasets are in Excel sheets, which are extracted into the
staging tables. From the staging tables, data is populated into the dimension tables. With the help of
the Lookup tool (join), data is populated into the fact table. The cube is deployed in SSAS. Business
queries are carried out in Power BI, as shown in Fig.a
Fig.3
Star Schema: A star schema looks like a star in which the fact table acts as the pivot, residing in
the center, while multiple dimensions are attached to it through foreign keys in a star-like form.
A simple star schema usually has one fact table and multiple dimensions, but a complex star schema
can consist of more than one fact table. Generally, fact tables are in 3NF.
Fact Table: A fact table consists of two types of column: (i) measure columns and (ii) foreign key
columns. Measure columns hold numeric values that can be measured or counted, while foreign key
columns hold the values that act as primary keys in the dimension tables. Measure columns can be
used with or without aggregation when analyzing a business query.
Dimension Table: A dimension table consists of textual and descriptive values. Each dimension table
has its own primary key, a unique value that identifies the other column values in the row. The
foreign key columns in the fact table are nothing but the primary key (surrogate key) columns of the
dimension tables.
Fig.4
Advantages of Star Schema: The star schema has various merits which show its efficiency as well as
its suitability for building a Data Warehouse.
• It is easy to build an ETL process for it
• Complexity is low, as the tables in a query are directly related
• It reduces the headache of normalizing, as dimension tables are kept denormalized
• It is very efficient for carrying out metric analysis
• Each dimension table is directly connected to the fact table
• Navigation of data is fast because of the direct connection between the fact and dimension tables.
Design of Data Warehouse:
For this Retail Data warehouse five dimensions and one fact table have been created.
Dim_Customer:
The Customer dimension consists of Customer name, Customer id and Customer key. Customer key is
the primary key in this dimension. It is generated when I create the dimension, by declaring the column
as [Customer_Key] INT IDENTITY(1,1) PRIMARY KEY. Why generate this key when customer_id already
exists? A primary key must be unique, with no repeated values, but because a customer can appear
more than once, their id also repeats, so customer_id alone cannot serve as a unique column. To
remove this redundancy, Customer_key is auto-generated as the primary key of this dimension.
Customer_name contains the name of the customer and customer_id contains the id of the customer.
With this dimension we can analyse who our regular customers are.
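The auto-generated key can be sketched as follows. This is an illustration in SQLite (through Python's sqlite3 module), where INTEGER PRIMARY KEY AUTOINCREMENT stands in for SQL Server's IDENTITY(1,1). The staging rows are invented, and loading one row per distinct customer is an assumption about how the dimension is filled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- SQLite stand-in for [Customer_Key] INT IDENTITY(1,1) PRIMARY KEY.
    CREATE TABLE Dim_Customer (
        Customer_Key  INTEGER PRIMARY KEY AUTOINCREMENT,
        Customer_id   TEXT NOT NULL,
        Customer_name TEXT NOT NULL
    );

    -- Hypothetical staging rows: the same customer appears once per purchase.
    CREATE TABLE Main_Stage (Customer_id TEXT, Customer_name TEXT);
    INSERT INTO Main_Stage VALUES
        ('C-100', 'Sam'), ('C-200', 'Ana'), ('C-100', 'Sam');
""")

# The key column is omitted from the INSERT so the database generates it.
conn.execute("""
    INSERT INTO Dim_Customer (Customer_id, Customer_name)
    SELECT DISTINCT Customer_id, Customer_name
    FROM Main_Stage
    ORDER BY Customer_id
""")

dim_rows = conn.execute(
    "SELECT Customer_Key, Customer_id, Customer_name "
    "FROM Dim_Customer ORDER BY Customer_Key"
).fetchall()
print(dim_rows)
```

The three staging rows become two dimension rows, each with its own generated Customer_Key, while the repeated customer_id stays in the dimension as an ordinary attribute.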
Fig 5
Fig 6
Dim_Product:
The Product dimension has product_key as its primary key. Product_id contains the id of the product.
Product_name contains the name of the product sold. With the help of this dimension we can analyze
which product sells the most and which customer buys what product.
Fig 7
Dim_Location:
The Location dimension contains Location_Key as its primary key. State_id is the id of the state.
State_name contains the name of the state where the store is located. Region name contains the region
of the country. This dimension is helpful in analyzing which state or region has the highest number of
customers and which state has the highest sales. It also helps in analyzing the revenue earned in each
state or region.
Fig 8
Dim_Source:
This dimension is built from the unstructured dataset. It contains Season_key as its primary key.
Se_month_id is the id of a particular month. This dimension helps in analyzing which month shows
the highest sales and which product sells the most in each season.
Fig 9
Dim_Month:
This dimension contains Month_Key as its primary key. S_month_id contains the id of a particular
month. Month_name contains the name of the month. This dimension can be used to analyze the highest
sales in a state by month, or which product sold the most in a month.
Fig 10
Fact_Table:
For our retail superstore we have created one fact table, which is connected to each dimension table
through a foreign key relationship. It has three measure columns.
(i) product_quantity- It contains the quantity of product sold.
(ii) total_sale- It contains the sale amount per customer visit.
(iii) revenue- It contains the amount of revenue generated in the store per month.
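A minimal sketch of such a fact table, again in SQLite through Python's sqlite3 module rather than SQL Server, is given below. The key and measure names follow the report, but the reduced dimension tables, their descriptive columns (Season_name in particular) and all values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
    -- Dimension tables, reduced to a key plus one assumed descriptive column.
    CREATE TABLE Dim_Customer (Customer_Key INTEGER PRIMARY KEY, Customer_name TEXT);
    CREATE TABLE Dim_Product  (Product_Key  INTEGER PRIMARY KEY, Product_name  TEXT);
    CREATE TABLE Dim_Location (Location_Key INTEGER PRIMARY KEY, State_name    TEXT);
    CREATE TABLE Dim_Source   (Season_key   INTEGER PRIMARY KEY, Season_name   TEXT);
    CREATE TABLE Dim_Month    (Month_Key    INTEGER PRIMARY KEY, Month_name    TEXT);

    -- Fact table: one foreign key per dimension plus the three measures.
    CREATE TABLE Retail_Fact (
        Customer_Key     INTEGER NOT NULL REFERENCES Dim_Customer (Customer_Key),
        Product_Key      INTEGER NOT NULL REFERENCES Dim_Product  (Product_Key),
        Location_Key     INTEGER NOT NULL REFERENCES Dim_Location (Location_Key),
        Season_key       INTEGER NOT NULL REFERENCES Dim_Source   (Season_key),
        Month_Key        INTEGER NOT NULL REFERENCES Dim_Month    (Month_Key),
        product_quantity INTEGER,
        total_sale       REAL,
        revenue          REAL
    );

    INSERT INTO Dim_Customer VALUES (1, 'Sam');
    INSERT INTO Dim_Product  VALUES (1, 'Jeans');
    INSERT INTO Dim_Location VALUES (1, 'Texas');
    INSERT INTO Dim_Source   VALUES (1, 'Summer');
    INSERT INTO Dim_Month    VALUES (1, 'June');
    INSERT INTO Retail_Fact  VALUES (1, 1, 1, 1, 1, 2, 80.0, 2500.0);
""")

# A fact row pointing at a customer that does not exist is refused.
try:
    conn.execute("INSERT INTO Retail_Fact VALUES (99, 1, 1, 1, 1, 1, 40.0, 2500.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

fact_count = conn.execute("SELECT COUNT(*) FROM Retail_Fact").fetchone()[0]
print(rejected, fact_count)
```

The foreign key constraints are what keep every fact row attached to a row in each of the five dimensions.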
Fig 11
Star Schema of Project:
The dimension tables and the fact table are connected together in a star schema, as shown in Fig 12.
Fig.12
Extract Transform Load(ETL) process:
To build a data warehouse, data is first extracted, then transformed in the staging area, and lastly
loaded into the destination area. This is known as the ETL process. The SSIS toolbox is used to carry
out the ETL process. In the ETL process, data from the external source is extracted into the staging
database. The next step is the transformation stage. The loading stage is the end of the ETL process,
in which data is loaded into the fact table. At the end of the ETL process, data is populated in the
fact table as well as in the dimension tables, as shown in Fig.6.
Fig.13
Extraction:
In this phase data is extracted from the external source. For this project, Excel sheets are the
external source; otherwise it could be any database or OLTP server. The extraction loads the data into
the staging database, which is the OLE DB destination, as shown in Fig 14. All the data is extracted
into the database from these Excel files. We can also see that the data which arrives in the staging
phase is stored in the database as
(i) dbo.Main_Stage
(ii) dbo.season_stage
(iii) state_stage as shown in Fig 15.
A TRUNCATE query is run in the staging phase so that repeated runs do not generate duplicate data,
as shown in Fig 16.
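The effect of emptying the staging table before each load can be sketched as below. This is an illustration with Python's sqlite3 module, not the SSIS package itself. SQLite has no TRUNCATE TABLE statement, so DELETE FROM is used as a stand-in, and the load_stage helper, the staging columns and the rows are hypothetical.

```python
import sqlite3


def load_stage(conn, rows):
    """Empty the staging table, then reload it from the source rows."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM season_stage")  # stand-in for TRUNCATE TABLE
        conn.executemany(
            "INSERT INTO season_stage (month_id, season_name) VALUES (?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE season_stage (month_id INTEGER, season_name TEXT)")

source_rows = [(1, "Winter"), (4, "Spring"), (7, "Summer"), (10, "Autumn")]

# Running the load twice leaves the same four rows, not eight.
load_stage(conn, source_rows)
load_stage(conn, source_rows)

row_count = conn.execute("SELECT COUNT(*) FROM season_stage").fetchone()[0]
print(row_count)
```

Without the delete step, the second run would append a second copy of every row to the staging table.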
Fig.14
Fig.15
Fig.16
Transformation:
After the data is extracted from Excel into the staging database, the next step is transformation.
For transformation I have used the Lookup tool (join) and a SQL query, as shown in Fig.19.2, for
loading the data from the dimension tables. We have five dimension tables and one fact table in our
database.
(i) dbo.Dim_Customer
(ii) dbo.Dim_Location
(iii) dbo.Dim_Month
(iv) dbo.Dim_Product
(v) dbo.dim_Source
(vi) dbo.Retail_Fact
These dimensions are shown in Fig.17. Dimensions are one of the important factors in analyzing data.
The column mappings must not be mismatched, as a mismatch will terminate the ETL flow.
Fig.17
Fig.18
Loading:
After populating the dimension tables, the next step is to populate the fact table. The fact table
contains the primary keys of all the dimension tables and some measures which are used for analysis
with some aggregation rule. The Lookup tool (joins) is used to populate the dimension keys and the
measures in the fact table.
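The lookup step can be sketched as a join from the staging table to the dimension tables, replacing each business id with the dimension's key. The example below uses Python's sqlite3 module with only two dimensions. The staging columns and all values are hypothetical, and the plain SQL join is an illustration of the idea, not the SSIS Lookup component.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Dim_Product (Product_Key INTEGER PRIMARY KEY, Product_id TEXT);
    CREATE TABLE Dim_Month   (Month_Key   INTEGER PRIMARY KEY, S_month_id INTEGER);
    CREATE TABLE Main_Stage  (Product_id TEXT, month_id INTEGER, quantity INTEGER);
    CREATE TABLE Retail_Fact (Product_Key INTEGER, Month_Key INTEGER,
                              product_quantity INTEGER);

    INSERT INTO Dim_Product VALUES (10, 'P-1'), (11, 'P-2');
    INSERT INTO Dim_Month   VALUES (1, 1), (2, 6);
    -- 'P-9' has no row in Dim_Product, so its lookup finds nothing.
    INSERT INTO Main_Stage  VALUES ('P-1', 6, 3), ('P-2', 1, 5), ('P-9', 6, 1);
""")

# Look up each dimension key by joining on the business id.
conn.execute("""
    INSERT INTO Retail_Fact (Product_Key, Month_Key, product_quantity)
    SELECT p.Product_Key, m.Month_Key, s.quantity
    FROM Main_Stage AS s
    JOIN Dim_Product AS p ON p.Product_id = s.Product_id
    JOIN Dim_Month   AS m ON m.S_month_id = s.month_id
""")

fact_rows = conn.execute(
    "SELECT Product_Key, Month_Key, product_quantity "
    "FROM Retail_Fact ORDER BY Product_Key"
).fetchall()

# Staging rows whose product id matched no dimension row.
unmatched = conn.execute("""
    SELECT s.Product_id
    FROM Main_Stage AS s
    LEFT JOIN Dim_Product AS p ON p.Product_id = s.Product_id
    WHERE p.Product_Key IS NULL
""").fetchall()

print(fact_rows)
print(unmatched)
```

An inner join silently leaves out staging rows that have no matching dimension row, so the sketch lists them separately. Checking for such rows is one way to notice a mismatched mapping before the fact table is used.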
Fig.19.1
Fig.19.2
Deploying the CUBE:
In this phase a multidimensional representation of the data is built with the help of a cube in SSAS.
The cube is then used to analyze the data on the basis of the measures present in the fact table and
the descriptive, textual data present in the dimension tables. Here, Project.Cube is successfully
deployed, as shown in Fig.20 & Fig.21. After the cube is deployed, the analysis and reporting phase
starts, in which business intelligence queries are carried out.
Fig.20
Fig.21
Business Analytics
Tool Used for Business Query-: Power BI
Power BI is used to carry out the analysis of this Data Warehouse. For the analysis, the cube is
imported into Power BI. Business queries have been carried out with the help of the descriptive,
textual and measurable quantities.
The following business queries can be analyzed with the help of our database.
Case Study:1
Do the seasons (summer, spring, winter, autumn) in 3 different regions of the USA affect the
retail store business in terms of revenue collection?
This query touches all three datasets. To answer it, we take revenue, season name and region name.
The graph below shows how much revenue is generated in which region and in which season.
Fig.22
Analysis:
From the clustered bar chart we can see that the highest revenue is generated in the summer season,
followed by autumn and then winter, while spring is responsible for the least revenue in each region
of the USA. The graph also shows that in all seasons the store earns most of its revenue from the
Eastern US, while the Western region comes last. This graph gives the marketing and sales teams a
quick insight: they need to work on the Western region to increase sales, and find out why spring is
so slow.
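The aggregation behind a chart of this kind can be sketched as a grouped query over the star schema. The example uses Python's sqlite3 module instead of the SSAS cube and Power BI, and the figures in it are made up purely for illustration, not taken from the project data. The Region_name and Season_name columns are assumed names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Dim_Location (Location_Key INTEGER PRIMARY KEY, Region_name TEXT);
    CREATE TABLE Dim_Source   (Season_key   INTEGER PRIMARY KEY, Season_name TEXT);
    CREATE TABLE Retail_Fact  (Location_Key INTEGER, Season_key INTEGER, revenue REAL);

    -- Illustrative values only.
    INSERT INTO Dim_Location VALUES (1, 'East'), (2, 'West');
    INSERT INTO Dim_Source   VALUES (1, 'Summer'), (2, 'Spring');
    INSERT INTO Retail_Fact  VALUES
        (1, 1, 500.0), (1, 1, 250.0), (1, 2, 100.0),
        (2, 1, 200.0), (2, 2, 50.0);
""")

# Total revenue for every (region, season) pair, largest first.
revenue_by_region_season = conn.execute("""
    SELECT l.Region_name, s.Season_name, SUM(f.revenue) AS total_revenue
    FROM Retail_Fact AS f
    JOIN Dim_Location AS l ON l.Location_Key = f.Location_Key
    JOIN Dim_Source   AS s ON s.Season_key   = f.Season_key
    GROUP BY l.Region_name, s.Season_name
    ORDER BY total_revenue DESC
""").fetchall()

for row in revenue_by_region_season:
    print(row)
```

Each output row corresponds to one bar in a clustered bar chart: one region, one season and the summed revenue.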
Case Study:2
Sales generated in different states on the basis of seasons
This query draws on all three datasets. To answer it, total sale, state and season are used. The pie
chart in Fig.23 below represents the sales of different states in different seasons.
Fig.23
Analysis:
This pie chart is used to analyze the sales of the store in different states in different seasons.
Fig.23 shows that sales in Texas are the highest in the summer season, followed by New York. The pie
chart also shows that New York has the highest sales in the autumn season, followed by Texas. So New
York and Texas are the biggest buyers in any season, while the rest of the states are slow in all
seasons. It therefore seems that state is a very important factor in terms of sales. We need to
understand the needs of the Western US states which our store is not able to cater for. Either we need
to change the products or increase offers, or maybe the store manager is not very efficient. Season
and state are very important factors in the US. A product which is suitable for New York in winter
might not be suitable for Utah at the same time. This kind of variation needs to be taken into account
while planning store products.
Case Study:3
Analytical Targeting of customers
To answer the above query we need to check which customer buys the maximum number of products
in which season. Product quantity, customer name and season are used for targeting specific
customers.
Fig.24
Analysis:
The donut chart in Fig.24 represents the customers who buy the maximum number of products in the
four different seasons. The figure shows which customer bought what quantity of product in which
season. From the business point of view, we can target specific customers and provide them with more
offers to improve our sales.
Case Study:4
Seasons affecting the revenue of states
This query also touches all three datasets. To analyze it, we used season, revenue and state to check
the amount of revenue generated by each state in every season.
Fig.25
Analysis:
The graphical representation in Fig.25 above shows how much revenue is collected in each state in
each season. New York has generated the highest amount of revenue in each season, while New
Hampshire has generated the least. From a business perspective, the revenue generation of New York
and Texas is significantly high.
Conclusion:
This data warehouse can help in showing how we can target specific customers in each region of
the country. New York and Texas have the highest sales and the highest revenue generation, while New
Hampshire is significantly lower than each of the other states, so sales need to be improved in New
Hampshire, Utah and New Jersey. Seasons also play an important role in the retail business, as sales
in the summer season are the highest of all. With the help of this Data Warehouse we can also examine
which product is sold in which month, so that we can give some extra offers on that particular product.

More Related Content

What's hot

Designing the business process dimensional model
Designing the business process dimensional modelDesigning the business process dimensional model
Designing the business process dimensional model
Gersiton Pila Challco
 
White Paper - Data Warehouse Documentation Roadmap
White Paper -  Data Warehouse Documentation RoadmapWhite Paper -  Data Warehouse Documentation Roadmap
White Paper - Data Warehouse Documentation Roadmap
David Walker
 

What's hot (20)

Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse concepts
 
Data Warehouse Project Report
Data Warehouse Project Report Data Warehouse Project Report
Data Warehouse Project Report
 
Architecting a Data Warehouse: A Case Study
Architecting a Data Warehouse: A Case StudyArchitecting a Data Warehouse: A Case Study
Architecting a Data Warehouse: A Case Study
 
Retail Data Warehouse
Retail Data WarehouseRetail Data Warehouse
Retail Data Warehouse
 
Designing the business process dimensional model
Designing the business process dimensional modelDesigning the business process dimensional model
Designing the business process dimensional model
 
Business intelligence and data warehouses
Business intelligence and data warehousesBusiness intelligence and data warehouses
Business intelligence and data warehouses
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehousing Demo PPTS | Over View | Introduction
Data warehousing Demo PPTS | Over View | Introduction Data warehousing Demo PPTS | Over View | Introduction
Data warehousing Demo PPTS | Over View | Introduction
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Dimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with ExampleDimensional Modeling Basic Concept with Example
Dimensional Modeling Basic Concept with Example
 
White Paper - Data Warehouse Documentation Roadmap
White Paper -  Data Warehouse Documentation RoadmapWhite Paper -  Data Warehouse Documentation Roadmap
White Paper - Data Warehouse Documentation Roadmap
 
Dimensional model | | Fact Tables | | Types
Dimensional model | | Fact Tables | | TypesDimensional model | | Fact Tables | | Types
Dimensional model | | Fact Tables | | Types
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Warehouse Basic Guide
Data Warehouse Basic GuideData Warehouse Basic Guide
Data Warehouse Basic Guide
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
Building a Big Data Pipeline
Building a Big Data PipelineBuilding a Big Data Pipeline
Building a Big Data Pipeline
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional Modeling
 

Similar to Data warehouse project on retail store

introduction to datawarehouse
introduction to datawarehouseintroduction to datawarehouse
introduction to datawarehouse
kiran14360
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional Modelling
Ashish Chandwani
 
Dw hk-white paper
Dw hk-white paperDw hk-white paper
Dw hk-white paper
july12jana
 
SALES_FORECASTING of sparkflows.pdf
SALES_FORECASTING of sparkflows.pdfSALES_FORECASTING of sparkflows.pdf
SALES_FORECASTING of sparkflows.pdf
Sparkflows
 
IS 2 Long Report Pardeep kumar 1271107
IS 2  Long Report Pardeep kumar  1271107IS 2  Long Report Pardeep kumar  1271107
IS 2 Long Report Pardeep kumar 1271107
TouchPoint
 
Data warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-designData warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-design
Sarita Kataria
 
Datawarehouse Overview
Datawarehouse OverviewDatawarehouse Overview
Datawarehouse Overview
ashok kumar
 

Similar to Data warehouse project on retail store (20)

Intro to Data warehousing lecture 15
Intro to Data warehousing   lecture 15Intro to Data warehousing   lecture 15
Intro to Data warehousing lecture 15
 
Dwbi Project
Dwbi ProjectDwbi Project
Dwbi Project
 
Business analytics and data warehousing
Business analytics and data warehousingBusiness analytics and data warehousing
Business analytics and data warehousing
 
introduction to datawarehouse
introduction to datawarehouseintroduction to datawarehouse
introduction to datawarehouse
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptx
 
INFORMATICA EASY LEARNING ONLINE TRAINING
INFORMATICA EASY LEARNING ONLINE TRAININGINFORMATICA EASY LEARNING ONLINE TRAINING
INFORMATICA EASY LEARNING ONLINE TRAINING
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional Modelling
 
Date Analysis .pdf
Date Analysis .pdfDate Analysis .pdf
Date Analysis .pdf
 
Dw hk-white paper
Dw hk-white paperDw hk-white paper
Dw hk-white paper
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptx
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARN
 
Iowa liquor sales
Iowa liquor salesIowa liquor sales
Iowa liquor sales
 
SALES_FORECASTING of sparkflows.pdf
SALES_FORECASTING of sparkflows.pdfSALES_FORECASTING of sparkflows.pdf
SALES_FORECASTING of sparkflows.pdf
 
IS 2 Long Report Pardeep kumar 1271107
IS 2  Long Report Pardeep kumar  1271107IS 2  Long Report Pardeep kumar  1271107
IS 2 Long Report Pardeep kumar 1271107
 
Data warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-designData warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-design
 
Black_Friday_Sales_Trushita
Black_Friday_Sales_TrushitaBlack_Friday_Sales_Trushita
Black_Friday_Sales_Trushita
 
Msbi by quontra us
Msbi by quontra usMsbi by quontra us
Msbi by quontra us
 
ETL QA
ETL QAETL QA
ETL QA
 
Datawarehouse Overview
Datawarehouse OverviewDatawarehouse Overview
Datawarehouse Overview
 
Data miningvs datawarehouse
Data miningvs datawarehouseData miningvs datawarehouse
Data miningvs datawarehouse
 

More from Siddharth Chaudhary

More from Siddharth Chaudhary (19)

Certificate importing data in python from relational database,xls and flat fi...
Certificate importing data in python from relational database,xls and flat fi...Certificate importing data in python from relational database,xls and flat fi...
Certificate importing data in python from relational database,xls and flat fi...
 
Certificate cleaning data in python
Certificate cleaning data in pythonCertificate cleaning data in python
Certificate cleaning data in python
 
Certificate network analysis
Certificate network analysisCertificate network analysis
Certificate network analysis
 
Certificate pandas foundation
Certificate pandas foundationCertificate pandas foundation
Certificate pandas foundation
 
Certificate Supervised learning with scikit learn
Certificate Supervised learning with scikit learnCertificate Supervised learning with scikit learn
Certificate Supervised learning with scikit learn
 
Certificate unsupervised learning in python
Certificate unsupervised learning in pythonCertificate unsupervised learning in python
Certificate unsupervised learning in python
 
Certificate cleaning data in r
Certificate cleaning data in rCertificate cleaning data in r
Certificate cleaning data in r
 
Machine learning project
Machine learning projectMachine learning project
Machine learning project
 
Certificate joining data in postgre sql course
Certificate joining data in postgre sql courseCertificate joining data in postgre sql course
Certificate joining data in postgre sql course
 
Certificate introduction to r for finance
Certificate introduction to r for financeCertificate introduction to r for finance
Certificate introduction to r for finance
 
Certificate forecsating using r
Certificate forecsating using rCertificate forecsating using r
Certificate forecsating using r
 
Certificate arima modeling with r
Certificate arima modeling with rCertificate arima modeling with r
Certificate arima modeling with r
 
Certificate introduction to r course
Certificate introduction to r courseCertificate introduction to r course
Certificate introduction to r course
 
Thesis report
Thesis reportThesis report
Thesis report
 
Project on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environmentProject on nypd accident analysis using hadoop environment
Project on nypd accident analysis using hadoop environment
 
Project on visualization
Project on visualizationProject on visualization
Project on visualization
 
Salesforce project
Salesforce projectSalesforce project
Salesforce project
 
Automated home secuirty project
Automated home secuirty projectAutomated home secuirty project
Automated home secuirty project
 
Statistics report
Statistics reportStatistics report
Statistics report
 

Data warehouse project on retail store

  • 1. Data Warehouse On Retail Store By: Siddharth Chaudhary X16137001 Msc in Data Analytics National College of Ireland
  • 2. Table of Contents: Introduction; Data Sources (Data Source 1, Data Source 2, Data Source 3); Data Warehouse Design and Architecture; Design of Data Warehouse (Dim_Customer, Dim_Product, Dim_Location, Dim_Source, Dim_Month, Fact_Table); Star Schema of Project; Extract Transform Load (ETL) Process (Extraction, Transformation, Loading, Deploying the CUBE); Business Analytics (Case Study 1 to Case Study 4, each with an Analysis); Conclusion
  • 3. Introduction: About 15 years ago, information technology gave the world a gift: e-commerce. Since then, businesses large and small have used it to improve their outreach, customer count, sales, and profit. But this was not sufficient. As data grew from megabytes to gigabytes to petabytes, these businesses felt a need to store the data efficiently and to use it to improve various aspects of the business. One such domain is retail, where customers and products are the key aspects. Which product is needed by what type of customer, and when, are the key questions of a retail business; answered well, they can take the business to new heights. A data warehouse plays an important role in answering them, because it helps to analyze the factors that improve the sales of retail stores. To know what customers buy and in which season, we need to look over the whole data, so we first need to collect all of the historical data in one place in a standard format. This is done by preparing a data warehouse. Many platforms help with this, such as Teradata, Netezza, Oracle, and Hadoop. Once the warehouse is prepared, we can use it in many ways to answer a wide range of queries. In this project I have simulated the preparation of a real-world data warehouse and the answering of business queries. Data Sources: Data is the basic requirement of any data warehouse. The data for this warehouse is collected from three different datasets. The first is from a global superstore, from which I took the data of five stores in different locations of the USA for the year 2012. The second is the revenue collected by each store in each month. The third indicates which month falls in which season in the USA. The three datasets are easily combined because all of them share the same month_id, which is used in the SQL lookup query that populates the fact table, as shown in Fig.12. Data Source 1: This dataset was fetched from www.Kaggle.com. Kaggle is a repository of thousands of datasets. This dataset contains supermarket data from across the globe. The data used in this data warehouse covers five states of the USA: New York, New Jersey, New Hampshire, Utah, and Texas. Link of the dataset: https://kaggle2.blob.core.windows.net/datasets/1048/1903/global_superstore_2016.xlsx.zip?sv=2015-12-11&sr=b&sig=V6MbJAh5QVwQC8wLLiPrsC8dKochxZ354VLclEnFuWM%3D&se=2017-04-07T08%3A21%3A15Z&sp=r Data Source 2: This is a dummy dataset generated with Mockaroo. It contains the revenue of each month for each state. Data Source 3: This is the unstructured dataset that I scraped from https://www.englishclub.com/vocabulary/time-months-of-year.htm and loaded into Excel, where it looks as shown in Fig.3.1. It was then cleaned and made structured, as shown in Fig.3. This dataset holds the seasons of the USA.
  • 4. Fig.1 Fig.2 Data Warehouse Design and Architecture: To analyze retail stores in different states of the USA, for example how much revenue is generated and how much product is sold in which month and season, Kimball's approach is used to build this data warehouse. Design tools for this data warehouse: ● SQL Server Management Studio ● SQL Server Integration Services ● SQL Server Analysis Services. I have followed Kimball's architecture, which consists of the following steps: • Identifying the business process: We need to define the main business processes, such as acquiring customers, acquiring products, and the sale process. We also need to understand at what level the sales data is summarized: daily, weekly, or monthly. This step helps in determining the entities and their relationships as per the business requirements. Later, these entities become the dimensions of the business. The most important entities are Customer, Product, Location, and Time. • Defining the grain: The grain is the level of detail at which we store data for these dimensions; it defines the granularity of the system. In this project we store product sales at month level.
  • 5. • Defining the dimensions: Once the entities and the grain are decided, we can decide the dimensions. This dataset contains five dimensions:
        Dimension Name    Primary Key     Example
        Customer          Customer_Key    Sam
        Product           Product_Key     Jeans
        Location          Location_Key    Chicago
        Season            Season_Key      Summer
        Month             Month_Key       June
        Table 1
    These dimensions contain descriptive, textual data. • Deciding the fact of the data warehouse: The fact table defines the measurable data we store for the dimensions. It is the pivot of the star schema: it contains the primary keys of all the dimensions together with the measurable quantities used to carry out business queries. The fact data is designed to help identify who our regular customers are, how to improve the retail business given that product sales vary by season, how much revenue is generated in which state, and, last but not least, which product sells the most. Advantages of Kimball's model: The Kimball model takes a slightly different approach to building a data warehouse, as it follows a bottom-up approach, which helps in merging small datasets. • Performance of the Kimball model is better • More focus is on the dimensions, which play an important role in analysis • The approach focuses on the process of building the data warehouse • Creating the data warehouse is less time consuming. Overview of building the data warehouse to carry out business intelligence queries: The ETL is done in an SSIS package. The three datasets are Excel sheets, which are extracted into staging tables. From the staging tables, data is populated into the dimension tables. With the help of the Lookup tool (join), data is then populated into the fact table. The cube is deployed in SSAS, and business queries are carried out in Power BI, as shown in Fig.a.
  • 6. Fig.3 Star Schema: A star schema looks like a star: the fact table acts as the pivot at the center, while multiple dimensions are attached to it in a star-like form through foreign keys. A simple star schema usually has one fact table and multiple dimensions, but a complex star schema can contain more than one fact table. Generally, fact tables are in 3NF. Fact Table: A fact table consists of two types of column: (i) measure columns and (ii) foreign key columns. Measure columns hold numeric values that can be measured or counted, while foreign key columns hold the values that act as primary keys in the dimension tables. Measure columns can be used with or without aggregation when analyzing a business query. Dimension Table: A dimension table consists of textual and descriptive values. Each dimension table has its own primary key, a unique value that identifies the row and its other column values. The foreign key columns in the fact table are simply the surrogate primary key columns of the dimension tables. Fig.4
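The fact and dimension relationship described above can be sketched in a few lines. This is only an illustration, assuming an in-memory SQLite database in place of SQL Server; the table names, column names, and values are hypothetical and are not taken from the project.

```python
import sqlite3

# Hypothetical miniature star schema: one fact table, two dimensions.
# All names and values below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, state_name TEXT);
CREATE TABLE dim_month (month_key INTEGER PRIMARY KEY, month_name TEXT);
CREATE TABLE retail_fact (
    location_key INTEGER REFERENCES dim_location(location_key),
    month_key INTEGER REFERENCES dim_month(month_key),
    total_sale REAL
);
INSERT INTO dim_location VALUES (1, 'New York'), (2, 'Texas');
INSERT INTO dim_month VALUES (1, 'June'), (2, 'July');
INSERT INTO retail_fact VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 70.0);
""")

# The measure (total_sale) is aggregated; the dimension supplies the label.
rows = conn.execute("""
SELECT l.state_name, SUM(f.total_sale)
FROM retail_fact AS f
JOIN dim_location AS l ON l.location_key = f.location_key
GROUP BY l.state_name
ORDER BY l.state_name
""").fetchall()
print(rows)  # [('New York', 150.0), ('Texas', 70.0)]
```

Each dimension joins to the fact table directly, with a single join per dimension, which is the property that keeps star schema queries simple.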
  • 7. Advantages of Star Schema: The star schema has several merits that make it efficient and well suited to building a data warehouse. • It is easy to build an ETL process for it • Complexity is low, as each table in a query has a direct relationship • It reduces the normalization effort, as data in the dimension tables is stored in denormalized form • It is very efficient for metric analysis • Each dimension table is directly connected to the fact table • Navigating the data is fast because of the direct connection between the fact and dimension tables. Design of Data Warehouse: For this retail data warehouse, five dimensions and one fact table have been created. Dim_Customer: The customer dimension consists of customer name, customer id, and customer key. Customer_Key is the primary key of this dimension. It is generated when I create the dimension, by declaring the column as [Customer_Key] INT IDENTITY(1,1) PRIMARY KEY. Why generate this key when customer_id already exists? A primary key must be unique, with no repeated values, but because a customer can appear more than once, the customer_id repeats and the column is not unique. To avoid this, Customer_Key is auto-generated as the primary key of this dimension. Customer_name contains the name of the customer and customer_id contains the id of the customer. With this dimension we can analyze who our regular customers are. Fig 5 Fig 6 Dim_Product: The product dimension has Product_Key as its primary key. Product_id contains the id of the product, and Product_name contains the name of the product sold. With the help of this dimension we can analyze which product sells the most and which customer buys what product.
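The reasoning behind the auto-generated Customer_Key can be illustrated as follows. This is a minimal sketch, assuming SQLite's AUTOINCREMENT as a stand-in for SQL Server's IDENTITY(1,1); the customer ids and names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# AUTOINCREMENT here plays the role of IDENTITY(1,1) in SQL Server (assumption).
conn.execute("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id TEXT,
    customer_name TEXT
)
""")

# Hypothetical source rows: customer C-10 appears twice, so customer_id
# alone cannot serve as a unique primary key.
source_rows = [("C-10", "Sam"), ("C-11", "Ana"), ("C-10", "Sam")]
conn.executemany(
    "INSERT INTO dim_customer (customer_id, customer_name) VALUES (?, ?)",
    source_rows,
)

keys = [r[0] for r in conn.execute(
    "SELECT customer_key FROM dim_customer ORDER BY customer_key")]
ids = [r[0] for r in conn.execute(
    "SELECT customer_id FROM dim_customer ORDER BY customer_key")]
print(keys)  # [1, 2, 3]
print(ids)   # ['C-10', 'C-11', 'C-10']
```

The generated key stays unique even though customer_id repeats, which is the situation the paragraph above describes.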
  • 8. Fig 7 Dim_Location: The location dimension contains Location_Key as its primary key. State_id is the id of the state. State_name contains the name of the state where the store is located. Region name contains the region of the country. This dimension is helpful in analyzing which state or region has the highest number of customers and which state has the highest sales. It also helps in analyzing the revenue earned in each state or region. Fig 8 Dim_Source: This dimension is built from the unstructured dataset. It contains Season_key as its primary key. Se_month_id is the id of a particular month. This dimension helps in analyzing which month shows the highest sales and which product sells the most in each season. Fig 9 Dim_Month: This dimension contains Month_Key as its primary key. S_month_id contains the id of a particular month. Month_name contains the name of the month. This dimension can be used to analyze the highest sales in a state by month, or the highest-selling product in a month. Fig 10 Fact_Table: For our retail superstore we have created one fact table, which is connected to each dimension table through a foreign key relationship. It has three columns for measurement.
  • 9. (i) product_quantity: the quantity of product sold. (ii) total_sale: the sale amount for each customer visit. (iii) revenue: the amount of revenue generated in the store each month. Fig 11 Star Schema of Project: The dimension tables and the fact table are connected together in a star schema, as shown in Fig 12. Fig.12 Extract Transform Load (ETL) process: To build a data warehouse, data is first extracted, then transformed in the staging area, and lastly loaded into the destination area. This is known as the ETL process. The SSIS toolbox is used to carry out the ETL process. In the ETL process, data from the external source is extracted into the staging database. The next step is the transformation stage. The loading stage is the end of the ETL
  • 10. process, in which data is loaded into the fact table. At the end of the ETL process, data is populated in the fact table as well as in the dimension tables, as shown in Fig.6. Fig.13 Extraction: In this phase data is extracted from the external source. For this project, Excel sheets are the external source, but it could be any database or OLTP server. The extraction loads the data into the staging database, which is the OLE DB destination, as shown in Fig 14. All the data from these Excel files is extracted into the database. The data that arrives in the staging phase is stored in the database as (i) dbo.Main_Stage (ii) dbo.season_stage (iii) state_stage, as shown in Fig 15. A truncate query is run in the staging phase so that repeated runs do not generate duplicate data, as shown in Fig 16. Fig.14
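The truncate-before-load pattern mentioned above can be sketched as follows. This is an assumption-laden illustration: it uses SQLite, which has no TRUNCATE statement, so an unconditional DELETE plays that role; the staging columns, the `load_stage` helper, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical staging table; the real Main_Stage columns are not listed in the report.
conn.execute(
    "CREATE TABLE main_stage (customer_id TEXT, product_id TEXT, quantity INTEGER)")

def load_stage(conn, rows):
    """Empty the staging table, then reload it from the extracted rows."""
    # SQLite has no TRUNCATE; DELETE without WHERE clears the table instead.
    conn.execute("DELETE FROM main_stage")
    conn.executemany("INSERT INTO main_stage VALUES (?, ?, ?)", rows)
    conn.commit()

extract = [("C-10", "P-1", 2), ("C-11", "P-2", 1)]  # illustrative rows only
load_stage(conn, extract)
load_stage(conn, extract)  # a second run must not duplicate the data

stage_count = conn.execute("SELECT COUNT(*) FROM main_stage").fetchone()[0]
print(stage_count)  # 2
```

Without the clearing step, the second run would leave four rows in staging, and every downstream table would inherit the duplicates.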
  • 11. Fig.15 Fig.16 Transformation: After the data is extracted from Excel into the staging database, the next step is transformation. For the transformation I have used the Lookup tool (join) and an SQL query, as shown in Fig.19.2, for loading the data from the dimension tables. We have five dimension tables and one fact table in our database: (i) dbo.Dim_Customer (ii) dbo.Dim_Location (iii) dbo.Dim_Month (iv) dbo.Dim_Product (v) dbo.dim_Source (vi) dbo.Retail_Fact. These dimensions are shown in Fig.17. Dimensions are one of the most important factors in analyzing data. The column mappings must match, because a mismatch will terminate the ETL flow.
  • 12. Fig.17 Fig.18 Loading: After populating the dimension tables, the next step is to populate the fact table. The fact table contains the primary keys of all the dimension tables and some measures, which are used for analysis with some aggregation rule. The Lookup tool (joins) is used to populate the dimension keys and the measures in the fact table. Fig.19.1
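The lookup step can be pictured as a join from the staging rows to each dimension on its business id, returning the surrogate key. The sketch below uses SQLite rather than the SSIS Lookup transformation, and every table layout and value in it is a hypothetical stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical staging, dimension, and fact layouts with illustrative rows.
conn.executescript("""
CREATE TABLE main_stage (customer_id TEXT, month_id INTEGER, total_sale REAL);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_id TEXT);
CREATE TABLE dim_month (month_key INTEGER PRIMARY KEY, s_month_id INTEGER);
CREATE TABLE retail_fact (customer_key INTEGER, month_key INTEGER, total_sale REAL);
INSERT INTO main_stage VALUES ('C-10', 6, 120.0), ('C-11', 7, 80.0);
INSERT INTO dim_customer VALUES (1, 'C-10'), (2, 'C-11');
INSERT INTO dim_month VALUES (10, 6), (11, 7);
""")

# Each join swaps a business id from staging for the dimension's surrogate key.
conn.execute("""
INSERT INTO retail_fact (customer_key, month_key, total_sale)
SELECT c.customer_key, m.month_key, s.total_sale
FROM main_stage AS s
JOIN dim_customer AS c ON c.customer_id = s.customer_id
JOIN dim_month AS m ON m.s_month_id = s.month_id
""")

fact_rows = conn.execute(
    "SELECT customer_key, month_key, total_sale FROM retail_fact "
    "ORDER BY customer_key").fetchall()
print(fact_rows)  # [(1, 10, 120.0), (2, 11, 80.0)]
```

Because these are inner joins, a staging row whose id has no match in a dimension is silently dropped, so the dimensions must be populated before the fact table.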
  • 13. Fig.19.2 Deploying the CUBE: In this phase a multidimensional representation of the data is built as a cube in SSAS. The cube is then used to analyze the data on the basis of the measures in the fact table and the descriptive, textual data in the dimension tables. Here, Project.Cube is successfully deployed, as shown in Fig.20 and Fig.21. After the cube is deployed, the analysis and reporting phase starts, in which the business intelligence queries are carried out. Fig.20
  • 14. Fig.21 Business Analytics Tool used for business queries: Power BI. Power BI is used to carry out the analysis of this data warehouse. For the analysis, the cube is imported into Power BI, and business queries are carried out using the descriptive, textual, and measurable data. The following business queries can be analyzed with the help of our database. Case Study 1: Do the seasons (summer, spring, winter, autumn) in three different regions of the USA affect the retail store business in terms of revenue collection? This query touches all three datasets. To answer it we take revenue, season name, and region name. The graph below shows how much revenue is generated in which region and in which season. Fig.22
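Outside Power BI, the same kind of question could be expressed as a grouped query over the star schema. The sketch below is an assumption, not the project's actual query: it uses SQLite, hypothetical table names, and invented revenue figures that do not reflect the project's results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and made-up revenue values, for illustration only.
conn.executescript("""
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE dim_season (season_key INTEGER PRIMARY KEY, season_name TEXT);
CREATE TABLE retail_fact (location_key INTEGER, season_key INTEGER, revenue REAL);
INSERT INTO dim_location VALUES (1, 'East'), (2, 'West');
INSERT INTO dim_season VALUES (1, 'Summer'), (2, 'Winter');
INSERT INTO retail_fact VALUES (1, 1, 10.0), (1, 1, 5.0), (1, 2, 4.0), (2, 1, 3.0);
""")

# Revenue per region and season: the shape of data behind a clustered bar chart.
by_region_season = conn.execute("""
SELECT l.region_name, s.season_name, SUM(f.revenue)
FROM retail_fact AS f
JOIN dim_location AS l ON l.location_key = f.location_key
JOIN dim_season AS s ON s.season_key = f.season_key
GROUP BY l.region_name, s.season_name
ORDER BY l.region_name, s.season_name
""").fetchall()
print(by_region_season)
# [('East', 'Summer', 15.0), ('East', 'Winter', 4.0), ('West', 'Summer', 3.0)]
```

Each output row corresponds to one bar in a clustered bar chart, with region as the cluster and season as the series.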
  • 15. Analysis: The clustered bar chart shows that the highest revenue is generated in the summer season, followed by autumn and then winter, while spring brings in the least revenue in each region of the USA. The graph also shows that in every season the store earns most of its revenue from the Eastern US, and that the Western region stands last. This graph gives the marketing and sales teams a quick insight: they need to work on the Western region to increase sales and find out why spring is so slow. Case Study 2: Sales generated in different states on the basis of seasons. This query draws on all three datasets. To answer it, total sale, state, and season are used. The pie chart in Fig.23 represents the sales of different states in different seasons. Fig.23 Analysis: This pie chart is used to analyze the store's sales in different states in different seasons. Fig.23 shows that sales in Texas are the highest in the summer season, followed by New York. The pie chart also shows that New York has the highest sales in the autumn season, followed by Texas. So New York and Texas are the biggest buyers in any season, while the rest of the states are slow in all seasons. The state therefore seems to be a very important factor in terms of sales. We need to understand the needs of the Western US states that our store is not able to cater to: either we need to change the products or increase the offers, or perhaps the store manager is not very efficient. Season and state are both very important factors in the US. A product that is suitable for New York in winter might not be suitable for Utah at the same time, and this kind of variation needs to be considered when planning store products.
  • 16. Case Study 3: Analytical targeting of customers. To answer this query we need to check which customer buys the maximum number of products in which season. Product quantity, customer name, and season are used for targeting specific customers. Fig.24 Analysis: The donut chart in Fig.24 represents the customers who buy the maximum number of products in the four seasons. The figure shows which customer bought what quantity of product in which season. From a business point of view, we can target these specific customers and provide more offers to improve our sales. Case Study 4: Seasons affecting the revenue of states. This query also touches all three datasets. To analyze it we used seasons, revenue, and states to check the amount of revenue generated by each state in every season.
  • 17. Fig.25 Analysis: The graph in Fig.25 shows how much revenue is collected in each state in each season. New York has generated the highest revenue in each season, while New Hampshire has generated the least. From a business perspective, the revenue generated by New York and Texas is significantly high. Conclusion: This data warehouse can help show how to target specific customers in each region of the country. New York and Texas have the highest sales and the highest revenue, while New Hampshire has significantly less than each of the other states, so the focus should be on improving sales in New Hampshire, Utah, and New Jersey. Seasons also play an important role in the retail business, as sales in the summer season are the highest of all. With the help of this data warehouse we can also examine which product is sold in which month, so that we can give extra offers on that particular product.