This document presents a case study on data warehousing for an ABC Retail chain operating in the US, Canada, and Mexico. It discusses setting up a data warehouse to analyze key performance indicators related to sales, promotions, and customer preferences. Dimensions such as customers, products, dates, and stores are defined. Fact tables for sales and procurement transactions are created. A star schema links the dimensions and facts. Visualizations and analyses are proposed to measure metrics like sales by store type, product sales by store, the effect of promotions on sales, and the impact of customer professions on buying patterns.
Implemented a data warehouse on “Retail Stores of five states of the USA” using three different data sources, both structured and unstructured, with SSIS, SSAS, and Power BI.
The document describes the process of building dimensional data warehouses for three different companies - ZAGI Retail Company, City Police Department, and Big Z Inc. It provides details of the source data, dimensional models created with star schemas, and SQL insert statements to populate the fact and dimension tables. The dimensional models are analyzed by date, product, customer, and other attributes. Aggregated fact tables are also created to summarize daily sales or revenue amounts.
Here are the steps to create the RFQ:
1. Enter your purchase requisition number
2. Select all items
3. Click "Adopt" to copy item details to RFQ
4. Click "Save" to save the RFQ
This will create the RFQ with the item details copied from the purchase requisition. Now you can generate quotations by adding vendors.
The document discusses setting up forward and reverse pricing scenarios in SAP. It provides details on:
1) The sender's client needs both forward and reverse pricing scenarios, and the forward scenario is working but they are stuck on replicating it in reverse.
2) An overview of the standard forward pricing procedure including condition types and pricing sequence.
3) The request is to build a reverse procedure where the user enters the total price of Rs. 106.20 and the system calculates the base price, VAT, and service tax values.
4) Suggestions are requested on how to set up the reverse pricing scenario.
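As a rough arithmetic sketch of the requested reverse calculation (assuming, purely for illustration, a 4% VAT and a 2.2% service tax both levied on the base price, which happens to reproduce the Rs. 106.20 total; the actual SAP condition types and rates are not given here):
SELECT ROUND(total_price / 1.062, 2)         AS base_price,      -- 100.00
       ROUND(total_price / 1.062 * 0.040, 2) AS vat_value,       -- 4.00
       ROUND(total_price / 1.062 * 0.022, 2) AS service_tax_value -- 2.20
FROM (SELECT 106.20 AS total_price) t;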
What specific role of people using, developing and managing IS? Select an appr... — Muhammad Tahir Mehmood
This document discusses the roles of people in developing and managing information systems, using FrontAccounting as an example. It provides an overview of FrontAccounting, describing it as open source web-based accounting software that allows for double-entry accounting and integrated business processes. It then discusses the various modules in FrontAccounting for transactions, inquiries and reports, and maintenance in areas like sales, purchases, inventory, and manufacturing. Finally, it outlines some of the key roles people may have in using FrontAccounting, such as end users, data entry clerks, managers, programmers and database administrators.
The document outlines several key concepts in SAP Sales and Distribution including:
1) Sales organizations, distribution channels, divisions, and sales areas are the primary organizational units used to define responsibilities and group products. Each document is assigned to a specific sales area.
2) Master data such as customer, material, pricing, and output masters are critical for sales documents. Customer masters contain detailed contact and account information.
3) The sales process in SAP begins with inquiries and quotations and progresses through orders, deliveries, and billing. Inventory availability, shipping, picking, and billing are managed through this process.
The document discusses purchasing info records (PIRs) in SAP, which contain vendor and material information to help buyers determine pricing and supplier options. PIRs can be created at the plant or purchasing organization level and include fields for vendor details, material descriptions, pricing conditions, order quantities and delivery terms. The document outlines how to create, change, display and report on PIRs within SAP.
The document provides an overview of key concepts in SAP SD (Sales and Distribution) module including:
1) Organizational structure configuration such as company code, business area, plant, division, sales organization, distribution channel, and storage location.
2) The differences between cash sales and rush orders including order types, delivery timing, billing, and availability checks.
3) Master data such as customer, material, and condition masters that are used in sales documents.
4) The standard sales process flow from inquiry to billing.
• Developed and analysed a data warehouse using the SSIS ETL tool, SSDT, and SQL Server
• Provided analysed quarterly reports using SSRS covering total sales, total revenue, predicted future sales, top-selling products, and top discounted products
• Applied performance tuning to fetch rows faster from the database and performed data visualization using RStudio and Neo4j
Pivot Tables and Beyond Data Analysis in Excel 2013 - Course Technology Compu... — Cengage Learning
Pivot Tables and Beyond Data Analysis in Excel 2013 - Course Technology Computing Conference
Presenter: Patrick Carey, Cengage Learning Author
Excel is sometimes called the most popular "database" in the world, not because it's a database but because it makes data so accessible that users often turn to spreadsheets for data entry. Yet for all that, Excel's tools for data analysis and modeling remain largely untapped by the average user. In this, pivot tables may be the most powerful and least utilized tool for data exploration. In this presentation we'll examine some of the new enhancements to pivot tables introduced in Excel 2013. We'll examine how to set up relationships using the Excel Data Model to summarize information across multiple data tables. And then we'll go beyond, exploring the data modeling and data visualizing tools provided by the PowerPivot and Power View add-ins, interpreting data not just numerically but through visual imagery, charts, and interactive maps.
Webinar: Designing a schema for a Data Warehouse — Federico Razzoli
Are you new to data warehouses (DWH)? Do you need to check whether your data warehouse follows the best practices for a good design? In both cases, this webinar is for you.
A data warehouse is a central relational database that contains all measurements about a business or an organisation. This data comes from a variety of heterogeneous data sources, which includes databases of any type that back the applications used by the company, data files exported by some applications, or APIs provided by internal or external services.
But designing a data warehouse correctly is a hard task, which requires first gathering information about the business processes that need to be analysed. These processes must then be translated into so-called star schemas, that is, denormalised schemas where each table represents either a dimension or facts.
We will discuss these topics:
- How to gather information about a business;
- Understanding dictionaries and how to identify business entities;
- Dimensions and facts;
- Setting a table granularity;
- Types of facts;
- Types of dimensions;
- Snowflakes and how to avoid them;
- Expanding existing dimensions and facts.
The document discusses the use of business intelligence (BI) in the fast-moving consumer goods (FMCG) and retail industries. It outlines key performance indicators and frameworks for BI systems in retail. It also describes data modeling approaches for retail scenarios and discusses how BI can help address challenges and questions in areas like inventory, pricing, promotions and customer analytics. Major trends include increased competition and expectations, and the need for greater organizational alignment through BI.
The document discusses dimensional modeling best practices for a retail sales case study. It outlines a four-step process: 1) select the business process, 2) declare the grain, 3) choose dimensions, 4) identify facts. For the retail case, the process modeled is point-of-sale sales, with a grain of individual transactions. Key dimensions are date, product, store, and promotion. Facts include sales quantity, price, amount, and costs. The document also discusses design considerations like degenerate dimensions, extensibility, and surrogate keys.
This document summarizes key aspects of using business intelligence (BI) systems in the fast-moving consumer goods (FMCG) and retail industries. It outlines major changes in these industries, key performance indicators (KPIs) used, an example data model for a retail scenario, and how BI can help address challenges around customer analytics, supply chain management, and operations optimization. Major trends are discussed, including the need for agile infrastructure to support dynamic business demands and enhanced customer experiences.
Sales forecasting is the process of predicting future sales for a business or organization. Accurate sales forecasting can help businesses make informed decisions about inventory management, staffing, and overall strategy. Sparkflows provides a range of tools and frameworks for building machine learning models that can be used for sales forecasting.
To build a sales forecasting model using Sparkflows, you would typically follow these steps:
Data preparation: Gather and preprocess your sales data, which typically includes historical sales data, seasonality data, and other relevant data such as pricing and promotions.
Feature selection: Select the most relevant features that are likely to impact sales, such as time of year, marketing campaigns, and changes in pricing.
Model selection: Choose a machine learning algorithm to use for predicting sales, such as linear regression, time series forecasting, or neural networks.
Model training: Use historical sales data to train your model on examples of past sales trends.
Model evaluation: Evaluate the performance of your model using a set of metrics such as mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE).
Deployment: Once you have a model that performs well, you can deploy it to your production environment to make predictions on new sales data.
Sparkflows provides a variety of tools and frameworks that can help with each of these steps, including data preprocessing and transformation tools, feature selection algorithms, and machine learning libraries such as Apache Spark MLlib. Additionally, Sparkflows allows you to automate the entire workflow using pipelines, which can be useful for large-scale, complex projects.
One of the key challenges in sales forecasting is accounting for seasonality, or the regular patterns of sales that occur over time. Sparkflows provides several tools that can help with this, such as time series forecasting models and seasonal decomposition algorithms that can identify and account for seasonal trends in your sales data.
Another important aspect of sales forecasting is feature selection, or identifying which variables are likely to have the most impact on sales. Sparkflows provides a range of feature selection algorithms that can help you identify the most relevant variables, such as correlation-based feature selection and recursive feature elimination.
Overall, Sparkflows provides a comprehensive set of tools and frameworks for building accurate sales forecasting models, and can be a powerful tool for businesses looking to make informed decisions based on their sales data.
For more details visit - https://www.sparkflows.io/
This document discusses using Ruby to perform multidimensional data analysis on relational databases. It introduces Mondrian, an open-source OLAP engine that allows for multidimensional analysis on top of SQL databases using the MDX query language. A new Ruby gem called mondrian-olap will integrate Mondrian and provide a Ruby DSL and ActiveRecord-like query interface for defining OLAP schemas and performing analytical queries on relational data in a simpler way than SQL. Examples show how to write multidimensional queries in MDX and the Ruby interface to analyze sales data across dimensions like time, products, and customers.
Dimensional data modeling is a technique for database design intended to support analysis and reporting. It contains dimension tables that provide context about the business and fact tables that contain measures. Dimension tables describe attributes and may include hierarchies, while fact tables contain measurable events linked to dimensions. When designing a dimensional model, the business process, grain, dimensions, and facts are identified. Star and snowflake schemas are common types that differ in normalization of the dimensions. Slowly changing dimensions also must be accounted for.
The document describes the development and implementation of an Integrated Analytical Model (IAM) at Konica Minolta Ukraine to improve business decision making. Key points:
- The IAM integrated data from various ERP systems and financial reports into a multidimensional model to allow comprehensive analysis of KPIs across business areas over time.
- The model linked dimensions like products, customers, and time to financial and operational metrics, enabling analysis of factors like customer profitability by segment.
- Results are communicated through SharePoint reports and Excel pivot tables, providing managers targeted insights while allowing power users flexible ad-hoc analysis.
- The IAM created a "single version of truth" and
This document discusses using machine learning for demand forecasting in supply chain management. It begins by outlining problems with traditional forecasting methods and high errors affecting business decisions. It then proposes using machine learning algorithms that can learn from large datasets to more accurately model demand. Key steps discussed include collecting internal and external data, pre-processing data, building and comparing regression models, and developing a technical architecture to provide ongoing demand forecasting capabilities. The goals are to reduce errors, optimize inventory levels and pricing, and improve profits.
The document discusses data warehousing and the star schema. It defines a data warehouse as a repository of integrated information available for queries and analysis. The data comes from heterogeneous sources and can be queried together. It describes how a star schema organizes data into a central fact table surrounded by dimension tables. The fact table contains keys linking to attributes in the dimension tables. Star queries are processed by first using bitmap indexes on the fact table keys to retrieve relevant rows, then joining the results to the dimension tables.
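As a generic illustration of such a star query (the table and column names below are invented for the example, not taken from any document summarised here), the fact table is restricted through its dimension keys and then joined back to the dimension tables for descriptive attributes:
SELECT d.calendar_month,
       p.product_category,
       SUM(f.sales_amount) AS total_sales
FROM sales_fact f
JOIN date_dim    d ON d.date_key    = f.date_key
JOIN product_dim p ON p.product_key = f.product_key
WHERE d.calendar_year = 2020
GROUP BY d.calendar_month, p.product_category;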
This document describes a data warehouse and business intelligence project for analyzing Starbucks store data. It discusses extracting data from various structured, semi-structured, and unstructured sources, transforming the data using SQL and R, and loading it into a star schema data warehouse with fact and dimension tables. The data warehouse is then used for business queries and analysis in Tableau, with case studies examining city revenue, visitor and beverage sales by city, and city ratings based on food and beverage counts. The analysis finds that New York City generally has the highest revenue, visitor counts, and ratings.
Applying machine learning to a Kaggle data set to predict which prospects are most likely to become customers. A Random Forest column-importance graph is helpful for prioritizing the best segments to target.
Data Warehousing and Business Intelligence is one of the hottest skills today, and is the cornerstone for reporting, data science, and analytics. This course teaches the fundamentals with examples plus a project to fully illustrate the concepts.
This document discusses questions to ask when preparing to analyze a company's data from multiple sources. It addresses determining data sources and structure, cleaning data, identifying key performance indicators and metrics to analyze, understanding user needs, and planning how analysis results will be used and reported. The goal is to map the data lifecycle, ensure quality, and select the right analysis approach and tools to provide insights for business decisions.
The document summarizes key concepts in dimensional data warehouse design based on Ralph Kimball's approach. It discusses the four-step design process of selecting a business process, declaring the grain, choosing dimensions, and identifying facts. It then provides an example applying these steps to design a dimensional schema for a retail sales data warehouse using data from a point-of-sale system, with dimensions for date, product, store, and promotion. It also covers topics like additive/non-additive facts, degenerate dimensions, extensibility, and avoiding normalization of dimension tables.
1. Case Study on Data Warehousing
Name: Tej Narayan Shaw
Roll: 107/MBA/191107
MBA-Day (BASM), 2019-2021 Batch
Paper & Code: Business Intelligence & Data Warehousing (B30)
2. Acknowledgment
I would like to express my heartfelt gratitude towards our institute’s Director Shri. Dipankar Das Gupta, our Head of the Department Prof. Dr. Tanima Ray, our esteemed faculty Professor Subhasis Ray and all my other teachers at the Indian Institute of Social Welfare and Business Management for their guidance and invaluable advice throughout the course of this project. I would also like to acknowledge my family, friends and each and every one who contributed to this project work either directly or indirectly.
Regards,
Tej Narayan Shaw
3. Index
Problem Set: Background 4
Problem Set: Issues discussed in the presentation 5
The Business Process 6
Process to solve the problem set 7
Dimensions 8, 9, 10, 11, 12, 13
Fact Tables 14, 15
Star Schema 16
Key Performance Indicators in Sales Performance 17, 18, 19, 20
Key Performance Indicators in Promotion Performance 21
Key Performance Indicators in Customer Preferences & Buying Patterns 22, 23
Conclusion 24
Bibliography 25
4. Problem Set: Background
● ABC Retail chain operating in the USA, Canada and Mexico.
● Foodmart has the following types of stores:
○ Supermarket
○ Small Grocery
○ Gourmet Supermarket
○ Deluxe Supermarket
○ Mid-size Grocery
● Operations started in 1990.
● A total of 24 different outlets and 10 warehouses.
5. Problem Set: Issues discussed in the presentation
Sales Performance
● Sales turnover from each type of store
● Product selling as per store
Promotion Performance
● Effect of promotion on sales of products
Customer Preferences & Buying Patterns
● Effect of professions on the buying patterns
6. The Business Process
● Indents for exhausted SKUs are raised at store level
● Indents go to the corresponding warehouse in the network
● The warehouse raises POs to suppliers/manufacturers/farmers in proportion to the indents from the various stores
● Suppliers supply against the PO and raise invoices
● Payments are cleared as per the purchase invoice
● Stocks are transferred from the warehouse to the stores as per the indents
● Finally, stocks are sold to customers through various means of sale from the stores
● Sales are done through both digital and brick-and-mortar channels
7. Process to solve the problem set
● We are using Kimball’s approach of dimensional modelling for our data warehouse
● The model consists of dimensions and facts
● Facts consist of data which can be measured or used for logical processing, such as clustering, classification, trending, etc.
● Dimensions consist of the metadata about the data used in the facts
● In the context of a relational schema, dimensions are tables with primary keys and other attributes, and the fact table(s) consist of measurable attributes and use the primary keys of the dimensions as foreign keys
● Before the fact table(s) are created, the dimensions must pass through an ETL process, so that the data has a single version in the data warehouse
● The relationships between the dimensions and the fact(s) are shown using a star schema
8. Dimensions
ETL queries & relational diagram from OLTP to Dimensional Model:
CREATE TABLE Customer
(
Customer_ID INT NOT NULL,
Customer_Name CHAR(50) NOT NULL,
Customer_Location CHAR(50) NOT NULL,
Cust_Mob INT NOT NULL,
Cust_email CHAR(50) NOT NULL,
Profession_ID INT NOT NULL,
PRIMARY KEY (Customer_ID),
FOREIGN KEY (Profession_ID) REFERENCES Customer_Profession(Cust_Profession_ID)
);
CREATE TABLE Customer_Profession
(
Cust_Profession_ID INT NOT NULL,
Profession_Name CHAR(50) NOT NULL,
PRIMARY KEY (Cust_Profession_ID)
);
9. Dimensions
ETL queries & relational diagram from OLTP to Dimensional Model:
CREATE TABLE Warehouse
(
Warehouse_ID INT NOT NULL,
Warehouse_address CHAR(50) NOT NULL,
Telephone INT NOT NULL,
Email CHAR(20) NOT NULL,
PRIMARY KEY (Warehouse_ID)
);
CREATE TABLE Promotion
(
Sales_Promotion_ID INT NOT NULL,
Promotion_Campaign_Name CHAR(50) NOT NULL,
Description VARCHAR(50) NOT NULL,
Cost_of_Promotion INT NOT NULL,
PRIMARY KEY (Sales_Promotion_ID)
);
10. Dimensions
ETL queries & relational diagram from OLTP to Dimensional Model:
CREATE TABLE Product_Type
(
PD_Type_Id INT NOT NULL,
Product_Type_Name CHAR(50) NOT NULL,
PRIMARY KEY (PD_Type_Id)
);
CREATE TABLE Date
(
Date_ DATE NOT NULL,
Day INT NOT NULL,
Month INT NOT NULL,
Year INT NOT NULL,
Quarter INT NOT NULL,
PRIMARY KEY (Date_)
);
11. Dimensions
ETL queries & relational diagram from OLTP to Dimensional Model:
CREATE TABLE Warehouse
(
Warehouse_ID INT NOT NULL,
Warehouse_address CHAR(50) NOT NULL,
Telephone INT NOT NULL,
Email CHAR(20) NOT NULL,
PRIMARY KEY (Warehouse_ID)
);
CREATE TABLE Product_
(
Product_ID INT NOT NULL,
Product_Name CHAR(50) NOT NULL,
PD_Type_Id INT NOT NULL,
PRIMARY KEY (Product_ID),
FOREIGN KEY (PD_Type_Id) REFERENCES Product_Type(PD_Type_Id)
);
12. Dimensions
ETL queries & relational diagram from OLTP to Dimensional Model:
CREATE TABLE Store
(
Store_ID INT NOT NULL,
Store_Location CHAR(24) NOT NULL,
Country CHAR(20) NOT NULL,
Phone INT NOT NULL,
email_ID CHAR(50) NOT NULL,
Store_Type_ID INT NOT NULL,
PRIMARY KEY (Store_ID),
FOREIGN KEY (Store_Type_ID) REFERENCES Store_Type(Store_Type_ID)
);
CREATE TABLE Store_Type
(
Store_Type_ID INT NOT NULL,
Type_Name CHAR(50) NOT NULL,
PRIMARY KEY (Store_Type_ID)
);
13. Dimensions
ETL queries & relational diagram from OLTP to Dimensional Model:
CREATE TABLE Supplier
(
Vendor_ID INT NOT NULL,
Vendor_Name CHAR(50) NOT NULL,
Telephone INT NOT NULL,
Email CHAR(50) NOT NULL,
PRIMARY KEY (Vendor_ID)
);
CREATE TABLE Consignment_Details
(
Vehicle_Details CHAR(50) NOT NULL,
From_Location INT NOT NULL,
To_Location INT NOT NULL,
Expense INT NOT NULL,
Description VARCHAR(50) NOT NULL,
Consignment_ID INT NOT NULL,
PRIMARY KEY (Consignment_ID)
);
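Both fact tables on the next slides reference a Scaling_Unit_of_Material dimension that is not defined anywhere in the deck; a minimal sketch consistent with those foreign keys (Unit_Description is an assumed attribute, not from the original) could be:
CREATE TABLE Scaling_Unit_of_Material
(
Unit_Annotation CHAR(10) NOT NULL,
Unit_Description CHAR(50) NOT NULL, -- assumed descriptive attribute
PRIMARY KEY (Unit_Annotation)
);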
14. Fact Tables
ETL queries & relational diagram from OLTP to Dimensional Model:
CREATE TABLE Sales_Trans
(
Stock_before_Sale INT NOT NULL,
Stock_After_Sale INT NOT NULL,
Quantity_Enquired INT NOT NULL,
Quantity_Supplied INT NOT NULL,
MRP_per_unit INT NOT NULL,
Cost_per_unit INT NOT NULL,
Discount INT NOT NULL,
Revenue INT NOT NULL,
Expense INT NOT NULL,
Income INT NOT NULL,
Store_Txn_No INT NOT NULL,
Warehouse_ID INT NOT NULL,
Customer_ID INT NOT NULL,
Profession_ID INT NOT NULL,
Date_ DATE NOT NULL,
Sales_Promotion_ID INT NOT NULL,
PD_Type_Id INT NOT NULL,
Product_ID INT NOT NULL,
Unit_Annotation CHAR(10) NOT NULL,
Store_Type_ID INT NOT NULL,
Store_ID INT NOT NULL,
Consignment_ID INT NOT NULL,
PRIMARY KEY (Store_Txn_No),
FOREIGN KEY (Warehouse_ID) REFERENCES Warehouse(Warehouse_ID),
FOREIGN KEY (Customer_ID) REFERENCES Customer(Customer_ID),
FOREIGN KEY (Profession_ID) REFERENCES Customer_Profession(Cust_Profession_ID),
FOREIGN KEY (Date_) REFERENCES Date(Date_),
FOREIGN KEY (Sales_Promotion_ID) REFERENCES Promotion(Sales_Promotion_ID),
FOREIGN KEY (PD_Type_Id) REFERENCES Product_Type(PD_Type_Id),
FOREIGN KEY (Product_ID) REFERENCES Product_(Product_ID),
FOREIGN KEY (Unit_Annotation) REFERENCES Scaling_Unit_of_Material(Unit_Annotation),
FOREIGN KEY (Store_Type_ID) REFERENCES Store_Type(Store_Type_ID),
FOREIGN KEY (Store_ID) REFERENCES Store(Store_ID),
FOREIGN KEY (Consignment_ID) REFERENCES Consignment_Details(Consignment_ID)
);
15. Fact Tables
ETL queries & relational diagram from OLTP to Dimensional Model:
CREATE TABLE Procurement_Trans
(
WH_Txn_No INT NOT NULL,
In_stock INT NOT NULL,
Out_Stock INT NOT NULL,
In_Stock_Value INT NOT NULL,
Out_Stock_Value INT NOT NULL,
Warehouse_ID INT NOT NULL,
Date_ DATE NOT NULL,
PD_Type_Id INT NOT NULL,
Product_ID INT NOT NULL,
Unit_Annotation CHAR(10) NOT NULL,
Store_ID INT NOT NULL,
Vendor_ID INT NOT NULL,
Consignment_ID INT NOT NULL,
PRIMARY KEY (WH_Txn_No),
FOREIGN KEY (Warehouse_ID) REFERENCES Warehouse(Warehouse_ID),
FOREIGN KEY (Date_) REFERENCES Date(Date_),
FOREIGN KEY (PD_Type_Id) REFERENCES Product_Type(PD_Type_Id),
FOREIGN KEY (Product_ID) REFERENCES Product_(Product_ID),
FOREIGN KEY (Unit_Annotation) REFERENCES Scaling_Unit_of_Material(Unit_Annotation),
FOREIGN KEY (Store_ID) REFERENCES Store(Store_ID),
FOREIGN KEY (Vendor_ID) REFERENCES Supplier(Vendor_ID),
FOREIGN KEY (Consignment_ID) REFERENCES Consignment_Details(Consignment_ID)
);
17. Key Performance Indicators in Sales Performance
● Sales turnover from each type of store:
Flowchart and illustrative output to achieve the above KPI:
Store Type            Revenue (in $ millions)
Supermarket           76
Small Grocery         44
Gourmet Supermarket   60
Deluxe Supermarket    34
Mid-Size Grocery      88
[Data visualization of the illustration]
18. Key Performance Indicators in Sales Performance
● Sales turnover from each type of store:
Key aspects of the KPI:
● Cause-and-effect analysis of the best and worst performing store types
● Areas of study could be:
○ Location of stores
○ Mode of payments
○ Forecast of goods in demand
○ Staff behaviour
○ Competitor analysis for the particular type of store
○ Discounts
● Total number of stores in each store type vs revenue, i.e., revenue per store within a particular store type
● Total stock capacity of the various stock types
● Average proximity of the store locations to the warehouse for each store type
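A minimal query sketch for this KPI, assuming the Sales_Trans fact and Store_Type dimension defined earlier in this deck (an illustration, not part of the original problem set):
SELECT st.Type_Name,
       SUM(s.Revenue) AS Total_Revenue
FROM Sales_Trans s
JOIN Store_Type st ON st.Store_Type_ID = s.Store_Type_ID
GROUP BY st.Type_Name
ORDER BY Total_Revenue DESC;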
19. Key Performance Indicators in Sales Performance
● Product selling as per store:
Flowchart and illustrative output to achieve the above KPI:
[Data visualization of the illustration]
Store ID    Revenue (in $ '000)
1           2
2           7
3           9
4           3
5           7
6           7
7           7
8           6
9           4
10          8
11          2
12          1
13          7
14          6
15          2
16          2
17          8
18          1
19          7
20          8
21          7
22          2
23          1
24          3
20. Key Performance Indicators in Sales Performance
● Product selling as per store:
Key aspects of the KPI:
● Results of hypothesis testing on the chart in the previous slide could be used to identify stores with high and low demand for particular product(s)
● Different products of the same product type could be compared to map product distribution
● Issues related to low stock turnover could be resolved
● Efficient shelf management in the small grocery category or smaller outlets could be achieved
● Helps to study the customer behaviour hierarchy across stores
● Helps in bulk procurement of goods by the warehouses
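A minimal query sketch for product sales per store, again assuming the Sales_Trans and Product_ tables defined earlier (illustrative only):
SELECT t.Store_ID,
       p.Product_Name,
       SUM(t.Quantity_Supplied) AS Units_Sold,
       SUM(t.Revenue)           AS Total_Revenue
FROM Sales_Trans t
JOIN Product_ p ON p.Product_ID = t.Product_ID
GROUP BY t.Store_ID, p.Product_Name
ORDER BY t.Store_ID, Total_Revenue DESC;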
21. Key Performance Indicators in Promotion Performance
● Effect of promotion on sales of products: Flowchart, illustrative example and inference
Key aspects of the KPI:
● The impact of promotion on sales could be compared
● The best promotion strategy could be deduced by comparing sales of products after various types of promotions on different platforms
● Trending methods of promotion could be implemented across the product range for various locations
● Further, the relevance of various kinds of promotions for different cohorts of customers could be deduced
Promotion       Revenue (in $ '000)
Newspaper       1
Television      2
Radio           2
Leaflet         3
Facebook        6
Instagram       9
Free Samples    9
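A minimal query sketch for this KPI, using the Sales_Trans and Promotion tables defined earlier (with the campaign-name column written as a valid identifier, as on the Dimensions slide):
SELECT pr.Promotion_Campaign_Name,
       SUM(t.Revenue) AS Total_Revenue
FROM Sales_Trans t
JOIN Promotion pr ON pr.Sales_Promotion_ID = t.Sales_Promotion_ID
GROUP BY pr.Promotion_Campaign_Name
ORDER BY Total_Revenue DESC;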
22. Key Performance Indicators in Customer Preferences & Buying Patterns
● Effect of professions on the buying patterns:
Flowchart and inference to achieve the above KPI
Inference of the report under the KPI:
● The pie chart output would help to identify the professions of the regular customers
● Identification of the contribution to revenue by each cohort based on customers’ profession
● Further slicing and dicing could be done to identify the store type or store preferred by each cohort
● Slicing and dicing could be done to identify the products in various product types preferred by each cohort
● For queue management, the average invoice value of each cohort can be used to allocate billing counters in the stores dominated by each cohort
● Particular days of the week could be identified on which most transactions by a specific cohort take place
● The value chain for each such cohort could be planned further
23. Key Performance Indicators in Customer Preferences & Buying Patterns
● Effect of professions on the buying patterns: Few more flowcharts as per the above KPI
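A minimal query sketch for this KPI, assuming the Sales_Trans fact and Customer_Profession dimension defined earlier (an illustration only):
SELECT cp.Profession_Name,
       COUNT(DISTINCT t.Customer_ID) AS Customers,
       SUM(t.Revenue)                AS Total_Revenue
FROM Sales_Trans t
JOIN Customer_Profession cp ON cp.Cust_Profession_ID = t.Profession_ID
GROUP BY cp.Profession_Name
ORDER BY Total_Revenue DESC;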
24. Conclusion
● The fact table of the data warehouse should have the necessary metadata and metrics to address all the KPIs
● Data marts are generated from the data warehouse
● KPIs can be better understood using interfaces which can visualise the data
● ETL should extract the raw data and transform it for loading in a single format, which is also the format used in the data warehouse; e.g., if dates in the warehouse use the format dd-mm-yyyy but the raw data is provided as mm-dd-yyyy, then the ETL tool should transform the dates in the raw data into the dd-mm-yyyy format (see the sketch below)
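A hedged sketch of the date-format transformation described above, assuming an Oracle/PostgreSQL-style TO_DATE/TO_CHAR dialect; staging_sales and raw_txn_date are hypothetical staging names, not objects defined in this deck:
SELECT TO_CHAR(TO_DATE(raw_txn_date, 'MM-DD-YYYY'), 'DD-MM-YYYY') AS txn_date_ddmmyyyy
FROM staging_sales; -- hypothetical staging table holding the raw extract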