Data warehouse and business intelligence
project for the analysis of
Starbucks
Student Name: Sonali Gupta
Student ID: x01527245
Course: MSc Data Analytics
Table of Contents
INTRODUCTION
DATA SOURCES
TECHNOLOGY USED
DATA WAREHOUSE DESIGN AND ARCHITECTURE
Design of Data Warehouse
Business Query
Case Study 1
Case Study 2
Case Study 3
Conclusion
INTRODUCTION
Grabbing a cup of coffee in the morning is always delightful, as it gives a punch of energy to
start the day. When coffee comes with a sense of ownership and plenty of offers, the only name
that comes to my mind is Starbucks. What delights me even more is having a cup of coffee at
Starbucks and trying every new variety of coffee and other beverages. I am a big coffee lover,
and when it comes to buying a cup I always look for a Starbucks. It gives me a feeling of joy:
the way they present different varieties of coffee sourced from around the globe, and the
service they provide, are commendable.
I used to wonder how Starbucks manages its inventory and runs its business. This curiosity
made me choose Starbucks as the topic of my data warehouse project. This project is a working
model of a data warehouse for Starbucks and demonstrates its business intelligence
capabilities.
Information related to Starbucks:
Starbucks is an American coffee company that was founded in Seattle, Washington in 1971. At
present the CEO of Starbucks is Kevin Johnson, and the company has approximately 23,768
locations worldwide. Starbucks is known as the third largest fast food restaurant chain.
DATA SOURCES
1. Structured data
I found this data set on the Kaggle website. It contains the details of Starbucks store
locations worldwide. The columns are Brand, Store Number, Store Name, Ownership Type, Street
Address, City, State/Province, Country, Postcode, Phone Number, Timezone, Longitude and
Latitude. From this data I took the subset for the United States, which I used in the project.
Link of the source - https://www.kaggle.com/starbucks/store-locations/data
2. Semi-structured data
I generated this data set using the Mockaroo API. It represents a Starbucks sales report, and
its columns are year, month, revenue details, number of visitors, food sales quantity and
beverage sales quantity.
Link of the source - http://my.api.mockaroo.com/
3. Unstructured data
Unstructured data does not fit a relational table and is largely text, although it can also
carry fields such as date, points, rating and comments. I generated this data set using the
Yelp.com API. Yelp publishes reviews and ratings of local businesses (restaurants, hotels).
This data set contains the review ratings of Starbucks stores.
Link of the source - https://www.yelp.com/developers/documentation/v3/business_search
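As an illustration of the Yelp extraction step, the sketch below only builds a business search request and does not send it. It is written in Python for illustration, whereas the project itself used R; the function name, the placeholder key and the search parameters chosen here are assumptions, not the values used in the project.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Yelp Fusion business search endpoint (see the documentation link above).
YELP_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(api_key: str, location: str, limit: int = 20) -> Request:
    """Build (but do not send) a search request for Starbucks stores.

    The term/location/limit values are illustrative assumptions.
    """
    params = urlencode({"term": "Starbucks", "location": location, "limit": limit})
    return Request(
        f"{YELP_SEARCH_URL}?{params}",
        headers={"Authorization": f"Bearer {api_key}"},  # API key sent as a bearer token
    )


# "YOUR_API_KEY" is a placeholder, not a real credential.
request = build_search_request("YOUR_API_KEY", "New York, NY")
```

Passing `request` to `urllib.request.urlopen` would perform the call; the JSON response could then be flattened into a CSV file for the staging load.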
TECHNOLOGY USED
The technologies used in this project are listed below:
Database Management
• SQL Server Management Studio (SSMS)
• SQL Server Integration Services (SSIS)
Programming Language
• R is used for Twitter sentiment analysis and data cleaning.
• SQL is used to build the dimension tables and the fact table.
Additional Software
• Tableau for creating graphs.
DATA WAREHOUSE DESIGN AND ARCHITECTURE
Uses of a data warehouse:
1. Data from various sources is integrated in one place, which supports business decisions,
gives users future access to the data and saves time.
2. Historical data can be integrated in one place with common keys, common formats and a
common data model.
3. Data quality improves and reports are generated faster.
4. Business intelligence artefacts can be built on top of it, for example SSAS cubes.
Two methodologies are commonly used to design the storage layer of a data warehouse for
business intelligence: Inmon and Kimball. Each approach has its own advantages.
Kimball’s methodology uses a dimensional design approach and is also known as bottom-up
design. Data marts for reporting are created first and are then integrated to form the data
warehouse. Star and snowflake schemas are easy to create with this approach, and it delivers
business value in a short span of time. This is the reason I decided to choose this approach.
Inmon’s methodology targets an enterprise data warehouse and is also known as top-down
design. A normalised data model is created first, and the data marts required for specific
business processes are then built from it. This approach takes more time and requires more
ETL work.
The analysis covers Starbucks stores in different areas of the USA: how much revenue is
generated, the number of visitors, and the months and years with the maximum and minimum
sales of food and beverages. Kimball’s approach is therefore used to build this data warehouse.
These are the four steps for the design of a dimensional data model:
1. Select the business process.
2. Declare the grain.
3. Identify the dimensions.
4. Identify the facts.
Before deciding on my dimensions and facts, I considered at a high level what my Starbucks
data warehouse would look like and what its performance metrics would be. In this project the
Starbucks facts are kept at the atomic level, and I selected 3 dimensions according to the
need to filter and group the facts.
Fig.1
Star Schema:
The star schema is the simplest form of data warehouse schema; it is so named because its
diagram resembles a star. A star schema consists of a fact table and dimension tables: the
fact table sits in the centre and the dimension tables are joined to it. The data is thus
organised into facts and dimensions.
Fact Table:
A fact table combines foreign key columns and measure columns. Each foreign key column
references the primary key of a dimension table, and the measure columns contain the data
that is being analysed.
In the Starbucks data warehouse, the fact table brings together store details, date, location,
sales report and Yelp rating data. All of these details help in analysing the business queries.
[Architecture diagram: data sources (Kaggle structured data, Mockaroo API, Yelp API) are loaded into the staging area, then into the Starbucks DW, then into the DW cube, and finally feed the reports in BI]
Dimension Table:
In a data warehouse, a dimension table defines a dimension through its keys, attributes and
values. Every dimension table has its own primary key, which is unique within the table, and
it contains the descriptive details of each object. The star schema of dimension and fact
tables is shown in the figure below.
Fig.2
Benefits of Star Schema
If a star schema is well designed, it is easy to understand and makes large data sets easy to
analyse. The main benefits are described below:
• ETL process is easy to create
• Complexity is very low because the tables have direct relationships
• Every dimension is directly connected to the fact table
• Query Performance
• Load Performance and Administration
• Built in Referential Integrity
• Efficient Navigation through Data
Design of Data Warehouse
In this Starbucks data warehouse, three dimensions and one fact table have been created.
DimStoreDetails: The store details dimension consists of Store_id, Store_name, Store_number,
Ownership and Yelp rating. Store_id is the primary key of this dimension.
Dimlocation: The location dimension consists of Location_id, Latitude, Longitude, City,
Country, Postcode and Address. Location_id is the primary key of this dimension.
DimDate: The date dimension consists of Date_id, Year and Month. Date_id is the primary key
of this dimension.
Facttable: The Starbucks data warehouse has one fact table, which is connected to all the
dimension tables through foreign key relationships. It has four measure columns:
1. Visitor_count - the number of visitors.
2. Revenue - the revenue of the store.
3. Beverage_count - the number of beverages sold.
4. Food_count - the number of food items sold.
Fig 3
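The design described above can be sketched as table definitions. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server; the table and column names follow the report, while the data types and the Yelp_rating column name are assumptions.

```python
import sqlite3

# Column names follow the report; data types are assumptions for illustration.
SCHEMA = """
CREATE TABLE DimStoreDetails (
    Store_id     INTEGER PRIMARY KEY,
    Store_name   TEXT,
    Store_number TEXT,
    Ownership    TEXT,
    Yelp_rating  REAL
);
CREATE TABLE Dimlocation (
    Location_id INTEGER PRIMARY KEY,
    Latitude    REAL,
    Longitude   REAL,
    City        TEXT,
    Country     TEXT,
    Postcode    TEXT,
    Address     TEXT
);
CREATE TABLE DimDate (
    Date_id INTEGER PRIMARY KEY,
    Year    INTEGER,
    Month   INTEGER
);
CREATE TABLE Facttable (
    Store_id       INTEGER NOT NULL REFERENCES DimStoreDetails(Store_id),
    Location_id    INTEGER NOT NULL REFERENCES Dimlocation(Location_id),
    Date_id        INTEGER NOT NULL REFERENCES DimDate(Date_id),
    Visitor_count  INTEGER,
    Revenue        REAL,
    Beverage_count INTEGER,
    Food_count     INTEGER
);
"""


def build_schema() -> sqlite3.Connection:
    """Create the star schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript(SCHEMA)
    return conn


conn = build_schema()
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

With foreign keys enabled, a fact row that points at a missing dimension key is rejected, which is the referential integrity benefit listed for the star schema.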
Extract Transform Load (ETL) Process
The main task of any data warehouse is to rearrange, integrate and consolidate data from many
systems. ETL means that data is extracted from different sources, transformed in a staging
area and then loaded into the destination. The SSIS tool is used for the ETL process. The
first step extracts the data into the staging database, the next step is the transformation
stage and the last step is the loading stage, where data is loaded into the fact table. At the
end of the ETL process, the fact table and the dimension tables are populated.
Extraction: I extracted the data from three different sources. The first data set was loaded
directly from a flat file, and the other two were extracted through APIs and stored in CSV
format. The extraction step loads the data into the staging area through OLE DB connections.
In this stage, the Yelp and Mockaroo data sets are the semi-structured and unstructured
ones, so they were generated through the APIs with the help of scraping and the R language.
The staging tables are truncated before each load so that no duplicate data is generated.
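The effect of truncating before each load can be sketched as follows. This is a minimal sketch with Python's sqlite3 module standing in for SQL Server; the staging table stg_sales, its columns and the sample rows are hypothetical, and SQLite's DELETE FROM replaces TRUNCATE TABLE, which SQLite does not have.

```python
import sqlite3


def reload_staging(conn: sqlite3.Connection, rows: list) -> None:
    """Empty the staging table, then load the fresh extract."""
    with conn:  # one transaction: both steps are committed together or not at all
        conn.execute("DELETE FROM stg_sales")  # stand-in for TRUNCATE TABLE
        conn.executemany(
            "INSERT INTO stg_sales (year, month, city, revenue) VALUES (?, ?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_sales (year INTEGER, month INTEGER, city TEXT, revenue REAL)")

# Invented sample rows, for illustration only.
extract = [(2017, 1, "Seattle", 1200.0), (2017, 2, "Seattle", 1350.0)]

reload_staging(conn, extract)
reload_staging(conn, extract)  # running the load again does not duplicate rows

row_count = conn.execute("SELECT COUNT(*) FROM stg_sales").fetchone()[0]
```

Because the table is emptied first, the package can be re-run safely: the staging table always holds exactly one copy of the latest extract.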
Transformation: After extraction, the data is transformed. I used lookups, joins and SQL
queries to load the data into the dimension tables.
These are the three dimensions:
1. dbo.DimDate
2. dbo.Dimlocation
3. dbo.DimStoreDetails
Loading: After the dimensions are populated, the next step is to populate the fact table. The
fact table includes the primary keys of all the dimensions as foreign keys, and lookups
against the dimension tables are used to fetch these keys and load them, together with the
measures, into the fact table.
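The lookup step can be sketched in SQL as joins from the staging rows to the dimension tables. This is a minimal sketch in Python's sqlite3, not the SSIS package used in the project; the staging table stg_sales, its columns and the sample rows are hypothetical, and Dimlocation is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimStoreDetails (Store_id INTEGER PRIMARY KEY, Store_number TEXT UNIQUE);
CREATE TABLE DimDate (Date_id INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER);
CREATE TABLE Facttable (Store_id INTEGER, Date_id INTEGER,
                        Visitor_count INTEGER, Revenue REAL);
CREATE TABLE stg_sales (store_number TEXT, year INTEGER, month INTEGER,
                        visitors INTEGER, revenue REAL);

-- Invented sample rows, for illustration only.
INSERT INTO DimStoreDetails (Store_number) VALUES ('S-001'), ('S-002');
INSERT INTO DimDate (Year, Month) VALUES (2017, 1), (2017, 2);
INSERT INTO stg_sales VALUES
    ('S-001', 2017, 1, 400, 1200.0),
    ('S-002', 2017, 2, 350, 990.0),
    ('S-999', 2017, 1, 10, 25.0);   -- store missing from the dimension
""")

# The joins play the role of the lookup: each staging row picks up the
# keys of its matching dimension rows before it is written to the fact table.
conn.execute("""
INSERT INTO Facttable (Store_id, Date_id, Visitor_count, Revenue)
SELECT s.Store_id, d.Date_id, g.visitors, g.revenue
FROM stg_sales AS g
JOIN DimStoreDetails AS s ON s.Store_number = g.store_number
JOIN DimDate AS d ON d.Year = g.year AND d.Month = g.month
""")

fact_rows = conn.execute(
    "SELECT Store_id, Date_id, Visitor_count, Revenue FROM Facttable ORDER BY Store_id"
).fetchall()
```

The inner joins silently drop the staging row for the unknown store 'S-999'. A real load would usually redirect such unmatched rows to an error output instead of discarding them.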
Deploying the cube: The cube is deployed with SSAS, which is used to analyse the data on the
basis of the measures held in the fact table and the descriptive attributes held in the
dimension tables. Once the cube is deployed, the business queries can be run against it.
[ETL flow diagram: external source, then staging database, then dimension tables, then fact table]
Business Query
Case Study 1:
Which city has the maximum revenue?
This query uses store_sales_report and Store_details. The graph below shows how much revenue
is generated in each city.
Analysis:
From this bar chart, we can easily see that the maximum revenue is generated in New York
City, followed by Chicago.
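A query of this kind could look like the sketch below, which sums the revenue measure per city and sorts the result in descending order. It runs on Python's sqlite3 with the table names from the design section; the rows are invented placeholders and do not reproduce the project's figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dimlocation (Location_id INTEGER PRIMARY KEY, City TEXT);
CREATE TABLE Facttable (Location_id INTEGER, Revenue REAL);

-- Placeholder rows, for illustration only.
INSERT INTO Dimlocation VALUES (1, 'City A'), (2, 'City B'), (3, 'City C');
INSERT INTO Facttable VALUES (1, 500.0), (1, 700.0), (2, 900.0), (3, 300.0);
""")

# Total revenue per city, highest first.
revenue_by_city = conn.execute("""
SELECT l.City, SUM(f.Revenue) AS total_revenue
FROM Facttable AS f
JOIN Dimlocation AS l ON l.Location_id = f.Location_id
GROUP BY l.City
ORDER BY total_revenue DESC
""").fetchall()

top_city = revenue_by_city[0][0]
```

The first row of the result is the city with the maximum revenue, which corresponds to the tallest bar in the chart.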
Case Study 2:
Which city has the maximum number of visitors and beverage sales?
This query uses Store_details and store_name. The bubble graph below shows the cities
grouped by visitors and beverages.
Analysis:
From this bubble chart representation, we can easily analyse the city
Case Study 3:
Which city has a high rating on the basis of food count and beverage count?
This data set contains yelp_Rating and store_sales_report. The bar graph below represents
this scenario:
Analysis:
The analysis clearly shows that New York has a high rating on the basis of food count and
beverage count.
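One possible reading of this question is sketched below: for each city, the average Yelp rating of its stores is shown next to the total food and beverage counts. The sketch uses Python's sqlite3; the Yelp_rating column name and all the rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimStoreDetails (Store_id INTEGER PRIMARY KEY, Yelp_rating REAL);
CREATE TABLE Dimlocation (Location_id INTEGER PRIMARY KEY, City TEXT);
CREATE TABLE Facttable (Store_id INTEGER, Location_id INTEGER,
                        Food_count INTEGER, Beverage_count INTEGER);

-- Placeholder rows, one fact row per store, for illustration only.
INSERT INTO DimStoreDetails VALUES (1, 4.5), (2, 3.5), (3, 3.0);
INSERT INTO Dimlocation VALUES (1, 'City A'), (2, 'City B');
INSERT INTO Facttable VALUES (1, 1, 10, 30), (2, 1, 20, 40), (3, 2, 15, 25);
""")

# AVG runs over fact rows, so a store with more fact rows would weigh more;
# the sample data avoids this by holding one fact row per store.
rating_by_city = conn.execute("""
SELECT l.City,
       ROUND(AVG(s.Yelp_rating), 2) AS avg_rating,
       SUM(f.Food_count)            AS food_total,
       SUM(f.Beverage_count)        AS beverage_total
FROM Facttable AS f
JOIN DimStoreDetails AS s ON s.Store_id = f.Store_id
JOIN Dimlocation AS l ON l.Location_id = f.Location_id
GROUP BY l.City
ORDER BY avg_rating DESC
""").fetchall()
```

Each result row holds the city, its average rating and the two totals, so rating and sales volume can be compared side by side as in the bar graph.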
Conclusion:
A data warehouse makes it easy to handle and analyse large amounts of data. Using the data
warehouse, we can easily find in which month Starbucks sales are high or low, which city has
the maximum revenue and rating, and much more. The final finding is that New York
consistently gets good ratings and maintains high revenue.

More Related Content

What's hot

Data Warehouse Fundamentals
Data Warehouse FundamentalsData Warehouse Fundamentals
Data Warehouse FundamentalsRashmi Bhat
 
Data mining 3 - Data Models and Data Warehouse Design (cheat sheet - printable)
Data mining  3 - Data Models and Data Warehouse Design (cheat sheet - printable)Data mining  3 - Data Models and Data Warehouse Design (cheat sheet - printable)
Data mining 3 - Data Models and Data Warehouse Design (cheat sheet - printable)yesheeka
 
A glympse on the supply chain system of amazon
A glympse on the supply chain system of amazonA glympse on the supply chain system of amazon
A glympse on the supply chain system of amazonAshik S Nair
 
Data warehouse project on retail store
Data warehouse project on retail storeData warehouse project on retail store
Data warehouse project on retail storeSiddharth Chaudhary
 
Data Warehouse Project Report
Data Warehouse Project Report Data Warehouse Project Report
Data Warehouse Project Report Tom Donoghue
 
MongoDB sharded cluster. How to design your topology ?
MongoDB sharded cluster. How to design your topology ?MongoDB sharded cluster. How to design your topology ?
MongoDB sharded cluster. How to design your topology ?Mydbops
 
SQL Server Integration Services Tips & Tricks
SQL Server Integration Services Tips & TricksSQL Server Integration Services Tips & Tricks
SQL Server Integration Services Tips & TricksGuillermo Caicedo
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop TutorialEdureka!
 
CRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologyCRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologySergey Shelpuk
 
Final Project Report - Walmart Sales
Final Project Report - Walmart SalesFinal Project Report - Walmart Sales
Final Project Report - Walmart SalesDeepti Bahel
 
Team project - Data visualization on Olist company data
Team project - Data visualization on Olist company dataTeam project - Data visualization on Olist company data
Team project - Data visualization on Olist company dataManasa Damera
 
(Lecture 4)Slowly Changing Dimensions.pdf
(Lecture 4)Slowly Changing Dimensions.pdf(Lecture 4)Slowly Changing Dimensions.pdf
(Lecture 4)Slowly Changing Dimensions.pdfMobeenMasoudi
 
Over 50 Adoption Activities That Have Helped Organizations Get More Out Of Of...
Over 50 Adoption Activities That Have Helped Organizations Get More Out Of Of...Over 50 Adoption Activities That Have Helped Organizations Get More Out Of Of...
Over 50 Adoption Activities That Have Helped Organizations Get More Out Of Of...Richard Harbridge
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3Malik Alig
 

What's hot (20)

Data Warehouse Fundamentals
Data Warehouse FundamentalsData Warehouse Fundamentals
Data Warehouse Fundamentals
 
Data mining 3 - Data Models and Data Warehouse Design (cheat sheet - printable)
Data mining  3 - Data Models and Data Warehouse Design (cheat sheet - printable)Data mining  3 - Data Models and Data Warehouse Design (cheat sheet - printable)
Data mining 3 - Data Models and Data Warehouse Design (cheat sheet - printable)
 
A glympse on the supply chain system of amazon
A glympse on the supply chain system of amazonA glympse on the supply chain system of amazon
A glympse on the supply chain system of amazon
 
Data warehouse project on retail store
Data warehouse project on retail storeData warehouse project on retail store
Data warehouse project on retail store
 
Data Warehouse Project Report
Data Warehouse Project Report Data Warehouse Project Report
Data Warehouse Project Report
 
MongoDB sharded cluster. How to design your topology ?
MongoDB sharded cluster. How to design your topology ?MongoDB sharded cluster. How to design your topology ?
MongoDB sharded cluster. How to design your topology ?
 
SQL Server Integration Services Tips & Tricks
SQL Server Integration Services Tips & TricksSQL Server Integration Services Tips & Tricks
SQL Server Integration Services Tips & Tricks
 
Star schema PPT
Star schema PPTStar schema PPT
Star schema PPT
 
Market baasket analysis
Market baasket analysisMarket baasket analysis
Market baasket analysis
 
Late Arrival Facts
Late Arrival FactsLate Arrival Facts
Late Arrival Facts
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop Tutorial
 
CRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologyCRISP-DM: a data science project methodology
CRISP-DM: a data science project methodology
 
Règles d’association
Règles d’associationRègles d’association
Règles d’association
 
Final Project Report - Walmart Sales
Final Project Report - Walmart SalesFinal Project Report - Walmart Sales
Final Project Report - Walmart Sales
 
Telecom Analytics
Telecom AnalyticsTelecom Analytics
Telecom Analytics
 
Team project - Data visualization on Olist company data
Team project - Data visualization on Olist company dataTeam project - Data visualization on Olist company data
Team project - Data visualization on Olist company data
 
(Lecture 4)Slowly Changing Dimensions.pdf
(Lecture 4)Slowly Changing Dimensions.pdf(Lecture 4)Slowly Changing Dimensions.pdf
(Lecture 4)Slowly Changing Dimensions.pdf
 
Data Mining
Data MiningData Mining
Data Mining
 
Over 50 Adoption Activities That Have Helped Organizations Get More Out Of Of...
Over 50 Adoption Activities That Have Helped Organizations Get More Out Of Of...Over 50 Adoption Activities That Have Helped Organizations Get More Out Of Of...
Over 50 Adoption Activities That Have Helped Organizations Get More Out Of Of...
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3
 

Similar to Dwbi Project

Data warehousing and business intelligence project report
Data warehousing and business intelligence project reportData warehousing and business intelligence project report
Data warehousing and business intelligence project reportsonalighai
 
Intro to Data warehousing lecture 15
Intro to Data warehousing   lecture 15Intro to Data warehousing   lecture 15
Intro to Data warehousing lecture 15AnwarrChaudary
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1guest9529cb
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNabclearnn
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional ModellingAshish Chandwani
 
A Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationA Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationKate Subramanian
 
Data warehouse
Data warehouseData warehouse
Data warehouse_123_
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxhajon27910
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingwork
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional ModelingSunita Sahu
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptxjainyshah20
 
Dataware housing
Dataware housingDataware housing
Dataware housingwork
 

Similar to Dwbi Project (20)

Data warehousing and business intelligence project report
Data warehousing and business intelligence project reportData warehousing and business intelligence project report
Data warehousing and business intelligence project report
 
Intro to Data warehousing lecture 15
Intro to Data warehousing   lecture 15Intro to Data warehousing   lecture 15
Intro to Data warehousing lecture 15
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARN
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional Modelling
 
Date Analysis .pdf
Date Analysis .pdfDate Analysis .pdf
Date Analysis .pdf
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
A Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationA Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence Application
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
ETL QA
ETL QAETL QA
ETL QA
 
Dw concepts
Dw conceptsDw concepts
Dw concepts
 
Resume
ResumeResume
Resume
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptx
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Star schema
Star schemaStar schema
Star schema
 
Data warehouse logical design
Data warehouse logical designData warehouse logical design
Data warehouse logical design
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptx
 
Resume
ResumeResume
Resume
 
Dataware housing
Dataware housingDataware housing
Dataware housing
 

Recently uploaded

Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一F La
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 

Recently uploaded (20)

Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 

Dwbi Project

  • 1. Data warehouse and business intelligent project for the analysis of Starbucks Student Name: Sonali Gupta Student ID: x01527245 Course: Msc. Data Analytics
  • 2. Table of Contents INTRODUCTION ...................................................................................................................... 3 DATA SOURCES................................................................................................................... 3 TECHNOLOGY USED ..........................................................................................................4 DATA WAREHOUSE DESIGN AND ARCHIETECTURE ....................................................................4 Design of Data Warehouse ......................................................................................................6 Business Query........................................................................................................................ 9 Case Study 1............................................................................................................................. 9 Case Study2:....................................................................................................................... 10 Case Study 3:...................................................................................................................... 10 Conclusion:............................................................................................................................ 11
  • 3. INTRODUCTION Grabbing a cup of coffee in the morning is always delightful because it gives a punch of energy for the day, and when coffee comes with a sense of ownership and plenty of offers, the only name that comes to my mind is Starbucks. What delights me even more is having a cup of coffee at Starbucks and trying each new variety of coffee and other beverages. I am a big coffee lover, and when I want to buy one I always look for a Starbucks. It gives me a feeling of joy: the way they present coffee varieties chosen from around the globe and the service they provide are commendable. I used to wonder how Starbucks manages its inventory and handles its business, and this curiosity made me choose Starbucks as the topic of my data warehouse project. This project is a working model of a data warehouse for Starbucks and shows its business intelligence capabilities. Information related to Starbucks: it is an American coffee company that was started in Seattle, Washington in 1971. At present the CEO of Starbucks is Kevin Johnson, and it has approximately 23,768 locations globally. Starbucks is known as the third largest fast food restaurant chain. DATA SOURCES 1. Structured data: I found this data set on the Kaggle website. It contains the details of Starbucks locations worldwide. Its columns are Brand, Store Number, Store Name, Ownership Type, Street address, City, State/Province, Country, Postcode, Phone Number, Timezone, Longitude and Latitude. From this data I took the subset for the United States, which I used in the project. Link of the source: https://www.kaggle.com/starbucks/store-locations/data 2. Semi-structured data: I generated this data set using the Mockaroo API. It is a Starbucks sales report, and its columns are year, month, revenue details, number of visitors, food sales quantity and beverage sales quantity. Link of the source: http://my.api.mockaroo.com/ 3. Unstructured data: This is data that does not fit a relational table. It is text-heavy and can include dates, points, ratings and comments. I generated this data set using the API of Yelp.com. Yelp is used to publish reviews and ratings of local businesses (restaurants, hotels). This data set contains the review ratings of Starbucks stores. Link of the source: https://www.yelp.com/developers/documentation/v3/business_search
  • 4. TECHNOLOGY USED The technologies used in this project are shown below: Database Management • SQL Server Management Studio (SSMS) • SQL Server Integration Services (SSIS) Programming Language • R is used for Twitter sentiment analysis and data cleaning. • SQL is used for the dimension tables and the fact table of every component. Additional Software • Tableau for creating graphs. DATA WAREHOUSE DESIGN AND ARCHITECTURE Use of a data warehouse: 1. Data from various sources is integrated in real time, which supports business decisions, lets users access the data later and saves time. 2. Historical data can be integrated in one place with common keys, common formats and a common data model. 3. Data quality improves and reports are generated faster. 4. Business intelligence can be built on top of it, for example SSAS cubes. For the design and storage of a data warehouse for business intelligence, two methodologies are used, Inmon and Kimball, and each approach has its own advantages. Kimball's methodology uses a dimensional design approach and is also known as bottom-up design. The data marts for reporting are created first and are then integrated to form the data warehouse, so star and snowflake schemas are easy to create. This methodology delivers business value in a short span of time, which is why I chose this approach. Inmon's methodology is used for an enterprise data warehouse and is also known as top-down design. A normalised data model is created first, and the data marts for specific business processes are then built from it, so this approach takes a lot of time and requires more ETL work. The goal here is to analyse Starbucks stores in different areas of the USA: how much revenue is generated, the number of visitors, and in which month and year food and beverage sales are at their maximum and minimum. Kimball's approach is therefore used to build this data warehouse. These are the four steps for designing a dimensional data model: 1. Select the business process. 2. Declare the grain. 3. Identify the dimensions. 4. Identify the facts.
  • 5. Before deciding on my dimensions and facts, I considered at a high level what my Starbucks data warehouse would look like and what its performance metrics would be. In this project the Starbucks data is kept at the atomic level, and I then selected 3 dimensions as needed for filtering and grouping the facts. Fig. 1: Data flow from the data sources (Kaggle structured data, Mockaroo API, Yelp API) through loading into the staging area, the Starbucks DW and the DW cube to reports in BI. Star Schema: The star schema is the simplest form of data warehouse schema, so called because its diagram resembles a star. It consists of a fact table and dimension tables, with the fact table at the centre and the dimension tables joined to it. The data is organised into facts and dimensions. Fact Table: A fact table combines foreign key columns and measure columns. Each foreign key column references the primary key of a dimension table, and the measure columns contain the data being analysed. In the Starbucks data warehouse, the fact table contains store details, date, location, sales report and Yelp rating data. All of these details help in analysing the business queries.
  • 6. Dimension Table: In a data warehouse, a dimension table defines a dimension with its keys, attributes and values. Every dimension table has its own primary key, which is unique within the table, and it contains the details of each object. The star schema of the dimension and fact tables is shown in the figure below. Fig. 2 Benefits of Star Schema: If a star schema is well designed, large data sets are easy to understand and analyse. The main benefits are described below: • The ETL process is easy to create • Complexity is very low because the tables have direct relationships • Every dimension is directly connected to the fact table • Query performance • Load performance and administration • Built-in referential integrity • Efficient navigation through data Design of Data Warehouse: In this Starbucks data warehouse, three dimensions and one fact table have been created. DimStoreDetails: The store details dimension consists of Store_id, Store_name, Store_number, Ownership and Yelp rating. Store_id is the primary key in this dimension. Dimlocation: The location dimension consists of Location_id, Latitude, Longitude, City, Country, Postcode and Address. Location_id is the primary key in this dimension.
  • 7. DimDate: The date dimension consists of Date_id, Year and Month. Date_id is the primary key in this dimension. Facttable: The Starbucks data warehouse has one fact table, which is connected to all the dimension tables through foreign key relationships. It has four measure columns: 1. Visitor_count: the number of visitors 2. Revenue: the store revenue 3. Beverage_count 4. Food_count Fig. 3 Extract Transform Load (ETL) Process: The main task of any data warehouse is to rearrange, integrate and consolidate data from many systems. ETL means extracting data from different sources, transforming it in a staging stage and then loading it into the destination. The SSIS tool is used for the ETL process. The first step extracts the data into the staging database, the next is the transformation stage, and the last is the loading stage, where the data is loaded into the fact table. At the end of the ETL process, the fact table and the dimension tables are populated.
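The three dimensions and the fact table described above can be sketched as table definitions. This is only a minimal illustration under stated assumptions: Python's built-in SQLite is used as a stand-in for SQL Server so that the example runs anywhere, the column types are guesses, `Yelp_rating` is an assumed spelling of the "Yelp rating" attribute, and the foreign key column names in the fact table are assumed to mirror the dimension keys.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE DimStoreDetails (
    Store_id     INTEGER PRIMARY KEY,
    Store_name   TEXT,
    Store_number TEXT,
    Ownership    TEXT,
    Yelp_rating  REAL          -- assumed column name and type
);
CREATE TABLE DimLocation (
    Location_id INTEGER PRIMARY KEY,
    Latitude    REAL,
    Longitude   REAL,
    City        TEXT,
    Country     TEXT,
    Postcode    TEXT,
    Address     TEXT
);
CREATE TABLE DimDate (
    Date_id INTEGER PRIMARY KEY,
    Year    INTEGER,
    Month   INTEGER
);
-- The fact table holds one foreign key per dimension plus the four measures.
CREATE TABLE FactTable (
    Store_id       INTEGER NOT NULL REFERENCES DimStoreDetails(Store_id),
    Location_id    INTEGER NOT NULL REFERENCES DimLocation(Location_id),
    Date_id        INTEGER NOT NULL REFERENCES DimDate(Date_id),
    Visitor_count  INTEGER,
    Revenue        REAL,
    Beverage_count INTEGER,
    Food_count     INTEGER
);
""")

# List the tables that now make up the star schema.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

With foreign keys enabled, a fact row that points at a missing dimension key is rejected, which is the referential integrity benefit of the star schema listed earlier.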
  • 8. Extraction: I extracted the data from three different sources. The first data set was loaded directly from a flat file, and the other two were extracted using APIs and stored in CSV format. Extraction loads the data into the staging stage, connected through OLE DB. The Yelp and Mockaroo data sets are not structured, so they were generated through the APIs with the help of scraping and the R language. The staging tables are truncated so that duplicate data sets are not generated. Transformation: After extraction, the data is transformed. I used lookups, joins and SQL queries to load the data into the dimension tables. These are the three dimensions: 1. dbo.DimDate 2. dbo.Dimlocation 3. dbo.DimStoreDetails Loading: After the dimensions are populated, the next step is to populate the fact table. The fact table includes the primary keys of all the dimensions, and a lookup against the dimension tables is used to populate those keys and the measures in the fact table. Deploying the cube: The cube is deployed with the help of SSAS, which is used to analyse the data on the basis of the measures in the fact table and the textual attributes in the dimension tables. Once the cube is deployed, business queries can be applied to the database. Figure: External Source, Staging Database, Dimension table, Fact table.
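The extract, transform and load stages above can be approximated in a few lines. This is a rough sketch and not the SSIS package used in the project: SQLite replaces SQL Server, the `StagingSales` table and the `run_etl` function are hypothetical names, the schema is cut down to two dimensions and one measure, and the sample rows are made up. A SQL join plays the role of the SSIS lookup, and the fact table is cleared on every run on the assumption that each run is a full reload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StagingSales (City TEXT, Year INTEGER, Month INTEGER, Revenue REAL);
CREATE TABLE DimLocation (Location_id INTEGER PRIMARY KEY, City TEXT UNIQUE);
CREATE TABLE DimDate (Date_id INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER,
                      UNIQUE (Year, Month));
CREATE TABLE FactTable (Location_id INTEGER, Date_id INTEGER, Revenue REAL);
""")


def run_etl(conn, extracted_rows):
    """Hypothetical full-reload ETL: staging, then dimensions, then facts."""
    # Extract: empty the staging table first so a re-run cannot duplicate rows.
    conn.execute("DELETE FROM StagingSales")
    conn.executemany("INSERT INTO StagingSales VALUES (?, ?, ?, ?)", extracted_rows)

    # Transform: load each distinct value into its dimension exactly once.
    conn.execute("INSERT OR IGNORE INTO DimLocation (City) "
                 "SELECT DISTINCT City FROM StagingSales")
    conn.execute("INSERT OR IGNORE INTO DimDate (Year, Month) "
                 "SELECT DISTINCT Year, Month FROM StagingSales")

    # Load: look up the dimension keys with joins and write the measures.
    conn.execute("DELETE FROM FactTable")  # assumption: facts are fully reloaded
    conn.execute("""
        INSERT INTO FactTable (Location_id, Date_id, Revenue)
        SELECT l.Location_id, d.Date_id, s.Revenue
        FROM StagingSales AS s
        JOIN DimLocation AS l ON l.City = s.City
        JOIN DimDate AS d ON d.Year = s.Year AND d.Month = s.Month
    """)
    conn.commit()


def count(table):
    # Helper for the demo only; table names come from the fixed list above.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Made-up rows purely for illustration.
sample_rows = [("CityA", 2017, 1, 100.0), ("CityA", 2017, 2, 150.0), ("CityB", 2017, 1, 80.0)]
run_etl(conn, sample_rows)
run_etl(conn, sample_rows)  # a second run should leave the row counts unchanged
print(count("DimLocation"), count("DimDate"), count("FactTable"))
```

Running the load twice is a quick way to check that truncating the staging data keeps the warehouse free of duplicates.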
  • 9. Business Query Case Study 1: Which city has maximum revenue? This query uses store_sales_report and Store_details. The graph below shows how much revenue is generated by each city. Analysis: From this bar chart, we can easily see that the maximum revenue is generated in New York city, followed by Chicago.
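A revenue-by-city question like Case Study 1 comes down to joining the fact table to the location dimension and summing the revenue per city. The sketch below is an assumption about how such a query might look, not the query behind the chart: it uses SQLite instead of SQL Server, a cut-down schema, and invented placeholder cities and revenue figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimLocation (Location_id INTEGER PRIMARY KEY, City TEXT);
CREATE TABLE FactTable (Location_id INTEGER, Revenue REAL);
-- Placeholder rows, not project data.
INSERT INTO DimLocation VALUES (1, 'CityA'), (2, 'CityB');
INSERT INTO FactTable VALUES (1, 100.0), (1, 50.0), (2, 120.0);
""")

# Total revenue per city, highest first.
revenue_by_city = conn.execute("""
    SELECT l.City, SUM(f.Revenue) AS total_revenue
    FROM FactTable AS f
    JOIN DimLocation AS l ON l.Location_id = f.Location_id
    GROUP BY l.City
    ORDER BY total_revenue DESC
""").fetchall()

top_city = revenue_by_city[0][0]
print(revenue_by_city, top_city)
```

The same pattern, with the measure swapped for Visitor_count, Beverage_count or Food_count, would cover the other case studies.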
  • 10. Case Study 2: Which city has the maximum number of visitors and beverage sales? This query uses Store_details and store_name. The bubble graph below shows the cities grouped by visitors and beverages. Analysis: From this bubble chart representation, we can easily analyse the city Case Study 3: Which city has a high rating on the basis of food count and beverage count? This data set contains yelp_Rating and store_sales_report. The bar graph below represents this scenario:
  • 11. Analysis: After analysis, it is clearly seen that New York has a high rating on the basis of food count and beverage count. Conclusion: A data warehouse makes it easy to handle and analyse large amounts of data. Using the data warehouse, we can easily find in which month Starbucks sales are high or low, which city has the maximum revenue or rating, and much more. The final finding is that New York always gets good ratings and always maintains high revenue.