BLACK FRIDAY SALES
BY: Soumit Kar
About
► Black Friday sales dataset
► By Retail Stores
► 550,068 observations
► Contains:
► Numerical Variables
► Categorical Variables
► Some missing values
Process
► Downloaded data
► Data Gathering
► Exploring Data
► Cleaning Data
► Final Formatting of Data
R Packages Used
► UsingR : for Introductory Statistics
► Sampling : Functions for drawing and calibrating samples
► Stringr:
There are four main families of functions in stringr:
o Character manipulation: these functions allow you to manipulate individual
characters within the strings in character vectors.
o Whitespace tools to add, remove, and manipulate whitespace.
o Locale sensitive operations whose operations will vary from locale to locale.
o Pattern matching functions. These recognise four engines of pattern
description. The most common is regular expressions, but there are three other
tools.
► Tidyverse :
The 'tidyverse' is a set of packages that work in harmony because they share common
data representations and 'API' design.
Used in this project for its tibble data frames
► stats
For R statistical functions
► prob
A framework for performing elementary probability calculations on finite sample
spaces, which may be represented by data frames or lists.
► dbplyr
A database back-end for 'dplyr': it translates 'dplyr' code into SQL so that you can
work with remote database tables as if they were in-memory data frames.
► dtplyr
This implements the 'data.table' back-end for 'dplyr' so that you can seamlessly use
'data.table' and 'dplyr' together.
Attributes in Data
► User_ID (Numerical Variable)
► Product_ID (Categorical Variable)
► Gender (Categorical Variable)
► Age (Categorical Variable, because it is in ranges)
► Occupation (Numerical Variable)
► City_Category (Categorical Variable)
► Stay_In_Current_City_Years (Numerical Variable)
► Marital_Status (Numerical Variable)
► Product_Category_1 (Numerical Variable)
► Product_Category_2 (Numerical Variable)
► Product_Category_3 (Numerical Variable)
► Purchase amount in dollars (Numerical Variable)
Exploring Attributes
► User Id: Not Unique, maps person to the particular purchase
► Product Id: Not Unique, tells how many purchases are made for a product
► Gender: Has only two values: F and M
► Age: Divided into 7 ranges, so Age is treated as a Categorical Variable
► Occupation: There are 21 different occupations, coded 0-20
► City Category: Cities in which customers have lived are grouped into three categories: A, B, C
► Years in Current City: People have lived in the current city for 0-5 years. Here 5 could mean at least 5 years
► Marital Status: People have their marital status marked as either 0 or 1
► Product Category 1: Ranges from 1-18
► Product Category 2: Ranges from 2-18
► Product Category 3: Ranges from 3-18
► Purchase: It is the amount people spent in $ for purchases. Not unique.
Analysis..
Power BI Dashboard
Power BI chart
Slicer : Product id, user id, gender, marital status
Score Card : Total revenue, units sold, city
Chart:
Purchase by gender and marital status (donut chart)
Product category wise purchase (matrix table)
Purchase by city category (tree map)
Purchase by age distribution (barplot)
Purchase by occupation (funnel chart)
Gender
● The pie chart shows that males (75%) shop more than
females (25%).
● People in the 26-35 age range shopped the most.
● People in the 0-17 and 55+ age ranges shopped the least,
almost nothing compared to 26-35.
● Overall, people in the 18-45 age range make up the
largest share of shoppers.
Analyse “Purchase” : Barplot
● Average amount shoppers spent = $9334
● Hardly any shopper spent above $19000
● Shoppers most often spent approximately
$6800 or $8700, where the barplot has its
highest peaks
Analyse “Purchase” : Histograms
Breaks = 10
Most of the data lies between 5000 and 10000
Breaks = 20
Some amounts are never spent at all, while spending is concentrated near 15000
and between 5000 and 10000
Analyse “Purchase” : Histograms
● A shopper coming to the Black Friday sale is most likely to spend at least $5000
on average.
● Most shoppers lie around the $5000 mark.
● Interestingly, the frequency is 0 near 10,000 and around the middle of 15000-20000.
● This suggests that people did not spend around $9000 or $17000 (roughly the midpoint of 15K and 20K) in the sales.
Analyse “Purchase” : Barplot
► An average shopper can be expected to spend
$5866-$12073 in Black Friday sales
MULTIVARIATE DATA
► Gender
► Product Category 1
► Product Category 2
► Product Category 3
MULTIVARIATE DATA
● Overall, there are more male shoppers
● Product Category 2 sold the most
● For female shoppers, Product Category 3 sales are almost half of Product Category 2 sales
MULTIVARIATE DATA : Rescaled (values in Millions)
[Charts: overall gender percentages; product category percentages]
Each gender has almost the same contribution in every category
Analyse : “Years in Current City”
Geometric Distribution
Probability that a randomly picked person has stayed 5 years in the current city
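The geometric-distribution idea on this slide can be sketched as follows. This is a minimal illustration, not the deck's R code: the success probability p_assumed is a hypothetical value (the slides do not state the share of shoppers with a 5-year stay), and geometric_pmf is a helper name introduced here.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(first success occurs on pick k) when each pick succeeds with probability p."""
    if k < 1 or not 0 < p <= 1:
        raise ValueError("need k >= 1 and 0 < p <= 1")
    return (1 - p) ** (k - 1) * p


# Hypothetical share of shoppers who stayed 5 years (NOT taken from the slides).
p_assumed = 0.15

# Probability that the first such person turns up on pick 1, 2, ..., 5.
probs = [geometric_pmf(k, p_assumed) for k in range(1, 6)]
```

Replacing p_assumed with the observed proportion from the data would give the probabilities the slide refers to.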
Central Limit Theorem
► The mean of the sample mean distribution is equal to the mean of the parent
data.
► The higher the sample size, the narrower the spread of the sample means.
► Sample sizes: 0.5%, 1%, 5%, 30% and 75% of the total purchases set
            Original   0.5%    1%      5%      30%     75%     Average of samples
Size        18151      91      181     905     5432    13579
Mean        9.27       8.26    9.72    9.34    9.24    9.29    10.74
Std Dev.    5.03       4.55    5.53    4.87    5.11    5.11
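The two Central Limit Theorem statements above can be checked with a small simulation. This is a sketch under stated assumptions: the population is synthetic (uniform values made up for illustration, not the purchase data), and sample_mean_spread is a hypothetical helper name.

```python
import random
import statistics


def sample_mean_spread(population, sample_size, n_samples=200, seed=0):
    """Return (mean, std dev) of the means of repeated random samples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.fmean(means), statistics.stdev(means)


# Synthetic stand-in for the purchase column (values are made up).
rng = random.Random(42)
population = [rng.uniform(0, 20) for _ in range(5000)]
pop_mean = statistics.fmean(population)

small = sample_mean_spread(population, 25)    # small samples: wide spread of means
large = sample_mean_spread(population, 500)   # large samples: narrow spread of means
```

Comparing small[1] with large[1] shows the spread shrinking as the sample size grows, while both small[0] and large[0] stay close to pop_mean.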
Central Limit Theorem
Simple Random Sampling : With
Replacement
It is a method of selection of n units out of the N units one by one such that at each stage of selection, each unit has an
equal chance of being selected, i.e., 1/N.
Simple Random Sampling : Without
Replacement
It is a method of selection of n units out of the N units one by one such that at any stage of selection, any one of
the remaining units has the same chance of being selected, i.e., 1/N.
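The difference between the two simple random sampling schemes can be sketched with Python's standard library. The function names and the toy population of 10 units are assumptions made for illustration; the deck itself used R's sampling tools.

```python
import random


def srs_with_replacement(units, n, seed=None):
    """Draw n units one by one; a unit can be drawn more than once."""
    rng = random.Random(seed)
    return [rng.choice(units) for _ in range(n)]


def srs_without_replacement(units, n, seed=None):
    """Draw n distinct units; a drawn unit is not returned to the pool."""
    rng = random.Random(seed)
    return rng.sample(units, n)


units = list(range(1, 11))  # toy population, N = 10
wr = srs_with_replacement(units, 5, seed=1)
wor = srs_without_replacement(units, 5, seed=1)
```

The sample drawn with replacement may contain repeats, whereas the one drawn without replacement never does.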
Systematic Sampling
Systematic sampling is a probability sampling method in which a random sample, with a fixed
periodic interval, is selected from a larger population.
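A minimal sketch of the equal-probability variant, assuming the population size is a multiple of the sample size so that every unit has the same chance of selection. The helper name systematic_sample is introduced here for illustration only.

```python
import random


def systematic_sample(units, n, seed=None):
    """Pick a random start within the first interval, then take every k-th unit."""
    k = len(units) // n  # fixed periodic interval
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    start = random.Random(seed).randrange(k)
    return units[start::k][:n]


population = list(range(100))           # toy population, N = 100
sample = systematic_sample(population, 10, seed=3)
```

With N = 100 and n = 10 the interval is 10, so consecutive sampled units are always 10 positions apart.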
Systematic Sampling : Equal Probability
Systematic Sampling : Unequal Probability
Stratified Sampling : Sample Size =10
E.g. suppose the sample size is 50 and the population of 840 is grouped according to gender:
Population strata   No. of students   No. in sample
Male                340               20
Female              500               30
Total               840               50
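The sample counts in the example follow from proportional allocation: each stratum receives a share of the sample equal to its share of the population. A short sketch, with proportional_allocation as a hypothetical helper name:

```python
def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to strata in proportion to stratum size (rounded)."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }


# Figures from the example: 340 male and 500 female students, sample of 50.
allocation = proportional_allocation({"Male": 340, "Female": 500}, 50)
```

Here 50 × 340/840 ≈ 20.2 and 50 × 500/840 ≈ 29.8, which round to 20 and 30. With other figures, simple rounding may not add up exactly to the requested sample size and would need adjusting.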
Cluster Sampling
The population is broken into small groups, or clusters, and then some of the clusters are randomly selected
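Cluster sampling can be sketched as below. Using the city category as the cluster is an assumption made purely for illustration (the slides do not say which grouping was used), and the records are made-up toy rows.

```python
import random
from collections import defaultdict


def cluster_sample(records, cluster_key, n_clusters, seed=None):
    """Group records into clusters, pick clusters at random, keep all their records."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[cluster_key(rec)].append(rec)
    chosen = random.Random(seed).sample(sorted(clusters), n_clusters)
    return [rec for name in chosen for rec in clusters[name]]


# Toy rows (made up); "city" plays the role of the cluster label.
records = [
    {"city": c, "purchase": p}
    for c, p in [("A", 5), ("A", 7), ("B", 9), ("C", 4), ("C", 6)]
]
picked = cluster_sample(records, lambda r: r["city"], 2, seed=0)
```

Every record of a chosen cluster ends up in the sample, which is what distinguishes this from stratified sampling, where units are drawn from every group.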
Sampling
► The largest change between the original data and the sample occurs in:
► Systematic Sampling : Equal Probability
► Stratified Sampling : Sample Size =10
► The interpretation stays almost the same as the original in:
► Systematic Sampling : Unequal Probability
Other Observations
(Used stringr/tibble)
► Average number of years a person has lived in each city category:
► A : 2.21
► B : 2.17
► C : 2.2
► Average purchase in each city category by number of years:
[Charts: one per city category A, B, C]
Simple ML prediction
Feature Engineering :
● Change categorical to numeric (gender, marital status, city category)
● In the current-city years column, change 4+ to 6
● Change bins to integers in the age column
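The feature-engineering steps can be sketched as a row-level encoder. Several details are assumptions not stated on the slide: the specific integer codes, the age-bin labels (taken from the commonly distributed version of this dataset), and the helper name encode_row. Only the "4+ to 6" rule comes from the slide itself.

```python
# Hypothetical integer codes; the slide does not say which codes were used.
GENDER = {"F": 0, "M": 1}
CITY = {"A": 0, "B": 1, "C": 2}

# Assumed age-bin labels (7 ranges), mapped to their position in the list.
AGE_BINS = ["0-17", "18-25", "26-35", "36-45", "46-50", "51-55", "55+"]
AGE = {label: i for i, label in enumerate(AGE_BINS)}


def encode_row(row):
    """Turn one raw record (dict of strings) into numeric features."""
    stay = row["Stay_In_Current_City_Years"]
    return {
        "Gender": GENDER[row["Gender"]],
        "City_Category": CITY[row["City_Category"]],
        "Age": AGE[row["Age"]],
        # Per the slide: the open-ended "4+" value is replaced with 6.
        "Stay_In_Current_City_Years": 6 if stay == "4+" else int(stay),
        "Marital_Status": int(row["Marital_Status"]),
    }


encoded = encode_row({
    "Gender": "M", "City_Category": "C", "Age": "26-35",
    "Stay_In_Current_City_Years": "4+", "Marital_Status": "1",
})
```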
ML model:
I applied only two ML models here
● Linear regression: RMSE 4694.309
● Decision tree: RMSE 3099.602
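For reference, RMSE (root mean squared error) is the square root of the average squared difference between actual and predicted values, so a lower value means predictions are closer to the true purchase amounts. A minimal sketch with made-up numbers; rmse is a helper name introduced here, not the function used in the original analysis.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values: errors of 3 and 4 give sqrt((9 + 16) / 2) = sqrt(12.5).
example = rmse([3.0, 5.0], [0.0, 1.0])
```

Because RMSE is in the same units as the target, the figures above can be read as typical prediction errors in dollars.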
Conclusion
► Number of Male Shoppers > Female Shoppers
► Products in Product Category 2 sold most
► People generally spent over $5000 in sales
► People in age range 26-35 purchase most
► City Category ‘C’ has the highest average sales compared to the other categories
► The Unequal Probability sampling technique could be used on this dataset for best results
Thank you!!
Questions?
