Resolving e-commerce challenges
with probabilistic programming
Liudmyla Kyrashchuk
Magdalena Wójcik
Content of the presentation
1) What is the Bayesian approach?
2) Profits of going Bayesian
3) Recap of distributions
4) Toolbox for a Bayesian Hacker
5) Case Study #1 - Price-demand change analysis
6) Case Study #2 - Hierarchical modeling
My hypothesis:
Bayesian programming is the best
all-in-one statistical tool for hackers.
🔨
What is the Bayesian approach?
Thomas Bayes was an 18th-century mathematician who
interpreted probability as a degree of belief rather than
as the simple frequency of events.
Bayes’ Theorem
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
P(A | B): posterior probability of A given evidence B
P(A): prior probability of A
P(B | A): likelihood of collecting evidence B when A is true
P(B): probability of collecting B under all circumstances
Mandatory naive example 🤒
What is the probability of Bob having a relatively rare disease,
given that he received a positive result from the medical test?
Disease occurs in 1 / 250 people. Test has True Positive Rate of 0.99.
$$P(\text{disease} \mid \text{positive}) = \frac{0.99 \times 0.004}{\dfrac{0.99 \times 1 + 0.01 \times 249}{250}} \approx 0.28$$
Here 0.004 = 1 / 250 is the prior, 0.99 is the likelihood of a positive test when Bob is ill, and the
denominator is the overall probability of a positive test. The 0.01 × 249 term assumes a False Positive Rate of 0.01.
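The arithmetic of this example can be checked in a few lines of Python. This is only a sketch of the calculation on the slide; the variable names are ours, and the false positive rate of 0.01 is read off the 0.01 × 249 term rather than stated in the problem text.

```python
# Bob's test, using the numbers from the slide.
prevalence = 1 / 250          # P(disease), the prior
true_positive_rate = 0.99     # P(positive | disease), the likelihood
false_positive_rate = 0.01    # P(positive | healthy), implied by the 0.01 x 249 term

# P(positive): probability of a positive test under all circumstances.
p_positive = (true_positive_rate * prevalence
              + false_positive_rate * (1 - prevalence))

# Bayes' theorem: posterior = likelihood * prior / evidence.
p_disease_given_positive = true_positive_rate * prevalence / p_positive
print(round(p_disease_given_positive, 2))
```

Even with a 99% true positive rate, the posterior stays below one in three because the disease is rare.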
Bayes’ Theorem in the 🔍 of a Data Scientist
$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$$
P(θ | D): posterior distribution, our updated belief
P(θ): prior belief as a distribution
D: collected data
P(D): constant normalization term
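To make "posterior is likelihood times prior, divided by a constant" concrete, here is a minimal grid-approximation sketch. Everything in it is an assumption for illustration: the conversion-rate setting, the data (12 purchases in 100 visits), the flat prior, and the grid size are not from the talk.

```python
# Grid approximation of a posterior: posterior ∝ likelihood × prior.
# Hypothetical data: 12 purchases out of 100 product-page visits.
purchases, visits = 12, 100

# Candidate conversion rates theta on an evenly spaced grid in (0, 1).
grid = [(i + 0.5) / 200 for i in range(200)]

# Flat prior: every candidate rate is equally believable before seeing data.
prior = [1.0 for _ in grid]

# Binomial likelihood of the data for each candidate rate
# (the binomial coefficient does not depend on theta, so it is dropped).
likelihood = [t ** purchases * (1 - t) ** (visits - purchases) for t in grid]

# Unnormalized posterior, then divide by its sum (the normalization term).
unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
evidence = sum(unnormalized)
posterior = [u / evidence for u in unnormalized]

posterior_mean = sum(t * p for t, p in zip(grid, posterior))
print(f"posterior mean conversion rate: {posterior_mean:.3f}")
```

Swapping the flat prior for an informative one only changes the `prior` list; the rest of the update stays the same.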
Profits of going Bayesian
● Instead of one value, we get a distribution of likely values.
● We get information on the certainty of the model output.
● More ways to compare outcomes.
● The model has a built-in place for expert knowledge (the prior).
● An easy way to include external knowledge when the data set is small.
Choosing prior distribution
The prior is where we put our beliefs. So we choose the distribution wisely:
1. Empirically - we have already done some experiments and have actual data,
2. With expertise - we have expert domain knowledge on the subject,
3. With intent - we have reasons to prefer some values over the others,
4. YOLO - we have no idea and wouldn’t like to affect the outcome.
Recap of distributions
Uniform distribution
All possible values fall between minimum and
maximum bounds and have equal likelihood.
Conditions:
1. The minimum value is fixed.
2. The maximum value is fixed.
3. All values between the minimum and
maximum occur with equal likelihood.
Normal distribution
The most common distribution, which has
3 properties:
1. Some value (mean of the
distribution) is the most likely.
2. The uncertain variable could as
likely be above the mean as it could
be below the mean (symmetrical
about the mean).
3. The uncertain variable is more
likely to be in the vicinity of the
mean than further away.
Poisson distribution
describes the number of times an event occurs in a given
interval.
Conditions:
1. The number of possible occurrences in any
interval is unlimited.
2. The occurrences are independent. The number of
occurrences in one interval does not affect the
number of occurrences in other intervals.
3. The average number of occurrences must remain
the same from interval to interval
Gamma distribution
The gamma distribution is most often used as the
distribution of the amount of time until the rth
occurrence of an event in a Poisson process.
Conditions:
1. The number of possible occurrences in any unit
of measurement is not limited to a fixed
number.
2. The occurrences are independent.
3. The average number of occurrences must
remain the same from unit to unit.
Bayesian Hacker’s Toolbox
The posterior is a distribution, possibly over many
dimensions. How do we even infer that?
Bayesian Hacker’s Toolbox - sampling!
When there is no closed-form solution, we can approximate the posterior by sampling, using
a technique called MCMC (Markov Chain Monte Carlo).
PyMC also gives us a set of predefined distributions.
Markov Chain Monte Carlo - easy explanation!
1. Generate random guesses (the Monte Carlo part).
2. Generate each new guess based only on the guess right before it
(the Markov Chain part). Fun fact: this property is what makes those chains memoryless.
3. Accept new guesses if they are “moving in the right direction”;
otherwise reject them.
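The three steps above can be sketched as a tiny random-walk Metropolis sampler. This is an illustration, not what PyMC runs internally: the target (a Normal with mean 3 and standard deviation 1), the step size, and the seed are arbitrary assumptions. One refinement over the slide's wording: guesses that move "in the wrong direction" are not always rejected, they are accepted with a probability equal to the ratio of the target densities.

```python
import math
import random

def log_target(x):
    # Log-density (up to a constant) of the distribution we want to sample:
    # a Normal with mean 3 and standard deviation 1 (hypothetical target).
    return -0.5 * (x - 3.0) ** 2

def metropolis(n_samples, start=0.0, step=1.0, seed=42):
    rng = random.Random(seed)
    current = start
    samples = []
    for _ in range(n_samples):
        # Markov chain part: the proposal depends only on the current guess.
        proposal = current + rng.gauss(0.0, step)
        # Accept uphill moves always, downhill moves with probability
        # target(proposal) / target(current).
        log_ratio = log_target(proposal) - log_target(current)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal
        samples.append(current)
    return samples

samples = metropolis(20_000)
kept = samples[2_000:]  # drop the warm-up part of the chain
sample_mean = sum(kept) / len(kept)
print(f"sample mean: {sample_mean:.2f}")
```

The sample mean should land close to 3, the mean of the assumed target.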
Case Studies
Case Study #1: Price-Demand change analysis
We know when we changed the price for a particular product.
What we don’t know:
● Was this price change noticeable (enough to change the demand)?
● When did the price change start to affect the demand?
● Can models detect this from the data?
Price change effect on demand
Real-world examples are often noisy and concern thousands of products at once.
[Plot: a noisy demand series; price $99.98 before the change, $79.98 after]
We’ll use a simplistic example for this demonstration.
[Plot: a simplified demand series; price $99.98 before the change, $79.98 after]
First steps: sales distribution
1. Define the distribution of the data:
We want to predict sales - what can be the distribution?
Candidates:
Normal, Exponential, Cauchy, Zero-Inflated Binomial,
Uniform, Gamma, Geometric, Flat,
Binomial, Beta, Minimum extreme, Half Flat,
Poisson, Lognormal, Negative binomial, Logistic,
Weibull, Student’s t, Discrete uniform, Negative Normal
Way of thinking
● Discrete
● More than 3 unique values
Geometric: the number of trials until a success is achieved.
Poisson: the number of times an event occurs during a time interval.
First steps: λ distribution
1. We choose the Poisson distribution: Poisson (λ)
2. Now we must define the distribution of λ,
picking from the same list of candidate distributions as for sales.
Way of thinking
● Continuous
● Positive
● Time related
Notes on the remaining candidates:
● Event can occur at any time, randomly
● Event becomes more or less likely to occur over time
● Event can occur at any time, not necessarily randomly
Now we build the model
1. Define & run the model
2. Check the results!
On the 16th day the demand changed.
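A minimal sketch of a change-point model of this kind, in plain Python. It is not the presenters' code, and it swaps the technique: instead of sampling with MCMC in PyMC, it assumes Exponential priors on the two Poisson rates and a uniform prior on the switch day, so the rates can be integrated out analytically and the posterior over the switch day computed exactly. The sales figures are made up for illustration.

```python
import math

# Hypothetical daily sales: about 10 units/day before the price cut, about 20 after.
sales = [9, 11, 10, 8, 12, 10, 9, 11, 10, 13, 8, 10, 11, 9, 10,
         19, 22, 20, 18, 21, 23, 20, 19, 22, 21, 20, 18, 24, 20, 21]

# Assumed prior: both Poisson rates are Exponential(alpha), with prior mean
# equal to the overall average of the data.
alpha = len(sales) / sum(sales)

def log_marginal(segment):
    # log of the integral over lam of
    #   prod_i Poisson(y_i | lam) * Exponential(lam | alpha),
    # dropping the prod_i y_i! term, which is the same for every split.
    total, n = sum(segment), len(segment)
    return (math.log(alpha) + math.lgamma(total + 1)
            - (total + 1) * math.log(n + alpha))

# tau = number of days in the first regime; uniform prior over 1 .. len - 1.
taus = list(range(1, len(sales)))
log_post = [log_marginal(sales[:t]) + log_marginal(sales[t:]) for t in taus]

# Normalize in a numerically stable way.
peak = max(log_post)
weights = [math.exp(lp - peak) for lp in log_post]
total_weight = sum(weights)
posterior = [w / total_weight for w in weights]

best_tau = taus[posterior.index(max(posterior))]
print(f"demand most likely changed on day {best_tau + 1}")
```

With noisier data the posterior spreads over several candidate days, which is exactly the "was the change noticeable?" information the case study asks for.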
Conclusions 🤔 ?
1. For some products the change may not be
noticeable.
2. For some products the change can be easily spotted.
Why?
How much did the price
change?
Dig deeper!
Case Study #2: Price elasticity of demand 💸
Calculate the Price Elasticity of Demand
for all products in the store, even though some
of them are very new and, for others, the
price has almost never changed.
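For reference, price elasticity of demand is commonly computed as the relative change in demand divided by the relative change in price. A minimal sketch, assuming this simple two-point definition; the function name and the demand figures are hypothetical, only the two prices come from Case Study #1.

```python
def price_elasticity(price_before, price_after, demand_before, demand_after):
    # Relative change in demand divided by relative change in price.
    demand_change = (demand_after - demand_before) / demand_before
    price_change = (price_after - price_before) / price_before
    return demand_change / price_change

# Prices from Case Study #1; the demand figures are hypothetical.
elasticity = price_elasticity(99.98, 79.98, demand_before=10, demand_after=20)
print(f"elasticity: {elasticity:.2f}")
```

A value below -1 means demand reacts more than proportionally to the price change. With a single price change per product, or none at all, this estimate is very fragile, which is the motivation for the hierarchical model.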
Case Study #2: Hierarchical model 🏔
Hierarchical models allow you to define product-level parameters that take the
group means into account, using information about product similarity.
Group level
Product level
Shrinkage
Products are “pulled” towards the group
mean.
Why is it important?
If you have few data points or a new
product, you can compensate for this by using
knowledge from the group mean.
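Shrinkage can be illustrated with the simplest possible case: a normal-normal model with known standard deviations, where the product-level estimate is a precision-weighted average of the product's own mean and the group mean. This is a simplification chosen for illustration, not the Poisson model from the talk; the function name and all numbers are hypothetical.

```python
def shrunk_estimate(product_mean, n_obs, group_mean, noise_sd, group_sd):
    # Precision-weighted average of the product's own mean and the group mean
    # (normal-normal model with known standard deviations).
    data_precision = n_obs / noise_sd ** 2
    group_precision = 1 / group_sd ** 2
    return ((data_precision * product_mean + group_precision * group_mean)
            / (data_precision + group_precision))

# Hypothetical elasticities: the group averages -2.0, both products look like -4.0.
new_product = shrunk_estimate(-4.0, n_obs=2, group_mean=-2.0,
                              noise_sd=2.0, group_sd=1.0)
old_product = shrunk_estimate(-4.0, n_obs=200, group_mean=-2.0,
                              noise_sd=2.0, group_sd=1.0)
print(f"new product: {new_product:.2f}, established product: {old_product:.2f}")
```

The product with only two observations is pulled most of the way towards the group mean, while the product with 200 observations barely moves.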
The simple model ⛰
1. Distribution of the target - sales: Poisson (λ)
(Your target is λ = 𝑤1*x + 𝑤0)
2. Distributions on the group’s level:
a. 𝑤′1: Normal (𝛍, 𝛔)
b. 𝑤′0: Normal (𝛍, 𝛔)
3. Distributions on the product’s level:
a. 𝑤1: group’s 𝛍, 𝛔
b. 𝑤0: group’s 𝛍, 𝛔
Let’s build the model!
Check performance ⛰
Performance - explained
Mean - average of the posterior distribution
SD - standard deviation of the posterior distribution
mc_error - standard error of the posterior sample mean as an estimate of the theoretical expectation for a given
parameter. Rule of thumb: want MC error below 1-5% of the posterior SD
hpd_2.5 & hpd_97.5 - lower and upper bounds of the 95% highest posterior density (HPD) interval, which gives a 95%
posterior credible interval for each parameter
N_eff - number of effective samples. Rule of thumb: want N_eff ~= number of samples
Rhat - Gelman-Rubin convergence diagnostic. Rule of thumb: want Rhat ~= 1
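To show what Rhat measures, here is a sketch of the classic Gelman-Rubin statistic, which compares the variance between chains with the variance within chains. Libraries may use refined variants of this formula, so treat it as an illustration rather than a reimplementation of any particular summary table. The chains are simulated with arbitrary, hypothetical settings.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    # chains: list of equally long lists of samples for one parameter.
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    between = n * variance(chain_means)            # B: between-chain variance
    within = mean(variance(c) for c in chains)     # W: mean within-chain variance
    pooled = (n - 1) / n * within + between / n    # estimate of posterior variance
    return (pooled / within) ** 0.5

rng = random.Random(0)
# Hypothetical chains: four that agree, and a set where one chain is stuck elsewhere.
good = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
bad = good[:3] + [[rng.gauss(3, 1) for _ in range(1000)]]
print(f"agreeing chains: {gelman_rubin(good):.3f}")
print(f"one stray chain: {gelman_rubin(bad):.3f}")
```

Chains that explore the same distribution give a value close to 1; a chain stuck in a different region pushes the value well above 1.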
More complex model 🏔
Add more levels
(category, subcategory, etc)
Add more
parameters
(𝑤2, 𝑤3, ..., 𝑤n)
If something goes wrong 🌋
1. Check if the distribution matches the target formula
2. Plot the prior distribution, maybe your prior beliefs are absurd
3. Plot the posterior distribution, maybe your estimates are too vague
4. Try to redefine the model with an offset
5. Check possible suggestions here
6. If nothing works, reparametrize!
Useful links 👊
Library: PyMC3
Books:
Bayesian Methods for Hackers (free)
Statistical Rethinking
Good guide to distribution description (free)
Introduction to Bayesian Monte Carlo
To follow:
Thomas Wiecki (and his great blog)
Richard McElreath (also, check out his awesome lectures on )
About us
We are a boutique Data Science consultancy, specialising in leading digital transformations.
We have successfully delivered custom projects and won Data Science competitions for companies ranging from startups to
Fortune 500 companies, some of them listed below.
We organize the biggest Data Science community events, called Kaggle Days, in
cooperation with Google-owned Kaggle. Demand for AI experts is extremely
high nowadays - LogicAI is a company which has direct access to the top 3M
data scientists in the world.
Our customers have access to 2.8mln of the world’s top talent
We build the world’s biggest offline Data Science community
We build Data Science teams for our customers
”Great community with lots of diverse talent
and skill sets.”
“You rock. Being nice and generous is the most
important thing in community events and you
were!”
Our experience
1st place - Allstate [2016]
1st place - Mercari [2018]
2nd place - GE [2014]
3rd place - Am Express [2015]
3rd place - Deloitte [2015]
Problems solved:
● Which potential customers are at risk of not repaying their loans?
● Which of our current customers will stay insured with us for an entire policy term?
● What is the best price for the product I want to sell?
● How to predict flight delays over the US?
● What will future rental prices for properties across Western Australia be?
Some of our customers
Contact us
Magda@logicai.io
Mila@logicai.io

More Related Content

Similar to Resolving e commerce challenges with probabilistic programming

Module Five Normal Distributions & Hypothesis TestingTop of F.docx
Module Five Normal Distributions & Hypothesis TestingTop of F.docxModule Five Normal Distributions & Hypothesis TestingTop of F.docx
Module Five Normal Distributions & Hypothesis TestingTop of F.docx
roushhsiu
 
PA_EPGDM_2_2023.pptx
PA_EPGDM_2_2023.pptxPA_EPGDM_2_2023.pptx
PA_EPGDM_2_2023.pptx
somenathtiwary
 
G4 PROBABLITY.pptx
G4 PROBABLITY.pptxG4 PROBABLITY.pptx
G4 PROBABLITY.pptx
SmitKajbaje1
 
Significance Tests
Significance TestsSignificance Tests
Significance Tests
Anthony J. Evans
 
250 words, no more than 500· Focus on what you learned that made.docx
250 words, no more than 500· Focus on what you learned that made.docx250 words, no more than 500· Focus on what you learned that made.docx
250 words, no more than 500· Focus on what you learned that made.docx
eugeniadean34240
 
RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3
DR. TIRIMBA IBRAHIM
 
Basic Statistical Concepts.pdf
Basic Statistical Concepts.pdfBasic Statistical Concepts.pdf
Basic Statistical Concepts.pdf
KwangheeJung
 
STSTISTICS AND PROBABILITY THEORY .pptx
STSTISTICS AND PROBABILITY THEORY  .pptxSTSTISTICS AND PROBABILITY THEORY  .pptx
STSTISTICS AND PROBABILITY THEORY .pptx
VenuKumar65
 
Probability introduction for non-math people
Probability introduction for non-math peopleProbability introduction for non-math people
Probability introduction for non-math people
GuangYang92
 
Bus 173_3.pptx
Bus 173_3.pptxBus 173_3.pptx
Bus 173_3.pptx
ssuserbea996
 
regression.pptx
regression.pptxregression.pptx
regression.pptx
aneeshs28
 
probability.pptx
probability.pptxprobability.pptx
probability.pptx
bisan3
 
Quantitative Methods for Management_MBA_Bharathiar University probability dis...
Quantitative Methods for Management_MBA_Bharathiar University probability dis...Quantitative Methods for Management_MBA_Bharathiar University probability dis...
Quantitative Methods for Management_MBA_Bharathiar University probability dis...
Victor Seelan
 
Class 5 Hypothesis & Normal Disdribution.pptx
Class 5 Hypothesis & Normal Disdribution.pptxClass 5 Hypothesis & Normal Disdribution.pptx
Class 5 Hypothesis & Normal Disdribution.pptx
CallplanetsDeveloper
 
Basic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingBasic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingPenn State University
 
Sampling distribution by Dr. Ruchi Jain
Sampling distribution by Dr. Ruchi JainSampling distribution by Dr. Ruchi Jain
Sampling distribution by Dr. Ruchi Jain
RuchiJainRuchiJain
 
Probability Distributions
Probability DistributionsProbability Distributions
Probability DistributionsHarish Lunani
 
Generalized Linear Models for Between-Subjects Designs
Generalized Linear Models for Between-Subjects DesignsGeneralized Linear Models for Between-Subjects Designs
Generalized Linear Models for Between-Subjects Designs
smackinnon
 
Research method ch07 statistical methods 1
Research method ch07 statistical methods 1Research method ch07 statistical methods 1
Research method ch07 statistical methods 1naranbatn
 
Qm 0809
Qm 0809 Qm 0809
Qm 0809
8430025979
 

Similar to Resolving e commerce challenges with probabilistic programming (20)

Module Five Normal Distributions & Hypothesis TestingTop of F.docx
Module Five Normal Distributions & Hypothesis TestingTop of F.docxModule Five Normal Distributions & Hypothesis TestingTop of F.docx
Module Five Normal Distributions & Hypothesis TestingTop of F.docx
 
PA_EPGDM_2_2023.pptx
PA_EPGDM_2_2023.pptxPA_EPGDM_2_2023.pptx
PA_EPGDM_2_2023.pptx
 
G4 PROBABLITY.pptx
G4 PROBABLITY.pptxG4 PROBABLITY.pptx
G4 PROBABLITY.pptx
 
Significance Tests
Significance TestsSignificance Tests
Significance Tests
 
250 words, no more than 500· Focus on what you learned that made.docx
250 words, no more than 500· Focus on what you learned that made.docx250 words, no more than 500· Focus on what you learned that made.docx
250 words, no more than 500· Focus on what you learned that made.docx
 
RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3
 
Basic Statistical Concepts.pdf
Basic Statistical Concepts.pdfBasic Statistical Concepts.pdf
Basic Statistical Concepts.pdf
 
STSTISTICS AND PROBABILITY THEORY .pptx
STSTISTICS AND PROBABILITY THEORY  .pptxSTSTISTICS AND PROBABILITY THEORY  .pptx
STSTISTICS AND PROBABILITY THEORY .pptx
 
Probability introduction for non-math people
Probability introduction for non-math peopleProbability introduction for non-math people
Probability introduction for non-math people
 
Bus 173_3.pptx
Bus 173_3.pptxBus 173_3.pptx
Bus 173_3.pptx
 
regression.pptx
regression.pptxregression.pptx
regression.pptx
 
probability.pptx
probability.pptxprobability.pptx
probability.pptx
 
Quantitative Methods for Management_MBA_Bharathiar University probability dis...
Quantitative Methods for Management_MBA_Bharathiar University probability dis...Quantitative Methods for Management_MBA_Bharathiar University probability dis...
Quantitative Methods for Management_MBA_Bharathiar University probability dis...
 
Class 5 Hypothesis & Normal Disdribution.pptx
Class 5 Hypothesis & Normal Disdribution.pptxClass 5 Hypothesis & Normal Disdribution.pptx
Class 5 Hypothesis & Normal Disdribution.pptx
 
Basic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingBasic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-Making
 
Sampling distribution by Dr. Ruchi Jain
Sampling distribution by Dr. Ruchi JainSampling distribution by Dr. Ruchi Jain
Sampling distribution by Dr. Ruchi Jain
 
Probability Distributions
Probability DistributionsProbability Distributions
Probability Distributions
 
Generalized Linear Models for Between-Subjects Designs
Generalized Linear Models for Between-Subjects DesignsGeneralized Linear Models for Between-Subjects Designs
Generalized Linear Models for Between-Subjects Designs
 
Research method ch07 statistical methods 1
Research method ch07 statistical methods 1Research method ch07 statistical methods 1
Research method ch07 statistical methods 1
 
Qm 0809
Qm 0809 Qm 0809
Qm 0809
 

Recently uploaded

FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 

Recently uploaded (20)

FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 

Resolving e commerce challenges with probabilistic programming

  • 1. Resolving e-commerce challenges with probabilistic programming Liudmyla Kyrashchuk Magdalena Wójcik
  • 2. Content of the presentation 1) What is the Bayesian approach? 2) Profits of going Bayesian 3) Recap of distributions 4) Toolbox for a Bayesian Hacker 5) Case Study #1 - Price-demand change analysis 6) Case Study #2 - Hierarchical modeling
  • 3. Bayesian programming is a best all-in-one statistical tool for hackers. 🔨 My hypothesis:
  • 4. What is the Bayesian approach? Thomas Bayes - XVIII century mathematician who interpreted probability as the degree of belief, and not the simple frequency of events.
  • 5. Bayes’ Theorem Posterior probability of A given evidence B Prior probability of A Likelihood of collecting evidence B when A is true Probability of collecting B under all circumstances
  • 6. Mandatory naive example 🤒 What is the probability of Bob having a relatively rare disease, given that he received positive result from the medical test? Disease occurs in 1 / 250 people. Test has True Positive Rate of 0.99.
  • 7. Mandatory naive example 🤒 What is the probability of Bob having a relatively rare disease, given that he received positive result from the medical test? Disease occurs in 1 / 250 people. Test has True Positive Rate of 0.99. 0.99
  • 8. Mandatory naive example 🤒 What is the probability of Bob having a relatively rare disease, given that he received positive result from the medical test? Disease occurs in 1 / 250 people. Test has True Positive Rate of 0.99. 0.004 0.99
  • 9. Mandatory naive example 🤒 What is the probability of Bob having a relatively rare disease, given that he received positive result from the medical test? Disease occurs in 1 / 250 people. Test has True Positive Rate of 0.99. 0.004 0.99 0.99×1 + 0.01×249 250
  • 10. Mandatory naive example 🤒 What is the probability of Bob having a relatively rare disease, given that he received positive result from the medical test? Disease occurs in 1 / 250 people. Test has True Positive Rate of 0.99. 0.004 0.99 0.28 0.99×1 + 0.01×249 250
  • 11. Bayes’ Theorem in 🔍 of Data Scientist Posterior distribution Our updated belief Prior belief as a distribution Collected data Constant normalization term
  • 12. Profits of going Bayesian ● Instead of one value, we get a distribution of likely values. ● We get information on certainty of model output. ● More ways to compare outcomes. ● Spot for expert knowledge already incorporated in the model. ● Easy way to include external knowledge when data set is small.
  • 13. It’s where we put our beliefs. So we choose the distribution wisely: 1. Empirically - we have already done some experiments and have actual data, 2. With expertise - we have expert domain knowledge on the subject, 3. With intent - we have reasons to prefer some values over the others, 4. YOLO - we have no idea and wouldn’t like to affect the outcome. Choosing prior distribution
  • 14. Uniform distribution all possible values fall between minimum and maximum bounds and have equal likelihood. Conditions: 1. The minimum value is fixed. 2. The maximum value is fixed. 3. All values between the minimum and maximum occur with equal likelihood. Recap of distributions
  • 15. Normal distribution The most common distribution, which has 3 properties: 1. Some value (mean of the distribution) is the most likely. 2. The uncertain variable could as likely be above the mean as it could be below the mean (symmetrical about the mean). 3. The uncertain variable is more likely to be in the vicinity of the mean than further away. Recap of distributions
  • 16. Poisson distribution describes the number of times an event occurs in a given interval. Conditions: 1. The number of possible occurrences in any interval is unlimited. 2. The occurrences are independent. The number of occurrences in one interval does not affect the number of occurrences in other intervals. 3. The average number of occurrences must remain the same from interval to interval Recap of distributions
  • 17. Gamma distribution The gamma distribution is most often used as the distribution of the amount of time until the rth occurrence of an event in a Poisson process. Conditions: 1. The number of possible occurrences in any unit of measurement is not limited to a fixed number. 2. The occurrences are independent. 3. The average number of occurrences must remain the same from unit to unit. Recap of distributions
  • 18. Bayesian Hacker’s Toolbox That’s a distribution, possibly over many dimensions. How do we even infer that?
  • 19. Bayesian Hacker’s Toolbox - sampling! When there is no closed-form solution, we can approximate the posterior with sampling, using a technique named MCMC (Markov Chain Monte Carlo). PyMC also gives us a set of pre-defined distributions.
  • 20. Markov Chain Monte Carlo - easy explanation! 1. Generate random guesses (the Monte Carlo part). 2. Generate the next guesses based only on the guesses just before them (the Markov Chain part). Fun fact: this property means that those chains are memoryless. 3. Accept new guesses if they are “moving in the right direction”; otherwise reject them with some probability.
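The three steps above can be sketched as a random-walk Metropolis sampler, one of the simplest MCMC variants (not necessarily the sampler PyMC would pick). Assumptions for illustration only: the data are a handful of made-up daily sales counts, the likelihood is Poisson(λ), and the prior on λ is Exponential with rate 0.1. A better guess is always accepted and a worse one is accepted only sometimes.

```python
import math
import random


def log_posterior(lam, counts, prior_rate):
    """Unnormalised log posterior of a Poisson rate with an Exponential prior."""
    if lam <= 0:
        return -math.inf  # the rate must be positive
    return sum(counts) * math.log(lam) - (len(counts) + prior_rate) * lam


def metropolis(counts, prior_rate=0.1, n_samples=20000, step=1.0, seed=0):
    rng = random.Random(seed)
    current = sum(counts) / len(counts)  # start from the sample mean
    current_lp = log_posterior(current, counts, prior_rate)
    samples = []
    for _ in range(n_samples):
        # Steps 1 and 2: a random guess that depends only on the current guess.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, counts, prior_rate)
        # Step 3: always accept a better guess, accept a worse one
        # with probability equal to the posterior ratio.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


sales = [10, 9, 11, 10, 12, 8, 10, 11]  # hypothetical daily sales
samples = metropolis(sales)[2000:]      # drop the first draws as burn-in
print("posterior mean of lambda:", sum(samples) / len(samples))
```

For this particular prior and likelihood the posterior is known in closed form (a gamma distribution with mean (1 + Σx) / (n + 0.1)), which makes it a handy sanity check for the sampler.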
  • 22. Case Study #1: Price-Demand change analysis We know when we changed the price for a particular product. What we don’t know: ● Was this price change noticeable (enough to change the demand)? ● When did the price change start to affect the demand? ● Can models detect this from the data?
  • 23. Price change effect on demand Price $99.98 Price $79.98 Real-world examples are often noisy and concern thousands of products at once. Below is an example of the noisy plot.
  • 24. Price change effect on demand Price $99.98 Price $79.98 We’ll use a simplistic example for this demonstration.
  • 28. First steps: sales distribution 1. Define the distribution of the data: we want to predict sales, so what can the distribution be? Normal, Exponential, Cauchy, Zero Inflated Binomial, Uniform, Gamma, Geometric, Flat, Binomial, Beta, Minimum extreme, Half Flat, Poisson, Lognormal, Negative binomial, Logistic, Weibull, Student’s t, Discrete uniform, Negative Normal. Way of thinking: ● Discrete ● More than 3 unique values. Geometric: number of trials until the first success is achieved. Poisson: number of times an event occurs during a time interval.
  • 33. First steps: λ distribution 1. We choose the Poisson distribution: Poisson (λ) 2. Now we must define the distribution of λ: Normal, Exponential, Cauchy, Zero Inflated Binomial, Uniform, Gamma, Geometric, Flat, Binomial, Beta, Minimum extreme, Half Flat, Poisson, Lognormal, Negative binomial, Logistic, Weibull, Student’s t, Discrete uniform, Negative Normal. Way of thinking: ● Continuous ● Positive ● Time related. Notes on the remaining candidates: the event can occur at any time, randomly; the event becomes more or less likely to occur over time; the event can occur at any time, not necessarily randomly.
  • 34. Now we build the model 1. Define & Run the model
  • 35. Now we build the model 2. Check the results! On the 16th day the demand changed.
  • 36. Conclusions 🤔 ? 1. For some products the change may not be noticeable. 2. For some products the change can be easily spotted. Why? How much did the price change? Dig deeper!
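The model code on the slides did not survive the transcript, so here is a minimal stand-in sketch rather than the talk's PyMC model. It assumes daily sales follow Poisson(λ1) before an unknown switch day and Poisson(λ2) after it, with Exponential priors on both rates (rate set to 1 / mean sales) and a uniform prior on the switch day. Instead of MCMC it integrates the rates out analytically and enumerates every possible switch day, which is feasible because this toy problem is tiny. The sales numbers and all function names are made up.

```python
import math


def log_marginal(total, n, a, b):
    """Log marginal likelihood of n Poisson counts summing to `total`
    under a Gamma(a, rate=b) prior on the rate. Terms that are the same
    for every switch day (the factorials of the counts) are dropped."""
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + total) - (a + total) * math.log(b + n))


def switchpoint_posterior(sales, a=1.0, b=None):
    """Posterior over tau, the number of days sold at the old demand level."""
    n = len(sales)
    if b is None:
        b = n / sum(sales)  # assumption: prior rate = 1 / mean sales
    log_post = {}
    for tau in range(1, n):  # uniform prior over possible switch days
        before, after = sales[:tau], sales[tau:]
        log_post[tau] = (log_marginal(sum(before), len(before), a, b)
                         + log_marginal(sum(after), len(after), a, b))
    top = max(log_post.values())
    weights = {tau: math.exp(lp - top) for tau, lp in log_post.items()}
    norm = sum(weights.values())
    return {tau: w / norm for tau, w in weights.items()}


# Hypothetical data: 15 days at the old price, then 15 days at the lower price.
SALES = [10, 9, 11, 10, 12, 8, 10, 11, 9, 10, 10, 11, 9, 10, 10,
         20, 22, 19, 21, 20, 18, 23, 20, 21, 19, 20, 22, 20, 19, 21]

posterior = switchpoint_posterior(SALES)
best_tau = max(posterior, key=posterior.get)
print("most likely first day of the new demand level:", best_tau + 1)
print("posterior probability:", round(posterior[best_tau], 3))
```

With noisier real-world data the posterior spreads over several days instead of peaking on one, which is one way to read "the change may not be noticeable".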
  • 37. Case Study #2: Price elasticity of demand 💸 Calculate Price Elasticity of Demand for all products in the store. The catch: some products are very new, and for some the price has almost never changed.
  • 38. Case Study #2: Hierarchical model 🏔 Hierarchical models allow you to define parameters that take group means into account, using information about product similarity. Levels: group level, product level.
  • 39. Shrinkage Products are “pulled” towards the group mean. Why is it important? If you have few data points or a new product, you can compensate by borrowing knowledge from the group mean.
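Shrinkage can be written down exactly in the simplest possible setting. The sketch below assumes a normal-normal model with known standard deviations, which is a simplification and not the Poisson model used in this case study: the shrunk estimate is a precision-weighted average of the product's own mean and the group mean. The function name and the elasticity numbers are hypothetical.

```python
def shrunk_mean(product_values, group_mean, noise_sd, group_sd):
    """Posterior mean of a product-level parameter under a normal-normal model.

    product_values - noisy observations for one product
    group_mean     - mean of the group the product belongs to
    noise_sd       - assumed known SD of a single observation
    group_sd       - assumed known SD of product parameters within the group
    """
    n = len(product_values)
    if n == 0:
        return group_mean  # a brand new product falls back to the group mean
    product_mean = sum(product_values) / n
    data_precision = n / noise_sd ** 2
    prior_precision = 1.0 / group_sd ** 2
    return ((data_precision * product_mean + prior_precision * group_mean)
            / (data_precision + prior_precision))


group_elasticity = -1.5  # hypothetical group-level elasticity
print(shrunk_mean([], group_elasticity, 1.0, 0.5))           # no data: group mean
print(shrunk_mean([-3.0], group_elasticity, 1.0, 0.5))       # 1 point: pulled strongly
print(shrunk_mean([-3.0] * 100, group_elasticity, 1.0, 0.5)) # 100 points: barely pulled
```

The more data a product has, the less it is pulled, so well-observed products keep their own estimate while new ones inherit the group's.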
  • 40. The simple model ⛰ 1. Distribution of the target (sales): Poisson (λ), where λ = 𝑤1*x + 𝑤0. 2. Distributions on the group level: a. 𝑤′1 ~ Normal (𝛍, 𝛔) b. 𝑤′0 ~ Normal (𝛍, 𝛔). 3. Distributions on the product level: a. 𝑤1 uses the group’s 𝛍, 𝛔 b. 𝑤0 uses the group’s 𝛍, 𝛔. Let’s build the model!
  • 43. Performance - explained Mean - average of the distribution. SD - standard deviation of the distribution. mc_error - standard error of the posterior sample mean as an estimate of the theoretical expectation for a given parameter. Rule of thumb: want MC error below 1-5% of the posterior SD. hpd_2.5 & hpd_97.5 - lower and upper bounds of the 95% highest posterior density (HPD) credible interval for each parameter. N_eff - number of effective samples. Rule of thumb: want N_eff ~= number of samples. Rhat - Gelman-Rubin convergence diagnostic. Rule of thumb: want Rhat ~= 1.
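To show what Rhat measures, here is a small sketch of the classic Gelman-Rubin statistic, which compares the variance between chains with the variance within chains. It is the textbook formula, not necessarily the exact variant your PyMC version reports, and the "chains" are synthetic normal draws standing in for real sampler output.

```python
import random
import statistics


def gelman_rubin(chains):
    """Classic Rhat for a list of equally long chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.mean(chain) for chain in chains]
    # W: average variance inside each chain.
    within = statistics.mean([statistics.variance(chain) for chain in chains])
    # B: variance of the chain means, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1)
# Four chains exploring the same distribution: Rhat should be close to 1.
converged = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# Two chains stuck in different places: Rhat should be far above 1.
stuck = [[rng.gauss(mu, 1) for _ in range(1000)] for mu in (0, 5)]

print("Rhat, converged chains:", gelman_rubin(converged))
print("Rhat, stuck chains:", gelman_rubin(stuck))
```

A value well above 1 means the chains disagree with each other, so their samples should not be trusted yet.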
  • 44. More complex model 🏔 Add more levels (category, subcategory, etc) Add more parameters (𝑤2 , 𝑤3 ,... 𝑤n )
  • 45. If something is wrong 🌋 1. Check if the distribution matches the target formula 2. Plot the prior distribution, maybe your prior beliefs are absurd 3. Plot the posterior distribution, maybe your estimates are too vague 4. Try to redefine the model with an offset 5. Check possible suggestions here 6. If nothing works, reparametrize!
  • 46. Useful links 👊 Library: PyMC3 Books: Bayesian Methods for Hackers (free) Statistical Rethinking Good guide to distribution description (free) Introduction to Bayesian Monte Carlo To follow: Thomas Wiecki (and his great blog) Richard McElreath (also, check out his awesome lectures on )
  • 48. About us We are a boutique Data Science consultancy, specialising in leading digital transformations. We have successfully delivered custom projects and won Data Science competitions for companies ranging from startups to Fortune 500, some of them listed below. We organize the biggest Data Science community events, called Kaggle Days, in cooperation with Google-owned Kaggle. Demand for AI experts is extremely high nowadays - LogicAI is a company which has direct access to the top 3M data scientists in the world.
  • 49. Our customers have access to 2.8mln world’s top talent We build world’s biggest offline Data Science community We build Data Science teams for our customers + = ”Great community with lots of diverse talent and skill sets.” “You rock. Being nice and generous is the most important thing in community events and you were!”
  • 50. 1st place Allstate [2016] 1st place Mercari [2018] 2nd place GE [2014] Problem solved: Which potential customers are at risk of not repaying their loans? 3rd place Am Express [2015] 3rd place Deloitte [2015] Problem solved: Which of our current customers will stay insured with us for an entire policy term? Problem solved: What is the best price for the product I want to sell? Problem solved: How to predict flight delays over The US? Problem solved: What will future rental prices for properties across Western Australia be? Our experience
  • 51. Some of our customers