Modelling for Decisions
Using Monte Carlo simulation, Bayesian inference and a
lot of common sense
A quick introduction
Photo credits at www.coppelia.io/photo-credits/
Who is this person?
Simon Raper
Founder of data sciences
service company called
COPPELIA
Started coding
when I was 8 on a
ZX-81
Then abandoned
the sciences until I
was 25! And was
shocked
But I was really lucky
Dot com boom gave me a
crash course in IT
(allowed to do
ANYTHING!)
Did machine learning not
financial engineering!
Lots of business experience,
especially in media
(Channel 4, ITV, News UK,
McDonalds, Unilever, AOL,
Credit-Suisse, Jaguar,
Sainsbury’s)
3
Areas of Expertise
classical statistics
(R, SPSS, SAS, matlab)
bayesian statistics
(R, winbugs)
simulation
(agent-based, system dynamics)
big data
(aws, hadoop, hive, spark, mahout, mongodb)
machine learning
(R, mahout, mllib)
coding
(R, python, java, sql, javascript, d3)
4
Some past projects
Machine4 at Channel 4
The Content Universe at
Channel 4
Market Simulation at
mindshare
Bayesian and mixed
effects modelling at
mindshare
Drunks and Lampposts
5
Some of the things we will be looking at today
● How to build the right model to answer a question and quickly!
● Picking the right function for the job
● Some unexpected ways to use statistical techniques
● Understanding the limitations of your model
● Taking it further
○ Using simulation to understand its dynamics
○ Using Monte Carlo simulation to understand the impact of
uncertainty in the inputs
○ Using Bayesian inference to see how the data and the model
impact current beliefs
6
To begin with a controversial statement!
The majority of statistical models used in business are either unnecessary or
used inappropriately.
There’s a reluctance to ask why a statistical model is needed and whether it is worth
the effort of development.
In many cases we would be better served by clear thinking about a specific problem
(how the data relates to the business decision), resorting to statistical modelling (as
opposed to plain old-fashioned mathematical modelling) only where the benefits are
obvious.
7
So what does make a good model?
A good model in this sense has the following virtues. (They might seem obvious but it
is surprising how often they are forgotten!)
● It captures all the features of the world that are relevant to the decision and
leaves out those which are not
● Its purpose is to relate the available data to the decision
● It only uses statistical theory when the benefits outweigh the costs
● It incorporates common sense assumptions
● It incorporates uncertainty
● Its inadequacies are understood and communicated to the decision maker
8
Some wisdom to keep in the back of your head
There is a quote attributed to John Tukey (himself a founding figure in statistics)
“An approximate answer to the right problem is worth a good deal more
than an exact answer to an approximate problem.”
And another very popular but always true (almost by definition) quote by George Box
“All models are wrong but some are useful”
9
Now for a real decision and some data
The decision: The CMO has to decide on next year's marketing budget. She would like
to know how much she should spend in total on product P.
The available data are:
● A time series of weekly sales for product P going back five years
● A time series of weekly marketing spend for product P going back five years
● Annual sales figures for P and its three main competitors going back five years
● Annual marketing spend for P and its three main competitors going back five
years
● Some research showing the demographic profile of buyers of product P and the
amount of switching there is in the category
10
What they never mention in the textbooks!
The work needs to be done in a day and there is only one person who can
work on it. (Note the time and resource constraints have a huge impact on
the choice of approach)
11
The paranoid statistician’s checklist
● Is it representative?
● How well does it cover all the
possibilities?
● Is it accurate?
● Are there missing values?
12
Always start by looking at the data
13
The next move: add as much info as you can
Where can you find this information?
1. Common sense
2. Questions to the decision maker (or anyone else who
understands the domain)
3. Logical constraints
14
And list all your common sense assumptions
(nothing is too obvious)
1. If you don't spend anything then there will be no uplift due to marketing spend!
2. There's a threshold below which any spend will be ineffective. Obviously if I spend only £10 nothing is
going to happen (unless it's bribing a single customer!)
3. There's an eventual limit to what marketing spend can do (it can't generate more sales than there
are people who can buy the product)
4. It's likely that marketing spend will be most effective on those who are least loyal to a competitor
brand
5. For business/political reasons there's a minimum and a maximum possible budget available
6. The effectiveness of marketing spend will be constrained by the reach of our marketing channels
7. The effectiveness of marketing spend will be determined by competitor spend
8. There will be a default position which the decision maker resorts to in the absence of any
information from you (e.g. spend the same as last year)
9. There's a whole load of other factors (creative, choice of channels, overall strategy) that will affect
the impact of the marketing spend
15
You can tame a problem by picking the right
function
16
We have good
reasons for picking
this one
The problem is reduced to finding values for the
parameters
Some beer mat calculations for L:
11.5 million men who would buy the product
product lasts 2 weeks
cost £1
max annual sales 26 x 11.5 ≈ 300 million
sales of all four brands are 290 million so 10 million headroom
90% are loyal buyers, 10% switch regularly
P has 50% of the market and so already has 5% of the 10%, with another 5% available
0.05 x 290 + 0.62 x 10 ≈ 21 million
only 15% reachable by media: 21 x 0.15 ≈ 3 million
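The arithmetic above can be replayed in a few lines, which makes it easier to revisit each assumption later. This is only a sketch of the slide's own figures: the variable names are invented for illustration, and the 0.62 factor is copied from the slide without knowing how it was derived.

```python
# Back-of-the-beer-mat estimate of L, following the slide's figures.
# All sales figures are in millions per year.

buyers = 11.5             # men who would buy the product (millions)
purchases_per_year = 26   # product lasts 2 weeks, so 52 / 2 purchases a year
price = 1.0               # £1 per unit

max_annual_sales = buyers * purchases_per_year * price  # 299, shown as 300 on the slide
category_sales = 290.0                                  # sales of all four brands
headroom = 300.0 - category_sales                       # slide works from the rounded 300

switchable_share = 0.05   # the other 5% of switchers not already buying P
headroom_share = 0.62     # factor used on the slide; its origin is not stated there
reach = 0.15              # fraction reachable by media

winnable = switchable_share * category_sales + headroom_share * headroom
L = winnable * reach

print(f"max annual sales: {max_annual_sales:.0f} million")
print(f"winnable sales:   {winnable:.1f} million")
print(f"L (reachable):    {L:.1f} million")
```

Keeping every input as a named value means the later sensitivity checks only need to change one line at a time.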
17
Does this seem very very
rough? Yes. But we are taking
note of that. Later we will
look at how sensitive our
results are to these
assumptions.
The data should help us here but … an impasse: we
don’t have the uplifts
Call in the econometricians for a 3 month project?
Are we really stuck
though?
18
The solution is common sense and some nice tricks!
19
Yes it’s rough but it does the job: we can make
decisions
20
And now the important thing is understanding how it
is wrong and what that means!
1. Competitors not dealt with
2. Conditional on assumptions
3. Confounding factors
4. Scale of precision
5. Not a statistical model
21
Nevertheless….
Another example using the logistic curve
A web start-up has just launched its new product. Customers pay per day to use the product so
the number of customers can drop as well as rise over time. However, word does seem to be
spreading as the daily number of customers appears to be climbing.
They want to know two things:
1. When should they spend their marketing budget?
2. For financial planning purposes they would like to know when the adoption curve will start
to level out. They have done their own market sizing work and they estimate that this will
happen at about 4000 customers a day. At their most pessimistic they put it at 3000 and at
the most optimistic they say 5000.
22
We can use the simulation to understand the impact
of feedback loops
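A minimal sketch of what such a simulation might look like for the start-up example, assuming plain logistic growth. Word of mouth is the reinforcing loop (more customers bring more customers) and the market ceiling is the balancing loop. The starting count of 50 customers and the 10% daily growth rate are made-up values; only the ceiling of 4000 customers a day comes from the slides.

```python
def simulate_adoption(n0, growth_rate, capacity, days):
    """Discrete-time logistic growth: daily change = r * N * (1 - N / L)."""
    customers = [float(n0)]
    for _ in range(days):
        n = customers[-1]
        # Reinforcing loop: more customers means more word of mouth.
        # Balancing loop: growth slows as n approaches the market ceiling.
        customers.append(n + growth_rate * n * (1 - n / capacity))
    return customers


# Hypothetical starting point and growth rate; capacity taken from the slide.
path = simulate_adoption(n0=50, growth_rate=0.1, capacity=4000, days=200)

# First day on which daily customers pass 95% of the ceiling.
levelling_day = next(day for day, n in enumerate(path) if n >= 0.95 * 4000)
print(f"customers on day 200: {path[-1]:.0f}")
print(f"curve passes 95% of the ceiling on day {levelling_day}")
```

Changing `growth_rate` or `capacity` and rerunning shows how strongly each loop shapes the point at which the curve levels out.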
23
And we can use Monte Carlo simulation to explore
the impact of uncertainty
A wide concept, but in our case we are talking about using computer-simulated random
sampling to model the effect of uncertainty in the inputs to a system on the outputs of that
system.
1. Define inputs
2. Generate inputs from a probability distribution
3. Perform computation on inputs
4. Aggregate results
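The four steps can be sketched for the start-up example as follows. The market ceiling is drawn from a triangular distribution built from the start-up's own pessimistic, central and optimistic figures (3000, 4000, 5000). Everything else is an assumption made for illustration: the logistic form, the starting count of 50, the uniform range for the growth rate and the choice of day 90 as the output of interest.

```python
import math
import random
import statistics


def customers_on_day(t, n0, growth_rate, capacity):
    """Closed-form logistic curve: N(t) = L / (1 + (L/n0 - 1) * exp(-r * t))."""
    return capacity / (1 + (capacity / n0 - 1) * math.exp(-growth_rate * t))


def monte_carlo(n_draws=10_000, day=90, n0=50, seed=1):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    outputs = []
    for _ in range(n_draws):
        # Steps 1-2: define the uncertain inputs and draw them at random.
        capacity = rng.triangular(3000, 5000, 4000)  # low, high, mode
        growth_rate = rng.uniform(0.08, 0.12)        # assumed range
        # Step 3: run the computation on this draw.
        outputs.append(customers_on_day(day, n0, growth_rate, capacity))
    # Step 4: aggregate the outputs into a summary.
    cuts = statistics.quantiles(outputs, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    return {"p5": cuts[0], "median": statistics.median(outputs), "p95": cuts[-1]}


summary = monte_carlo()
print(summary)
```

The gap between `p5` and `p95` is the point of the exercise: it shows how much of the uncertainty in the inputs survives into the number the decision maker will actually use.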
24
Finally we might be interested in what the data says
about our assumptions
A Bayesian example: A wet umbrella
● Prior belief = Fairly certain it is not raining
● Data = Man walks into the room with a wet umbrella
● Model = Wet umbrellas highly improbable without rain
● Posterior belief: Shifted to fairly certain it is raining
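The umbrella example can be put into numbers with Bayes' rule. The probabilities below are invented to match the wording of the slide ("fairly certain", "highly improbable"); they are not taken from the talk.

```python
def posterior_rain(prior_rain, p_wet_given_rain, p_wet_given_dry):
    """Bayes' rule: P(rain | wet) = P(wet | rain) * P(rain) / P(wet)."""
    p_wet = prior_rain * p_wet_given_rain + (1 - prior_rain) * p_wet_given_dry
    return prior_rain * p_wet_given_rain / p_wet


# Assumed values for illustration only.
prior = 0.10            # prior belief: fairly certain it is not raining
p_wet_if_rain = 0.90    # model: an umbrella carried in rain is usually wet
p_wet_if_dry = 0.01     # model: wet umbrellas highly improbable without rain

posterior = posterior_rain(prior, p_wet_if_rain, p_wet_if_dry)
print(f"P(rain) before: {prior:.2f}, after seeing the umbrella: {posterior:.2f}")
```

With these numbers a single observation moves the belief from 10% to roughly 91%, because the data is about ninety times more likely under rain than under no rain.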
25
We can use Bayesian methods to understand how
the data might update our beliefs about L
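One simple way to do this is a grid approximation: score each candidate value of L by prior times likelihood and normalise. The sketch below assumes the start-up example, with the 3000 / 4000 / 5000 range as a triangular prior. The observations, the growth rate, the starting count and the noise level are all hypothetical, and treating the noise as Gaussian is a modelling choice made here, not something stated in the talk.

```python
import math


def logistic(t, n0, r, capacity):
    return capacity / (1 + (capacity / n0 - 1) * math.exp(-r * t))


def triangular_pdf(x, low, mode, high):
    if x < low or x > high:
        return 0.0
    if x <= mode:
        return 2 * (x - low) / ((high - low) * (mode - low))
    return 2 * (high - x) / ((high - low) * (high - mode))


def posterior_over_capacity(observations, grid, n0=50, r=0.1, sigma=100.0):
    """observations: list of (day, customers). Returns normalised posterior weights."""
    log_weights = []
    for capacity in grid:
        prior = triangular_pdf(capacity, 3000, 4000, 5000)
        if prior == 0.0:
            log_weights.append(-math.inf)  # ruled out by the prior
            continue
        # Gaussian log-likelihood of the data around the logistic curve.
        log_lik = sum(-0.5 * ((n - logistic(t, n0, r, capacity)) / sigma) ** 2
                      for t, n in observations)
        log_weights.append(math.log(prior) + log_lik)
    peak = max(log_weights)
    weights = [math.exp(w - peak) for w in log_weights]  # subtract peak for stability
    total = sum(weights)
    return [w / total for w in weights]


grid = list(range(3000, 5001, 50))
# Hypothetical observations (day, daily customers), made up for illustration.
observations = [(60, 3000), (70, 3300), (80, 3450)]
posterior = posterior_over_capacity(observations, grid)
best = grid[posterior.index(max(posterior))]
print(f"most probable L after seeing the data: {best}")
```

Comparing the posterior weights with the triangular prior shows how far the data has moved the belief away from the central estimate of 4000.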
26
A quick recap
● How to build the right model to answer a question and quickly!
● Picking the right function for the job
● Some unexpected ways to use statistical techniques
● Understanding the limitations of your model
● Taking it further
○ Using simulation to understand its dynamics
○ Using Monte Carlo simulation to understand the impact of
uncertainty in the inputs
○ Using Bayesian inference to see how the data and the model
impact current beliefs
27
28
Thank you
If you’d like to know more talk to me at simon@coppelia.io
Follow me on twitter @coppeliamla
Or visit my blog www.coppelia.io/blog

Modelling for decisions

  • 1. Modelling for Decisions Using Monte Carlo simulation, Bayesian inference and a lot of common sense
  • 2. A quick introduction Photo credits at www.coppelia. io/photo-credits/
  • 3. Who is this person? Simon Raper Founder of data sciences service company called COPPELIA Started coding when I was 8 on a ZX-81 Then abandoned the sciences until I was 25! And was shocked But I was really lucky Dot com boom gave me a crash course in IT (allowed to do ANYTHING!) Did machine learning not financial engineering! Lots of business experience, especially in media (Channel 4, ITV, News UK, McDonalds, Unilever, AOL, Credit-Suisse, Jaguar, Sainsbury’s) 3
  • 4. Areas of Expertise classical statistics (R, SPSS, SAS, matlab) bayesian statistics (R, winbugs) simulation (agent-based, system dynamics) big data (aws, hadoop, hive, spark, mahout, mongodb) machine learning (R, mahout, mllib) coding (R, python, java, sql, javascript, d3) 4
  • 5. Some past projects Machine4 at Channel 4 The Content Universe at Channel 4 Market Simulation at mindshare Bayesian and mixed effects modelling at mindshare Drunks and Lampposts 5
  • 6. Some of the things we will be looking at today ● How to build the right model to answer a question and quickly! ● Picking the right function for the job ● Some unexpected ways to use statistical techniques ● Understanding the limitations of your model ● Taking it further ○ Using simulation to understand its dynamics ○ Using Monte Carlo simulation to understand the impact of uncertainty in the inputs ○ Using Bayesian inference to see how the data and the model impact current beliefs 6
  • 7. To begin with a controversial statement! The majority of statistical models used in business are either unnecessary or used inappropriately. There’s a reluctance to ask why a statistical model is needed and whether it is worth the effort of development. In many cases we would be better served by clear thinking about a specific problem (how the data relates to the business decision) resorting to statistical modelling (as opposed to plain old fashioned mathematical modelling) only where the benefits are obvious. 7
  • 8. So what does make a good model? A good model in this sense has the following virtues. (They might seem obvious but it is surprising how often they are forgotten!) ● It captures all the features of the world that are relevant to the decision and leaves out those which are not ● Its purpose is to relate the available data to the decision ● It only uses statistical theory when the benefits outweigh the costs ● It incorporates common sense assumptions ● It incorporates uncertainty ● Its inadequacies are understood and communicated to the decision maker 8
  • 9. Some wisdom to keep in the back of your head There is a quote attributed to John Tukey (himself a founding figure in statistics) “An approximate answer to the right problem is worth a good deal more “than an exact answer to an approximate problem.” And another very popular but always true (almost by definition) quote by George Box “All models are wrong but some are useful” 9
  • 10. Now for a real decision and some data The decision: The CMO has to decide on next year's marketing budget. She would like to how much she should spend in total on product P. The available data are: ● A time series of weekly sales for product P going back five years ● A time series of weekly marketing spend for product P going back five years ● Annual sales figures for P and its three main competitors going back five years ● Annual marketing spend for P and its three main competitors going back five years ● Some research showing the demographic profile of buyers of product P and the amount of switching there is in the category 10
  • 11. What they never mention in the text books! The work needs to be done in a day and there is only one person who can work on it. (Note the time and resource constraints have a huge impact on the choice of approach) 11
  • 12. The paranoid statistician’s checklist ● Is it representative? ● How well does it cover all the possibilities? ● Is it accurate? ● Are there missing values? 12
  • 13. Always start by looking at the data 13
  • 14. The next move: add as much info as you can Where can you find this information? 1. Common sense 2. Questions to the decision maker (or anyone else who understands the domain) 3. Logical constraints 14
  • 15. And list all your common sense assumptions (nothing is too obvious) 1. If you don't spend anything then there will be no uplift due to marketing spend! 2. There's a threshold below which any spend will be effective. Obviously if I spend only £10 nothing is going to happen (unless it's bribing a single customer!) 3. There's an eventual limit to what marketing spend can do (it can't generate more sales than there are people who can buy the product) 4. It's likely that marketing spend will be most effective on those who are least loyal to a competitor brand 5. For business/political reasons there's a minimum and a maximum possible budget available 6. The effectiveness of marketing spend will be constrained by the reach of our marketing channels 7. The effectiveness of marketing spend will be determined by competitor spend 8. There will be a default position which the decision maker resorts to in the absence of any information from you (e.g. spend the same as last year) 9. There's a whole load of other factors (creative, choice of channels, overall strategy) that will affect the impact of the marketing spend 15
  • 16. You can tame a problem by picking the right function. We have good reasons for picking this one
  • 17. The problem is reduced to finding values for the parameters Some barmat calculations for L: 11.5 million men who would buy the product; product lasts 2 weeks; cost £1; max annual sales 26 x 11.5 = 300 million; sales of all four brands are 290 million, so 10 million headroom; 90% are loyal buyers, 10% switch regularly; P has 50% of the market and so has 5% of the 10%, but another 5% is available; 0.05 x 290 + 0.62 x 10 = 21 million; only 15% reachable by media; 21 x 0.15 = 3 million. Does this seem very very rough? Yes. But we are taking note of that. Later we will look at how sensitive our results are to these assumptions.
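The rough arithmetic for L on this slide can be written out as a few lines of Python. This is only a sketch that reproduces the slide's figures; the variable names are illustrative, and the 0.62 share of headroom is taken as given from the slide rather than derived.

```python
# Rough estimate of L, the ceiling on annual sales that marketing can influence.
# All figures are copied from the slide; the variable names are illustrative only.

potential_buyers_m = 11.5      # million men who would buy the product
purchases_per_year = 26        # product lasts 2 weeks, so 52 / 2 purchases a year

# 26 x 11.5 = 299, which the slide rounds to 300 million
max_annual_sales_m = purchases_per_year * potential_buyers_m

category_sales_m = 290         # current sales of all four brands (million)
headroom_m = 10                # the slide's rounded gap between 300 and 290

switcher_share = 0.05          # switchers not yet buying P, as a share of category sales
headroom_share = 0.62          # share of headroom used on the slide (taken as given)

winnable_m = switcher_share * category_sales_m + headroom_share * headroom_m
reachable_share = 0.15         # only 15% reachable by media

L_m = winnable_m * reachable_share

print(f"Max annual sales: {max_annual_sales_m:.0f} million")
print(f"Winnable sales:   {winnable_m:.1f} million")
print(f"L (reachable):    {L_m:.1f} million")
```

Keeping each assumption as a named value makes it easy to change one input and see how far L moves.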
  • 18. The data should help us here but … an impasse: we don’t have the uplifts. Call in the econometricians for a 3-month project? Are we really stuck though?
  • 19. The solution is common sense and some nice tricks!
  • 20. Yes it’s rough but it does the job: we can make decisions
  • 21. And now the important thing is understanding how it is wrong and what that means! 1. Competitors not dealt with 2. Conditional on assumptions 3. Confounding factors 4. Scale of precision 5. Not a statistical model Nevertheless….
  • 22. Another example using the logistic curve A web start-up has just launched its new product. Customers pay per day to use the product, so the number of customers can drop as well as rise over time. However, word does seem to be spreading, as the daily number of customers appears to be climbing. They want to know two things: 1. When should they spend their marketing budget? 2. For financial planning purposes they would like to know when the adoption curve will start to level out. They have done their own market sizing work and they estimate that this will happen at about 4000 customers a day. At their most pessimistic they put it at 3000 and at the most optimistic they say 5000.
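A minimal sketch of how the logistic curve could be applied to the start-up's second question. The ceilings of 3000, 4000 and 5000 customers a day come from the slide; the launch level `N0`, the growth rate `r`, and the choice of "95% of the ceiling" as the meaning of "levelling out" are hypothetical values chosen only for illustration.

```python
import math

def logistic_customers(t, K, N0, r):
    """Daily customers on day t under logistic growth toward the ceiling K."""
    return K / (1.0 + ((K - N0) / N0) * math.exp(-r * t))

def days_to_fraction(f, K, N0, r):
    """Day on which the logistic curve reaches fraction f of the ceiling K."""
    return math.log((f / (1.0 - f)) * (K - N0) / N0) / r

N0 = 100    # assumed daily customers at launch (hypothetical)
r = 0.05    # assumed daily growth rate (hypothetical)

# Pessimistic, central and optimistic ceilings from the slide
for K in (3000, 4000, 5000):
    t95 = days_to_fraction(0.95, K, N0, r)
    print(f"K = {K}: reaches 95% of ceiling around day {t95:.0f}")
```

With a real series of daily customer counts, `N0` and `r` would be estimated from the data rather than assumed.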
  • 23. We can use the simulation to understand the impact of feedback loops
  • 24. And we can use Monte Carlo simulation to explore the impact of uncertainty A wide concept, but in our case we are talking about using computer-simulated random sampling to model the effect of uncertainty in the inputs to a system on the outputs of that system. 1. Define inputs 2. Generate inputs from a probability distribution 3. Perform computation on inputs 4. Aggregate results
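The four steps above can be sketched in Python using only the standard library. As an illustration, the uncertain inputs are the ceiling of a logistic adoption curve (3000 to 5000 customers a day, most likely 4000, as in the start-up example) and its growth rate. The triangular and uniform distributions, the growth-rate range, the launch level of 100 customers, and the 95% threshold are all assumptions made for this sketch, not figures from the talk.

```python
import math
import random
import statistics

def days_to_fraction(f, K, N0, r):
    """Day on which a logistic curve starting at N0 reaches fraction f of ceiling K."""
    return math.log((f / (1.0 - f)) * (K - N0) / N0) / r

def simulate(n_draws=10_000, seed=42):
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    results = []
    for _ in range(n_draws):
        # 1. and 2. Define the inputs and draw them from probability distributions
        K = rng.triangular(3000, 5000, 4000)   # low, high, mode
        r = rng.uniform(0.03, 0.07)            # assumed range for the daily growth rate
        # 3. Perform the computation on the sampled inputs
        results.append(days_to_fraction(0.95, K, N0=100, r=r))
    return results

results = simulate()

# 4. Aggregate the results
cuts = statistics.quantiles(results, n=20)     # 5%, 10%, ..., 95% cut points
print(f"5th percentile:  day {cuts[0]:.0f}")
print(f"Median:          day {statistics.median(results):.0f}")
print(f"95th percentile: day {cuts[-1]:.0f}")
```

The spread between the 5th and 95th percentiles shows how much of the uncertainty in the inputs carries through to the answer.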
  • 25. Finally we might be interested in what the data says about our assumptions A Bayesian example: A wet umbrella ● Prior belief = Fairly certain it is not raining ● Data = Man walks into the room with a wet umbrella ● Model = Wet umbrellas highly improbable without rain ● Posterior belief = Shifted to fairly certain it is raining
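The wet umbrella example can be turned into numbers with Bayes' rule. The probabilities below are hypothetical, chosen only to match the verbal descriptions on the slide ("fairly certain it is not raining", "highly improbable without rain").

```python
# Bayes' rule for the wet umbrella example.
# All three probabilities are hypothetical stand-ins for the slide's verbal beliefs.

prior_rain = 0.10            # prior: fairly certain it is not raining
p_wet_given_rain = 0.90      # model: an umbrella carried in from rain is usually wet
p_wet_given_dry = 0.02       # model: wet umbrellas highly improbable without rain

# Total probability of seeing a wet umbrella (the evidence)
p_wet = p_wet_given_rain * prior_rain + p_wet_given_dry * (1 - prior_rain)

# Posterior belief that it is raining, given the wet umbrella
posterior_rain = p_wet_given_rain * prior_rain / p_wet

print(f"Prior P(rain)     = {prior_rain:.2f}")
print(f"Posterior P(rain) = {posterior_rain:.2f}")
```

With these numbers a single observation moves the belief from 10% to roughly 83%, because the model says the data is far more likely under rain than without it.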
  • 26. We can use Bayesian methods to understand how the data might update our beliefs about L
  • 27. A quick recap ● How to build the right model to answer a question and quickly! ● Picking the right function for the job ● Some unexpected ways to use statistical techniques ● Understanding the limitations of your model ● Taking it further ○ Using simulation to understand its dynamics ○ Using Monte Carlo simulation to understand the impact of uncertainty in the inputs ○ Using Bayesian inference to see how the data and the model impact current beliefs
  • 28. Thank you. If you’d like to know more, talk to me at simon@coppelia.io, follow me on Twitter @coppeliamla, or visit my blog www.coppelia.io/blog