Brand Tracking with Bayesian Models and Metaflow
Corrie Bartelheimer
Senior Data Scientist @ Latana
Berlin Bayesian, 10/11/2021
How many people have heard of our brand?
How well known is the brand in target group?
How do people perceive the brand?
Have there been changes?
What makes Brand Tracking difficult?
Survey Problems
Small target groups
Signal or Noise?
Representativeness of respondents
Traditional Approaches
Weighting
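A minimal sketch of the weighting approach, assuming made-up respondent counts and census shares (every number below is illustrative and not taken from the talk). Each cell's observed awareness rate is multiplied by its census share and the products are summed. The last cell mimics the problem case: only two respondents, but the largest weight.

```python
# Illustrative numbers only: four demographic cells (gender x age group).
cells = {
    # cell: (respondents, respondents who know the brand, census share)
    ("female", "18-30"): (40, 20, 0.20),
    ("female", "30-60"): (50, 20, 0.30),
    ("male", "18-30"): (48, 12, 0.15),
    ("male", "30-60"): (2, 1, 0.35),   # only two respondents, largest weight
}

def weighted_estimate(cells):
    """Sum of (cell mean * census share) over all cells."""
    total_share = sum(share for _, _, share in cells.values())
    assert abs(total_share - 1.0) < 1e-9, "census shares must sum to 1"
    return sum(share * known / n for n, known, share in cells.values())

baseline = weighted_estimate(cells)

# Flip a single answer in the two-respondent cell and re-estimate.
flipped = dict(cells)
flipped[("male", "30-60")] = (2, 2, 0.35)
shift = weighted_estimate(flipped) - baseline
```

With these numbers, one changed answer moves the population estimate by 17.5 percentage points, because that cell carries 35% of the weight.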
Quota Sampling
Can take a while...
Introducing: Mr. P
Multilevel Regression & Poststratification
Multilevel Regression
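The model described in the speaker notes can be written out roughly as follows. The Bernoulli likelihood and the grouped intercepts for gender and age come from the talk. The logit link, the global intercept $\alpha$, and the specific prior families and scales are assumptions made here for illustration only.

```latex
\begin{aligned}
y_i &\sim \mathrm{Bernoulli}(p_i) \\
\operatorname{logit}(p_i) &= \alpha + \alpha^{\text{gender}}_{g[i]} + \alpha^{\text{age}}_{a[i]} \\
\alpha^{\text{gender}}_{g} &\sim \mathrm{Normal}(0, \sigma_{\text{gender}}) \\
\alpha^{\text{age}}_{a} &\sim \mathrm{Normal}(0, \sigma_{\text{age}}) \\
\alpha &\sim \mathrm{Normal}(0, 1.5), \qquad
\sigma_{\text{gender}},\ \sigma_{\text{age}} \sim \mathrm{Exponential}(1)
\end{aligned}
```

Here $y_i$ is 1 if respondent $i$ knows the brand, and $g[i]$, $a[i]$ are that respondent's gender and age group. Because the group intercepts share the scales $\sigma_{\text{gender}}$ and $\sigma_{\text{age}}$, groups with little data are pulled towards the other groups.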
Poststratification
[Figure labels: Weight, Prediction from our Model]
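A minimal sketch of poststratification with posterior draws, assuming NumPy is available. The draws are simulated here from Beta distributions purely for illustration; in practice they would come from the fitted multilevel model. The census shares are made-up numbers as well. For every draw, the per-cell predictions P(knows brand | cell) are multiplied by P(cell) and summed over cells, so the result is a full distribution for the population rather than a single number.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws of P(knows brand | cell), one column per cell.
cell_names = ["female", "male"]
posterior_draws = np.column_stack([
    rng.beta(30, 70, size=4000),   # female: centred near 0.30
    rng.beta(5, 5, size=4000),     # male: centred near 0.50, much wider
])

# Assumed census shares P(cell); they must sum to 1.
census_share = np.array([0.51, 0.49])

def poststratify(draws, weights):
    """Weight every posterior draw by the census shares and sum over cells."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("census shares must sum to 1")
    return draws @ weights          # shape: (n_draws,)

population_draws = poststratify(posterior_draws, census_share)
estimate = population_draws.mean()
interval = np.percentile(population_draws, [2.5, 97.5])
```

Because the weighting is applied draw by draw, the uncertainty of the wide "male" cell carries through into `interval` instead of being hidden behind a point estimate.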
On scale: Metaflow
One model per answer option
One question per brand, different competitor brands per market
~500 - 1500 models per project
~20min per model
= ~10 days compute time
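The arithmetic behind these figures can be checked with a few lines. The model counts and the 20 minutes per model come from the slide; the number of parallel workers is a hypothetical value chosen only to show the effect of parallelization.

```python
def sequential_days(n_models, minutes_per_model=20):
    """Total compute time in days if models run one after another."""
    return n_models * minutes_per_model / (60 * 24)

def wall_clock_hours(n_models, n_parallel, minutes_per_model=20):
    """Wall-clock hours if up to n_parallel models run at the same time."""
    batches = -(-n_models // n_parallel)   # ceiling division
    return batches * minutes_per_model / 60

# Range quoted on the slide: roughly 500 to 1500 models per project.
low, high = sequential_days(500), sequential_days(1500)

# Hypothetical: 1000 models with 100 running in parallel.
parallel_hours = wall_clock_hours(1000, 100)
```

Run sequentially, 500 to 1500 models take roughly 7 to 21 days, which is the same order of magnitude as the ~10 days on the slide (10 days corresponds to 720 models at 20 minutes each).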
Integrates with AWS Batch
Easy to use for Data Scientists
Supports reproducibility
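MRP as Metaflow
from metaflow import FlowSpec, step

# load_data, run_mrp and save are project helpers that are not shown on the slide.
class MRPFlow(FlowSpec):

    @step
    def start(self):
        self.data, self.questions = load_data()
        # fan out: one run_model task per entry in self.questions
        self.next(self.run_model, foreach="questions")

    @step
    def run_model(self):
        question = self.input
        self.result = run_mrp(question, self.data)
        self.next(self.join)

    @step
    def join(self, inputs):
        # inputs holds one task per question; each carries its own result
        for inp in inputs:
            save(inp.result)
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    MRPFlow()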
Parallelizing models
    @step
    def start(self):
        self.data, self.questions = load_data()
        self.next(self.run_model, foreach="questions")

    @step
    def run_model(self):
        question = self.input
        self.result = run_mrp(question, self.data)
        self.next(self.join)
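To make the fan-out and join pattern concrete, here is an analogy in plain Python using only the standard library. It is not how Metaflow works internally (Metaflow launches separate tasks, for example on AWS Batch); it only mirrors the shape of the flow. `load_data` and `run_mrp` are hypothetical stand-ins with toy data.

```python
from concurrent.futures import ThreadPoolExecutor

def load_data():
    """Hypothetical stand-in: returns shared data and the list of questions."""
    data = {"awareness": [1, 0, 1, 1], "consideration": [0, 0, 1, 0]}
    return data, list(data)

def run_mrp(question, data):
    """Hypothetical stand-in for fitting one model: returns the answer mean."""
    answers = data[question]
    return question, sum(answers) / len(answers)

def run_all(max_workers=4):
    data, questions = load_data()                      # "start" step
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # fan-out: one task per question, like foreach="questions"
        futures = [pool.submit(run_mrp, q, data) for q in questions]
        # join: collect every branch's result before continuing
        return dict(f.result() for f in futures)

results = run_all()
```

Each question is processed independently, and the join only proceeds once every branch has finished, which is the same contract the `join` step relies on.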
Increasing resources
    # requires: from metaflow import resources
    @step
    def start(self):
        self.data, self.questions = load_data()
        self.next(self.run_model, foreach="questions")

    @resources(cpu=8, memory=32000)
    @step
    def run_model(self):
        question = self.input
        self.result = run_mrp(question, self.data)
        self.next(self.join)
Mr.P on AWS
Challenges
Convergence
How to monitor convergence of 1000+ models?
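The talk leaves this question open. One possible approach, sketched here as an assumption and not as the method used at Latana, is to compute a convergence diagnostic per model automatically and only inspect the flagged ones by hand. The example computes the classic Gelman-Rubin statistic with NumPy; libraries such as Stan report refined variants of it. The threshold of 1.01 is a commonly used rule of thumb, and the model and parameter names are hypothetical.

```python
import numpy as np

def rhat(chains):
    """Classic Gelman-Rubin statistic for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_chains, n_draws = chains.shape
    within = chains.var(axis=1, ddof=1).mean()           # W
    between = n_draws * chains.mean(axis=1).var(ddof=1)  # B
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return float(np.sqrt(pooled / within))

def flag_unconverged(models, threshold=1.01):
    """Return the names of models whose worst parameter exceeds the threshold.

    `models` maps a model name to {parameter name: chains array}.
    """
    return sorted(
        name for name, params in models.items()
        if max(rhat(c) for c in params.values()) > threshold
    )

# Simulated chains: the second model's chains sit at different levels.
rng = np.random.default_rng(1)
models = {
    "brand_a_awareness": {"alpha": rng.normal(0, 1, size=(4, 1000))},
    "brand_b_awareness": {
        "alpha": rng.normal(0, 1, size=(4, 1000)) + np.arange(4)[:, None]
    },
}
flagged = flag_unconverged(models)
```

In a Metaflow setting, a check like this could run inside the join step so that only the flagged model names need manual review.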
More predictor variables
The full joint distribution of all predictor variables is needed.
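A small sketch of why this becomes hard: poststratification needs a population share for every combination of predictor levels, so the number of cells multiplies with each added variable. The predictors and levels below are hypothetical; `owns_pet` stands in for a brand-specific custom variable for which no census table exists.

```python
from itertools import product
from math import prod

# Hypothetical predictor levels.
predictors = {
    "gender": ["female", "male"],
    "age": ["18-30", "30-60", "60+"],
    "region": ["north", "south", "east", "west"],
}

def poststrat_cells(predictors):
    """Enumerate every cell that needs a population share."""
    names = list(predictors)
    return [dict(zip(names, combo)) for combo in product(*predictors.values())]

cells = poststrat_cells(predictors)
n_before = len(cells)                                   # 2 * 3 * 4 cells

# Adding one custom two-level variable doubles the number of cells.
predictors["owns_pet"] = ["yes", "no"]
n_after = prod(len(levels) for levels in predictors.values())
```

Every one of these cells needs a known population share, and separate marginal totals per variable are not enough to recover them.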
Summary
● Multilevel regression reduces estimation errors by using the grouped structure
● Propagation of uncertainty improves weighting
Corrie Bartelheimer
Senior Data Scientist
corrie.bartelheimer@latana.com
Resources and Links
Introductory book on Bayesian Statistics: https://xcelab.net/rm/statistical-rethinking/
Stan: https://mc-stan.org/
Stan interface brms (R): https://paul-buerkner.github.io/brms/
MRP: Forecasting elections with non-representative polls https://www.sciencedirect.com/science/article/abs/pii/S0169207014000879
Metaflow: https://metaflow.org/
MRP at Latana:
- https://latana.com/whitepapers/mrp-vs-traditional-quota-sampling-brand-tracking/
- https://aws.amazon.com/blogs/startups/brand-tracking-with-bayesian-statistics-and-aws-batch/


Editor's Notes

  1. What is brand tracking? A company/brand is interested in how many people are aware of their brand (“brand awareness”).
  2. Brands are usually interested in the following questions: How is the brand faring with specific target groups (e.g. they might only be interested in women, or only in people living in big cities)? How do people perceive the brand, and what do they associate with it (is it fun, intuitive, etc.)? And, of course, have there been changes over time? The last question matters in particular when a brand wants to know whether a marketing campaign had an impact.
  3. It’s easy to get a bunch of people to respond to an online survey, but any target group (e.g. women with kids) will most likely be small. This can make it hard or even impossible to say whether a change reflects a real shift in how consumers see the brand or just a random change in the composition of respondents. Also, online surveys are usually not very representative of the general population.
  4. Traditional approaches (and their shortcomings)
  5. One approach is to collect respondent data and weight the different groups afterwards. Let’s say we’re interested in both gender and age.
  6. We can then use census data to determine weights for each demographic cell. For each cell, we estimate the mean from the respondent data, multiply it by the weight, and sum over all cells to get an estimate for the general population.
  7. Imagine, though, that we get data like this. Now the estimate based on only two respondents has the biggest weight and thus gets amplified. This means that small changes in this group can lead to big changes in the overall outcome. If we’re not interested in the overall estimate for the general population but in one of the subgroups, then (especially for the small groups) the errors become so large that the result is basically unusable.
  8. Quota sampling goes the other way around and fixes in advance how many responses should be collected from each group.
  9. The problem with quota sampling, however, is that it can take a while to fill all cells with the number of responses needed. This also makes it more expensive. (It is also possible to stop collecting data early and then weight the results.)
  10. Paper: see e.g. Forecasting elections with non-representative polls https://www.sciencedirect.com/science/article/abs/pii/S0169207014000879
  11. First part: Multilevel Regression
  12. The “Bayesian part” of the method is a hierarchical model. Our variable of interest, “knows brand”, is modelled with a Bernoulli likelihood.
  13. In our example from the previous slides, our predictor variables are gender and age.
  14. We model gender and age as random (hierarchical) intercepts.
  15. Since it’s a Bayesian model, we of course also use priors.
  16. Some intuition: for the gender parameter, this means that we estimate two parameters: male and female.
  17. But since we group the two parameters, both come from a common distribution. This means that we allow the two parameters to differ, but we also expect them to be similar. If there is little data for one group, the model will pull its estimate towards the estimate for all related groups.
  18. In our example this means that any demographic cell borrows strength from neighboring cells (similar groups).
  19. The estimate for ♂, 30-60 is then the sum of the parameter for ♂ and the parameter for 30-60. Both parameters also take into account information from the other groups. Thus, each cell is not considered in isolation; we also use information from neighboring cells (similar groups).
  20. Second part: Poststratification (weighting)
  21. The poststratification part is basically the weighting method.
  22. To get an estimate for the general population, we compute the prediction from our model for the different cells (= proportion of men/women that know the brand) and weight it by the proportion of men/women in the general population.
  23. Mathematically, the weight corresponds to the probability of gender and the prediction to the probability of knows brand conditioned on gender. Multiplying them gives the joint distribution pr(knows_brand, gender) according to the chain rule. Adding up over gender is equivalent to marginalizing out gender.
  24. The difference from the approach outlined before is that the predictions are samples from our posterior instead of point estimates. Using a Bayesian model thus allows us to propagate the uncertainty.
  25. Back to our use case: on top of questions asking “do you know this brand?”, we also have questions like these.
  26. These questions are binarized, resulting in one model per answer option (here 6).
  27. The same question is also asked for competitor brands, and there are different competitors for different markets. This means we easily end up with around 1000 models even though just a handful of questions were asked.
  28. Each model takes around 20 min, so in total we have about 10 days of computing time. So many models. So little time.
  29. Metaflow: a library developed by Netflix (https://metaflow.org/). It integrates nicely with AWS Batch, is easy to use even for data scientists who don’t have a strong background in cloud computing, and also supports reproducibility.
  30. This is what MRP could look like in Metaflow.
  31. It’s a DAG: each step represents a job launched on Batch. The model step is parallelized, so one job per model is launched.
  32. The heavy lifting here is done by the “foreach” keyword. If questions is a list, then with foreach Metaflow will start one run_model job for each question. All the orchestration is taken care of by Metaflow.
  33. To increase resources, we can add just one line.
  34. Some remaining challenges: How do we monitor the convergence of more than 1000 models? How do we add more predictor variables, especially custom variables specific to one brand for which no census data is available?