Big data Psychology
Vishal Singh
NYU-Stern
History of Data Collection
Timeline (figure labels): Astronomy/Census; Database Marketers; Advent of Retail Scanner; WWW; Mobile/GPS; RFID
Psychological Insights from Mundane Choices
• Psychological research is confined to small experiments, primarily on students (98% of all research is with WEIRD subjects: Western, Educated, Industrialized, Rich, Democratic!)
• Thesis: Seemingly innocuous information, such as aggregated measures of internet search or mundane choices of grocery products, can reveal aspects of our deep-rooted ideologies, values, and personality traits
Secondary objective: Automated and fully replicable empirical workflow
Automated Analytics Workflow
Dynamic Reproducible Documents
o Data download/munging is part of the document
o Documents are dynamic: models, graphics, analysis, and write-up are updated as new data flows in
o Documents are interactive and 100% reproducible by co-workers (and future self)
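A minimal sketch of the "dynamic document" idea, assuming a toy in-memory data source: the munging step and the write-up live in the same script, so re-running it on new data regenerates the whole report. The function names (`munge`, `render_report`) and the sample rows are hypothetical, not taken from the author's workflow.

```python
from statistics import mean

def munge(raw_rows):
    """Drop rows with a missing rating and coerce the rest to float."""
    return [
        {"product": r["product"], "rating": float(r["rating"])}
        for r in raw_rows
        if r.get("rating") not in (None, "")
    ]

def render_report(raw_rows):
    """Regenerate the full write-up from raw data in a single call."""
    rows = munge(raw_rows)
    avg = mean(r["rating"] for r in rows)
    return f"Reviews analyzed: {len(rows)}\nAverage rating: {avg:.2f}"

# Hypothetical raw data; in practice this would be downloaded inside the document.
raw = [
    {"product": "A", "rating": "5"},
    {"product": "B", "rating": ""},   # missing rating, dropped by munge()
    {"product": "C", "rating": "3"},
]
report_v1 = render_report(raw)

# New data arrives: the same call rebuilds the numbers and the text.
raw.append({"product": "D", "rating": "4"})
report_v2 = render_report(raw)
```

Because nothing is edited by hand between the raw data and the final text, a co-worker (or future self) can reproduce either version by re-running the script.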
Nature of Modern Data
The x-V's of Data
o Volume
o Velocity
o Variety (cognitive challenge lies here)
o Integrating and harmonizing data from a variety of sources and formats (numeric, text, image, video, social media)
Major progress (AWS/Google Cloud)
Era of Open Source: Machine Learning / AI Algorithms
Implication: Use of analytics/machine learning is simply good business practice (almost a necessity) rather than a differentiator.
Example: “Variety” in Data
Online Reviews: Empirical Generalizations
Joint work with Poppy Zhang (PhD student, NYU) & Karsten Hansen (UCSD)
Scope of the Data
(Clean data files & R code @ onreviews.org)
Amazon.com: Entire database, all products (1998-2014)
IMDB: All movies (1999-2015)
Vacation Rentals (all of Airbnb & HomeAway)
Glassdoor (Employee ratings of firms)
Yelp (Selected categories & geographies)
Expedia
Volume of Data
Analytics for what?
Primary focus is understanding and insights. Tools: visualization, econometric models, interpretable machine learning
Primary focus is deployment, e.g., classification of spam, banner ad targeting
Example:
What makes a review helpful?
Are there systematic Gender differences?
Approach 1: Classification exercise. Take labeled data and run a CNN/RNN on the review text with helpfulness labels. Get over 90% accuracy.
Making Data Usable
Broad Categorization of Variables
• Review Attributes
• Star ratings
• Timing/sequence
• Helpfulness (judged by others)
• Language use
1. Review length
2. Valence (positive vs. negative)
3. Readability (words/sentence)
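The three language-use variables can be sketched in a few lines. This is an illustration only: the tokenizer is a simple regular expression, and the two word lists are toy placeholders (a real study would use a validated sentiment lexicon or a trained model). The function name `language_features` is hypothetical.

```python
import re

# Toy word lists, purely illustrative (assumption, not the author's lexicon).
POSITIVE = {"great", "love", "excellent", "good"}
NEGATIVE = {"bad", "poor", "terrible", "broken"}

def language_features(text):
    """Return review length, words per sentence, and a crude valence score."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return {
        "length_words": len(words),                       # 1. review length
        "valence": pos - neg,                             # 2. positive minus negative hits
        "words_per_sentence": (len(words) / len(sentences)
                               if sentences else 0.0),    # 3. readability proxy
    }

features = language_features(
    "Great camera. The battery is poor, but I love the lens!"
)
```

For this sample review the sketch counts 11 words in 2 sentences (5.5 words per sentence) and a valence of +1 (two positive hits, one negative).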
Extract Features of Customers/Products
Reviewer Attributes
• Heavy vs. occasional reviewer
• Purchase information (sometimes)
• Geography (sometimes)
• Gender (proxies)
Product Attributes
• Hedonic/experiential/durable
• Average rating
• Within category (e.g., Action vs. Comedy)
• Sales rank
• Popularity (accumulated number of reviews)
• Price
Once the data is harmonized, analytics is simplified drastically.
Example: Quantifying Image
Clarifai
Helpfulness of Review
What makes a review helpful?
Which type of review is most helpful?
1-Star, 3-Star, 5-Star
Psychology Literature
“There is a general bias, based on both innate predispositions and experience, in animals and humans, to give greater weight to negative entities (e.g., events, objects, personal traits)” Rozin & Royzman (2001)
• Negative assessments are perceived as more diagnostic, particularly when the assessment is well-reasoned and elaborated at some length
5-Star Reviews Most Helpful
Price and Helpful% : Electronic Products
IMDB
• Reviews offer quick inference after a movie's release
• On average, each movie gets 147 reviews
• On average, each review gets 6 helpful votes out of 11 total votes
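Six helpful votes out of 11 is a helpful share of about 55%. A minimal sketch of how such a share could be computed per star rating follows; the records are made up for illustration and the function name `helpful_share_by_stars` is hypothetical. Pooling votes within each rating is one possible choice, and averaging per-review shares is another.

```python
from collections import defaultdict

def helpful_share_by_stars(reviews):
    """Pool helpful and total votes within each star rating."""
    helpful = defaultdict(int)
    total = defaultdict(int)
    for stars, helpful_votes, total_votes in reviews:
        helpful[stars] += helpful_votes
        total[stars] += total_votes
    # Skip ratings that received no votes at all to avoid dividing by zero.
    return {s: helpful[s] / total[s] for s in total if total[s] > 0}

# Made-up records: (star rating, helpful votes, total votes)
toy_reviews = [(1, 2, 8), (1, 1, 2), (3, 3, 6), (5, 6, 11), (5, 4, 5)]
shares = helpful_share_by_stars(toy_reviews)
```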
Sample review for Avatar
Example 2:
Transaction Data
Context
• ACNielsen's Homescan Consumer Panel
• Detailed purchase histories (2004-2015)
• Panelists use hand-held scanners to record every bar-coded item purchased
• Detailed demographic information
• Additional demographics supplemented using location information (e.g., religion, conservatism)
Store Level Data
(35K+ Stores, 2006-2015, All Categories)
Example 1:
Habitual Buying Behavior
(with Karsten Hansen)
Context
o Thought Experiment:
• Suppose you recorded your shopping history for every cereal, toothpaste, detergent, etc. for the past 3, 5, or 10 years
• What can we learn from this information?
• Questions to ask?
• Example: Habits vs. variety seeking
• What would your product portfolio look like?
Positive=>Higher Concentration
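The slides do not say how purchase concentration is measured. One common choice, used here purely as an assumption, is the Herfindahl index: the sum of squared brand shares within a category, which equals 1.0 when a household always buys the same brand and approaches 0 as purchases spread over many brands. The brand labels and purchase histories below are hypothetical.

```python
from collections import Counter

def brand_concentration(purchases):
    """Herfindahl index of brand shares for one household and category."""
    if not purchases:
        raise ValueError("need at least one purchase")
    counts = Counter(purchases)
    n = sum(counts.values())
    return sum((c / n) ** 2 for c in counts.values())

# Hypothetical purchase histories in one category (brand labels only).
habitual = ["A"] * 8 + ["B"] * 2                 # mostly one brand
variety_seeker = ["A", "B", "C", "D", "E"] * 2   # evenly spread over five brands

hhi_habitual = brand_concentration(habitual)       # 0.8**2 + 0.2**2
hhi_variety = brand_concentration(variety_seeker)  # 5 * 0.2**2
```

Under this measure the habitual buyer scores 0.68 and the variety seeker 0.20, so a positive association with a household trait means more habitual, concentrated buying.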
Conservativeness (as measured by Voting & Religiosity) associated with:
• Preference for established brands
• Lower propensity to try new products
• Higher brand loyalty (repetitive buying)
Breaking Habits
Will a Fat Tax Work?
Small price differences, when reflected in shelf prices at the point of purchase, have a significant and long-term impact on food choices.
Previous Evidence
o Field Work
• Econometric/data problems
• Focus on sales tax
• Industry funded
o Experimental Work
• Lab/cafeteria/vending machines
• Small non-representative samples
This Paper: Quasi Natural Experiment
Milk Pricing in the US
[Figure: shelf price by milk type under the two pricing structures]
                     Whole milk   2% milk   1% milk   Skim milk
Uniform Price        $2.91        $2.91     $2.91     $2.90
Non-Uniform Price    $2.87        $2.73     $2.71     $2.60
Depending on where you live and what supermarket chain you patronize, you see one of these patterns.
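A small sketch of how a store's pricing pattern could be labeled from its shelf prices. The prices are the data labels read off the chart on this slide, assuming they map to whole, 2%, 1%, and skim in that order. The 1% tolerance and the function names are hypothetical choices for illustration, not the paper's classification rule.

```python
def whole_to_two_percent_ratio(prices):
    """Price of whole milk relative to 2% milk."""
    return prices["whole"] / prices["2%"]

def pricing_structure(prices, tolerance=0.01):
    """Label prices 'flat' if the spread across milk types is within tolerance."""
    lowest, highest = min(prices.values()), max(prices.values())
    return "flat" if (highest - lowest) / highest <= tolerance else "non-flat"

# Chart values (assumed order: whole, 2%, 1%, skim).
uniform = {"whole": 2.91, "2%": 2.91, "1%": 2.91, "skim": 2.90}
non_uniform = {"whole": 2.87, "2%": 2.73, "1%": 2.71, "skim": 2.60}

label_uniform = pricing_structure(uniform)
label_non_uniform = pricing_structure(non_uniform)
ratio_non_uniform = whole_to_two_percent_ratio(non_uniform)
```

With these numbers the uniform pattern is labeled flat, the non-uniform pattern non-flat, and whole milk costs roughly 5% more than 2% milk in the non-uniform case.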
Milk Pricing in the US
Vishal Singh, Stern School of Business, NYU
[Map: pricing structure by state. Legend: Non-Flat Pricing, Primarily Non-Flat, Mixed, Primarily Flat, Flat Pricing, No Data Available]
Map annotations (FMMO = Federal Milk Marketing Order):
• Southeast FMMO
• Central FMMO
• Northeast FMMO
• Mideast FMMO
• Upper Midwest FMMO: Wisconsin is the 2nd largest producer
• Pennsylvania: Large milk producer. State regulations.
Uniform/non-uniform price structure is consistent across stores within a chain, even in mixed states.
DATA
• 1,800+ supermarkets
• 6 years of weekly data
• UPC-level sales, price, promotion, etc.
• Counties represent approximately 50% of the population
(a) Comparison of Demographic Profile between Flat and Non-Flat Stores
                     Flat stores         Non-Flat stores
                     Mean    Std Dev     Mean    Std Dev     p-value
Low income           18%     38%         21%     41%         0.08
High income          19%     39%         20%     40%         0.60
% Poverty            2%      1%          2%      1%          0.22
% Children           4%      1%          4%      1%          0.62
% College            39%     49%         41%     49%         0.58
% White              78%     19%         77%     19%         0.49
% Elderly            12%     4%          12%     5%          0.32
Population density   0.12    0.31        0.13    0.18        0.52
(b) (1) Regression of (Price Whole / Price 2%) Milk and (2) Variance Decomposition
                                     (1)                     (2)
                                     Estimate   Std Error    % of explained variation accounted for by
Intercept                            1.0393     (0.006)
Median Income                        -0.0017    (0.002)      0.06%
% HH Kids                            -0.0003    (0.001)      0.00%
% College                            -0.0005    (0.002)      0.01%
% White                              -0.0014    (0.001)      0.09%
Population Density                   -0.0003    (0.001)      0.00%
Wage                                 0.0028     (0.002)      0.14%
All retailers within 5 miles         -0.0002    (0.001)      0.00%
Discount retailers within 10 miles   -0.0021    (0.001)      0.18%
Marketing Order Fixed Effects        Included                15.44%
Chain Fixed Effects                  Included                84.07%
R square                             0.658
Is the Pricing Structure Exogenous?
Does it Change Behavior?
Large Response to Small Price Changes
3. Automation/Deployment
Automating Scientific Reporting
Example
Workflow for the Consumer Packaged Goods Industry
American Politics
Final Thoughts
• Trends
o Data proliferation & rapid advancement in scalable algorithms
o Era of open source: Standardization of analytical methods & algorithms
o Provided as a service by cloud hosting providers
• Key:
• Intuition & critical thinking at every stage rather than a “Ctrl-C Ctrl-V” approach
My Work: Streamlining this analytical workflow with Dynamic Reproducible Documents
Data Intelligence Analytics Deployment
