@josephreisinger
@premisedata
WHAT PREMISE MEASURES
Bringing visibility to the world’s hardest-to-see places. 130 cities, 30 countries.
Modernizing
Economic
Measurement
“I have been constantly surprised at how little quantitative
information can be brought to bear on fundamental policy
questions [...] This experience illustrates the need for
flexibility in data collection, especially when policymakers
consider extending new policies or need to evaluate them in
real time for other reasons. Ideally, some sort of ‘rapid
response’ data gathering capacity.”
— Alan Krueger, “Stress Testing Economic Data”
“The collection of statistics needs to be modernized;
it is time to use the new technologies to start
collecting data.
…particularly important in developing countries
where the prevalence of mobile phones now offers
an unprecedented opportunity to measure the
economy.”
— Diane Coyle, “GDP”
OMGWTFGDP
“However, at this moment in survey research,
uncertainty reigns. Participation rates in
household surveys are declining throughout
the developed world. Surveys seeking high
response rates are experiencing crippling cost
inflation. Traditional sampling frames that
have been serviceable for decades are fraying
at the edges.”
— Robert Groves, “Three Eras of Survey
Research”
Orchestrating
Collective
Intelligence
PREMISE APP
Directed on-
the-ground
data acquisition
Crowdsourcing vs Orchestration
Crowdsourcing
[Figure: survey, tasks, workers]
Orchestration
[Figure: survey, tasks, workers]
survey campaign
allocation quality control
analytics
end-user
data contributor
User poses a question that is best
answered via actual, on-the-ground
observation at scale.
Question is translated into
an internal “specification” of
the data points needed to
answer the question: type,
location, frequency,
coverage, etc.
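A minimal sketch of what such a "specification" might look like, assuming it can be expanded into an explicit inventory of data points. The class, its field names, and the `price_of_rice` measurable are hypothetical; only the four attributes (type, location, frequency, coverage) come from the slide.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Specification:
    """Hypothetical spec; attribute meanings follow the slide, structure is assumed."""
    measurable_type: str         # "type": what is observed
    locations: tuple             # "location": where to observe
    periods_per_week: int        # "frequency": visits per location per week
    observations_per_visit: int  # "coverage": data points wanted per visit


def expand(spec, weeks=1):
    """Enumerate the inventory of data points implied by a specification."""
    slots = range(weeks * spec.periods_per_week)
    reps = range(spec.observations_per_visit)
    return [
        {"type": spec.measurable_type, "location": loc, "period": p, "replicate": r}
        for loc, p, r in product(spec.locations, slots, reps)
    ]


spec = Specification("price_of_rice", ("Caracas", "Maracaibo"), 2, 3)
inventory = expand(spec)
print(len(inventory))  # 2 locations * 2 periods * 3 replicates
```

Each dictionary in `inventory` is one data point that can later be allocated and priced.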
Inventory of data points
automatically allocated to
data contributor pool, taking
into account budget, agent
profiles and geography.
Data points are dynamically
priced.
Contributors collect data in the
field using Android phones…
… which are sent back to the
Premise network.
QC is a mix of automated
(outlier detection; machine
learning; computer vision)
and manual (directed
sampling using oDesk)
checks.
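One way the automated outlier check could work, sketched under assumptions: the slide names outlier detection but not a method, so the median-absolute-deviation rule, the 3.5 threshold, and the sample prices below are illustrative choices, not Premise's.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices whose modified z-score (based on the MAD) exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


prices = [10.0, 10.5, 9.8, 10.2, 55.0, 10.1]  # made-up observations
suspect = flag_outliers(prices)
print(suspect)
```

Flagged indices would then be candidates for the manual checks rather than being dropped automatically.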
Automated capabilities to
explore data and expose
trends or patterns;
hypothesize new features to
explain variation; suggest
specification refinement;
improve automated
verification.
end user
data
contributor
PLATFORM
Resource Scarcity
and
Access Risk
Average wait times are about 10 min longer in
Maracaibo than in Caracas.
Police are present ~80% of the time in
Maracaibo, but only 30-40% in Caracas.
Machine Learning
Optimizing Task
Allocation
CAMPAIGN DEFINITION
[Figure: TASKS; axes: locations, measurables; panels: survey period 1, survey period 2]
TASK ALLOCATION
[Figure: axes: locations, measurables; panel: survey period 1; rows: user 1, user 2, user 3; allocation period: 1, 2, 3]
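A rough sketch of allocating tasks to users, assuming a greedy nearest-user rule with a per-user capacity. This is an illustration only: the function name, the coordinates, and the capacity are hypothetical, and the sketch considers geography alone, ignoring budget and agent profiles.

```python
from math import hypot


def allocate(tasks, users, capacity):
    """Greedily give each task to the nearest user with spare capacity.

    tasks and users map ids to (x, y) positions; capacity is tasks per user.
    Returns {task_id: user_id}; tasks stay unassigned once every user is full.
    """
    load = {u: 0 for u in users}
    assignment = {}
    for task_id, (tx, ty) in sorted(tasks.items()):
        free = [u for u in users if load[u] < capacity]
        if not free:
            break
        # Ties on distance are broken by user id to keep the result deterministic.
        nearest = min(
            free,
            key=lambda u: (hypot(users[u][0] - tx, users[u][1] - ty), u),
        )
        assignment[task_id] = nearest
        load[nearest] += 1
    return assignment


tasks = {"t1": (0, 0), "t2": (0, 1), "t3": (5, 5)}
users = {"user 1": (0, 0), "user 2": (5, 4), "user 3": (9, 9)}
plan = allocate(tasks, users, capacity=2)
print(plan)
```

A greedy pass is easy to reason about but can be far from optimal; a real scheduler would more likely solve an assignment problem per allocation period.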
TASK COMPLETION RATE MODEL
payout
pTCR
“uptake risk”
Model features: user-history, task-history /
location-history, task-user, location-user
Issues: data sparsity in marginal vs
conditional, uptake counterfactuals (non-
iid sampling), path-dependence / lock-in
Linear functional model
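To make the payout versus pTCR trade-off concrete, here is a sketch that reads "linear functional model" as a linear score clipped to a probability. That reading, the single `history_rate` feature, and every weight below are assumptions; nothing here is fitted to real data.

```python
def predicted_tcr(payout, history_rate, weights=(0.08, 0.5), bias=0.1):
    """Linear score for task completion probability, clipped to [0, 1]."""
    score = bias + weights[0] * payout + weights[1] * history_rate
    return min(1.0, max(0.0, score))


def min_payout(target, history_rate, step=0.25, cap=20.0):
    """Smallest payout on a grid whose predicted completion rate meets target."""
    payout = 0.0
    while payout <= cap:
        if predicted_tcr(payout, history_rate) >= target:
            return payout
        payout += step
    return None  # target is not reachable under the payout cap


offer = min_payout(target=0.75, history_rate=0.6)
print(offer)
```

Choosing the lowest payout that reaches a target completion rate is one simple way to trade cost against "uptake risk"; it does not address the sparsity or counterfactual issues listed on the slide.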
Exploration
vs
Survey
Consistency
locations
measurables
period 1 period 2
TASK REFINEMENT
ITERATIVE LOCATION DISCOVERY
Exploration vs Survey Consistency
- Campaign layers: separate discovery and survey
- Iteratively refine attribute and geospatial targeting
- Monitor correlation in item responses and
appearance of new attributes
- Monitor residual endogeneity
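The two monitoring bullets can be sketched as small checks, assuming item responses are numeric and attributes are plain strings. The function names and sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of item responses."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(vx * vy)


def new_attributes(previous, current):
    """Attributes seen in the current period but not in the previous one."""
    return sorted(set(current) - set(previous))


r = pearson([1, 2, 3, 4], [2, 4, 6, 8])
added = new_attributes(["price", "brand"], ["price", "brand", "package_size"])
print(r, added)
```

A sudden drop in correlation between related items, or a burst of new attributes, would be a cue to revisit the survey layer of the campaign.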
Fraud and
Coalition
Formation
Coalitions vs Referrals
- Referrals are necessary to reach most remote areas
- However, we need to be able to partition the
Premise graph into independent subnetworks, e.g.
for re-evaluation, experimentation and sample
stratification.
CONTRIBUTOR AFFINITY MODEL
Model features:
direct referral
account features
upload location
visit histories
geographic area
response correlation
Issues: bootstrapping affinity scores for
new users, optimal scheduler is
antagonistic for coalition discovery
Sampling from Large Graphs [Leskovec & Faloutsos; 2006]
weight
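A sketch of partitioning contributors into independent subnetworks, assuming affinity is a weighted sum of matching features and that groups are the connected components of the thresholded affinity graph. The weights, threshold, and record fields are hypothetical stand-ins for the model features listed above.

```python
def affinity(a, b, weights):
    """Sum the weights of features on which two contributor records agree."""
    return sum(
        w for key, w in weights.items()
        if a.get(key) is not None and a.get(key) == b.get(key)
    )


def partition(contributors, weights, threshold):
    """Connected components of the graph linking pairs with affinity >= threshold."""
    ids = sorted(contributors)
    parent = {i: i for i in ids}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if affinity(contributors[a], contributors[b], weights) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


contributors = {
    "a": {"referrer": "r1", "area": "north"},
    "b": {"referrer": "r1", "area": "north"},
    "c": {"referrer": "r2", "area": "north"},
    "d": {"referrer": "r3", "area": "south"},
}
groups = partition(contributors, {"referrer": 0.6, "area": 0.3}, threshold=0.5)
print(groups)
```

The pairwise loop is quadratic, so a large contributor graph would need sampling or blocking before a comparison like this is practical.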
RECAP
- Orchestrating collective
intelligence
- Optimizing task allocation via
dynamic scheduling and incentives
- Exploration and discovery while
maintaining survey consistency
- Fraud and coalition formation in
networks
QUESTIONS?
instagram/premisedata
(all images in this talk)
joe@premise.com | @josephreisinger
[Figure: PROOF, AUTO QC, MANUAL QC, REVALIDATION]
“The problem of changing statistics is that you lose the
ability to compare across time. The longer the time-
series, the harder it is to change it, but you want to be
able to compare. How do you replace GDP? And if you
do, you lose the past sixty years of relevance. This has
been a problem for centuries—take the Spanish silver
trade. Anything you measure will become increasingly
irrelevant over time.”
— Hans Rosling
[Zachary Karabell, The Leading Indicators]
“You need to focus on quality.
You’ll be better off with a
small but carefully structured
sample rather than a large
sloppy sample.”
— Hal Varian, Google
“Big Data is bullshit”
— Harper Reed
Big Data, n.: the belief that
any sufficiently large pile of
shit contains a pony with
probability approaching one
—@grimmelm
“dividing by bieber”
Orchestrating Collective Intelligence
Orchestrating Collective Intelligence

More Related Content

Viewers also liked

Managing Time as a Coach
Managing Time as a CoachManaging Time as a Coach
Managing Time as a Coach
RL Learning
 
Ob1 unit 4 chapter - 16 - conflict management
Ob1   unit 4 chapter - 16 - conflict managementOb1   unit 4 chapter - 16 - conflict management
Ob1 unit 4 chapter - 16 - conflict management
Dr S Gokula Krishnan
 
Biz Jrnl 071810
Biz Jrnl 071810Biz Jrnl 071810
Biz Jrnl 071810Vim Anand
 
Fuel cell presentation p26 31-7-1-2013
Fuel cell presentation p26 31-7-1-2013Fuel cell presentation p26 31-7-1-2013
Fuel cell presentation p26 31-7-1-2013
Pana Mann
 
Ob1 unit 4 chapter - 15 - power and politics
Ob1   unit 4 chapter - 15 - power and politicsOb1   unit 4 chapter - 15 - power and politics
Ob1 unit 4 chapter - 15 - power and politics
Dr S Gokula Krishnan
 
Bob’s training programs
Bob’s training programsBob’s training programs
Bob’s training programsBob Seshadri
 
Substance abuse slides
Substance abuse slidesSubstance abuse slides
Substance abuse slides
MVNPA
 
Untangling Graphs with GPU Clouds
Untangling Graphs with GPU CloudsUntangling Graphs with GPU Clouds
Untangling Graphs with GPU Clouds
Turi, Inc.
 
Screenplay - 'Kay'
Screenplay - 'Kay'Screenplay - 'Kay'
Screenplay - 'Kay'
skywalker97
 
Website Analysis
Website AnalysisWebsite Analysis
Website Analysis
skywalker97
 
NCL coaches technical meeting
NCL coaches technical meetingNCL coaches technical meeting
NCL coaches technical meeting
RL Learning
 
Conference 2014: Rajat Arya - Deployment with GraphLab Create
Conference 2014: Rajat Arya - Deployment with GraphLab Create Conference 2014: Rajat Arya - Deployment with GraphLab Create
Conference 2014: Rajat Arya - Deployment with GraphLab Create
Turi, Inc.
 
Advanced garments printing exam preparation
Advanced garments printing exam preparationAdvanced garments printing exam preparation
Advanced garments printing exam preparation
Azmir Latif Beg
 
7. การใช้งานระบบสารสนเทศในการบริหารสถานศึกษา
7. การใช้งานระบบสารสนเทศในการบริหารสถานศึกษา7. การใช้งานระบบสารสนเทศในการบริหารสถานศึกษา
7. การใช้งานระบบสารสนเทศในการบริหารสถานศึกษา
Nan Wilawan
 
Zanim zostanę Twoim coachem...
Zanim zostanę Twoim coachem...Zanim zostanę Twoim coachem...
Zanim zostanę Twoim coachem...Agnieszka Kaseja
 
KTU- OB-1 - Unit 1 - Chapter - 2
KTU- OB-1 - Unit 1 - Chapter - 2KTU- OB-1 - Unit 1 - Chapter - 2
KTU- OB-1 - Unit 1 - Chapter - 2
Dr S Gokula Krishnan
 

Viewers also liked (17)

Managing Time as a Coach
Managing Time as a CoachManaging Time as a Coach
Managing Time as a Coach
 
David_Bermingham
David_BerminghamDavid_Bermingham
David_Bermingham
 
Ob1 unit 4 chapter - 16 - conflict management
Ob1   unit 4 chapter - 16 - conflict managementOb1   unit 4 chapter - 16 - conflict management
Ob1 unit 4 chapter - 16 - conflict management
 
Biz Jrnl 071810
Biz Jrnl 071810Biz Jrnl 071810
Biz Jrnl 071810
 
Fuel cell presentation p26 31-7-1-2013
Fuel cell presentation p26 31-7-1-2013Fuel cell presentation p26 31-7-1-2013
Fuel cell presentation p26 31-7-1-2013
 
Ob1 unit 4 chapter - 15 - power and politics
Ob1   unit 4 chapter - 15 - power and politicsOb1   unit 4 chapter - 15 - power and politics
Ob1 unit 4 chapter - 15 - power and politics
 
Bob’s training programs
Bob’s training programsBob’s training programs
Bob’s training programs
 
Substance abuse slides
Substance abuse slidesSubstance abuse slides
Substance abuse slides
 
Untangling Graphs with GPU Clouds
Untangling Graphs with GPU CloudsUntangling Graphs with GPU Clouds
Untangling Graphs with GPU Clouds
 
Screenplay - 'Kay'
Screenplay - 'Kay'Screenplay - 'Kay'
Screenplay - 'Kay'
 
Website Analysis
Website AnalysisWebsite Analysis
Website Analysis
 
NCL coaches technical meeting
NCL coaches technical meetingNCL coaches technical meeting
NCL coaches technical meeting
 
Conference 2014: Rajat Arya - Deployment with GraphLab Create
Conference 2014: Rajat Arya - Deployment with GraphLab Create Conference 2014: Rajat Arya - Deployment with GraphLab Create
Conference 2014: Rajat Arya - Deployment with GraphLab Create
 
Advanced garments printing exam preparation
Advanced garments printing exam preparationAdvanced garments printing exam preparation
Advanced garments printing exam preparation
 
7. การใช้งานระบบสารสนเทศในการบริหารสถานศึกษา
7. การใช้งานระบบสารสนเทศในการบริหารสถานศึกษา7. การใช้งานระบบสารสนเทศในการบริหารสถานศึกษา
7. การใช้งานระบบสารสนเทศในการบริหารสถานศึกษา
 
Zanim zostanę Twoim coachem...
Zanim zostanę Twoim coachem...Zanim zostanę Twoim coachem...
Zanim zostanę Twoim coachem...
 
KTU- OB-1 - Unit 1 - Chapter - 2
KTU- OB-1 - Unit 1 - Chapter - 2KTU- OB-1 - Unit 1 - Chapter - 2
KTU- OB-1 - Unit 1 - Chapter - 2
 

Similar to Orchestrating Collective Intelligence

Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Greg Makowski
 
Exploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdfExploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdf
RushikeshKulkarni71
 
Dwdm ppt for the btech student contain basis
Dwdm ppt for the btech student contain basisDwdm ppt for the btech student contain basis
Dwdm ppt for the btech student contain basis
nivatripathy93
 
Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...
Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...
Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...
Laboratorio di Cultura Digitale, labcd.humnet.unipi.it
 
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SDiscovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
IRJET Journal
 
Prognosis - An Approach to Predictive Analytics- Impetus White Paper
Prognosis - An Approach to Predictive Analytics- Impetus White PaperPrognosis - An Approach to Predictive Analytics- Impetus White Paper
Prognosis - An Approach to Predictive Analytics- Impetus White Paper
Impetus Technologies
 
Dwd mdatamining intro-iep
Dwd mdatamining intro-iepDwd mdatamining intro-iep
Dwd mdatamining intro-iep
Ashish Kumar Thakur
 
Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu...
Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu...Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu...
Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu...
Saama
 
KM.doc
KM.docKM.doc
KM.docbutest
 
Survey on Location Based Recommendation System Using POI
Survey on Location Based Recommendation System Using POISurvey on Location Based Recommendation System Using POI
Survey on Location Based Recommendation System Using POI
IRJET Journal
 
Csit65111ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI...
Csit65111ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI...Csit65111ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI...
Csit65111ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI...
cscpconf
 
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
csandit
 
A Big Data Telco Solution by Dr. Laura Wynter
A Big Data Telco Solution by Dr. Laura WynterA Big Data Telco Solution by Dr. Laura Wynter
A Big Data Telco Solution by Dr. Laura Wynter
wkwsci-research
 
Mobile Crowdsensing with Mobile Agents
Mobile Crowdsensing with Mobile AgentsMobile Crowdsensing with Mobile Agents
Mobile Crowdsensing with Mobile Agents
Teemu Leppänen
 
Space Evaders Hacking for Diplomacy week 8
Space Evaders Hacking for Diplomacy week 8Space Evaders Hacking for Diplomacy week 8
Space Evaders Hacking for Diplomacy week 8
Stanford University
 
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
Souma Maiti
 
Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce
Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerceOff-line vs. On-line Evaluation of Recommender Systems in Small E-commerce
Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce
Ladislav Peska
 
Building Predictive Analytics on Big Data Platforms
Building Predictive Analytics on Big Data PlatformsBuilding Predictive Analytics on Big Data Platforms
Building Predictive Analytics on Big Data Platforms
Olha Hrytsay
 
Data mining java titles adrit solutions
Data mining java titles adrit solutionsData mining java titles adrit solutions
Data mining java titles adrit solutions
Adrit Techno Solutions
 
11.challenging issues of spatio temporal data mining
11.challenging issues of spatio temporal data mining11.challenging issues of spatio temporal data mining
11.challenging issues of spatio temporal data mining
Alexander Decker
 

Similar to Orchestrating Collective Intelligence (20)

Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
 
Exploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdfExploratory_Analysis_of_Data_ppt.pdf
Exploratory_Analysis_of_Data_ppt.pdf
 
Dwdm ppt for the btech student contain basis
Dwdm ppt for the btech student contain basisDwdm ppt for the btech student contain basis
Dwdm ppt for the btech student contain basis
 
Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...
Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...
Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...
 
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SDiscovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
 
Prognosis - An Approach to Predictive Analytics- Impetus White Paper
Prognosis - An Approach to Predictive Analytics- Impetus White PaperPrognosis - An Approach to Predictive Analytics- Impetus White Paper
Prognosis - An Approach to Predictive Analytics- Impetus White Paper
 
Dwd mdatamining intro-iep
Dwd mdatamining intro-iepDwd mdatamining intro-iep
Dwd mdatamining intro-iep
 
Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu...
Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu...Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu...
Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu...
 
KM.doc
KM.docKM.doc
KM.doc
 
Survey on Location Based Recommendation System Using POI
Survey on Location Based Recommendation System Using POISurvey on Location Based Recommendation System Using POI
Survey on Location Based Recommendation System Using POI
 
Csit65111ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI...
Csit65111ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI...Csit65111ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI...
Csit65111ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI...
 
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
 
A Big Data Telco Solution by Dr. Laura Wynter
A Big Data Telco Solution by Dr. Laura WynterA Big Data Telco Solution by Dr. Laura Wynter
A Big Data Telco Solution by Dr. Laura Wynter
 
Mobile Crowdsensing with Mobile Agents
Mobile Crowdsensing with Mobile AgentsMobile Crowdsensing with Mobile Agents
Mobile Crowdsensing with Mobile Agents
 
Space Evaders Hacking for Diplomacy week 8
Space Evaders Hacking for Diplomacy week 8Space Evaders Hacking for Diplomacy week 8
Space Evaders Hacking for Diplomacy week 8
 
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
LOAN APPROVAL PRDICTION SYSTEM USING MACHINE LEARNING.
 
Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce
Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerceOff-line vs. On-line Evaluation of Recommender Systems in Small E-commerce
Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce
 
Building Predictive Analytics on Big Data Platforms
Building Predictive Analytics on Big Data PlatformsBuilding Predictive Analytics on Big Data Platforms
Building Predictive Analytics on Big Data Platforms
 
Data mining java titles adrit solutions
Data mining java titles adrit solutionsData mining java titles adrit solutions
Data mining java titles adrit solutions
 
11.challenging issues of spatio temporal data mining
11.challenging issues of spatio temporal data mining11.challenging issues of spatio temporal data mining
11.challenging issues of spatio temporal data mining
 

More from Turi, Inc.

Webinar - Analyzing Video
Webinar - Analyzing VideoWebinar - Analyzing Video
Webinar - Analyzing Video
Turi, Inc.
 
Webinar - Patient Readmission Risk
Webinar - Patient Readmission RiskWebinar - Patient Readmission Risk
Webinar - Patient Readmission Risk
Turi, Inc.
 
Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)
Turi, Inc.
 
Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)
Turi, Inc.
 
Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)
Turi, Inc.
 
Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)
Turi, Inc.
 
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge DatasetsScaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Turi, Inc.
 
Pattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log DataPattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log Data
Turi, Inc.
 
Intelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsIntelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning Toolkits
Turi, Inc.
 
Text Analysis with Machine Learning
Text Analysis with Machine LearningText Analysis with Machine Learning
Text Analysis with Machine Learning
Turi, Inc.
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
Turi, Inc.
 
Machine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive ServicesMachine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive Services
Turi, Inc.
 
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinMachine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Turi, Inc.
 
Scalable data structures for data science
Scalable data structures for data scienceScalable data structures for data science
Scalable data structures for data science
Turi, Inc.
 
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Turi, Inc.
 
Introduction to Recommender Systems
Introduction to Recommender SystemsIntroduction to Recommender Systems
Introduction to Recommender Systems
Turi, Inc.
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in production
Turi, Inc.
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
Turi, Inc.
 
SFrame
SFrameSFrame
SFrame
Turi, Inc.
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with Dato
Turi, Inc.
 

More from Turi, Inc. (20)

Webinar - Analyzing Video
Webinar - Analyzing VideoWebinar - Analyzing Video
Webinar - Analyzing Video
 
Webinar - Patient Readmission Risk
Webinar - Patient Readmission RiskWebinar - Patient Readmission Risk
Webinar - Patient Readmission Risk
 
Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)
 
Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)
 
Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)Webinar - Pattern Mining Log Data - Vega (20160426)
Webinar - Pattern Mining Log Data - Vega (20160426)
 
Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)
 
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge DatasetsScaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
Scaling Up Machine Learning: How to Benchmark GraphLab Create on Huge Datasets
 
Pattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log DataPattern Mining: Extracting Value from Log Data
Pattern Mining: Extracting Value from Log Data
 
Intelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsIntelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning Toolkits
 
Text Analysis with Machine Learning
Text Analysis with Machine LearningText Analysis with Machine Learning
Text Analysis with Machine Learning
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
 
Machine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive ServicesMachine Learning in Production with Dato Predictive Services
Machine Learning in Production with Dato Predictive Services
 
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinMachine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos Guestrin
 
Scalable data structures for data science
Scalable data structures for data scienceScalable data structures for data science
Scalable data structures for data science
 
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
 
Introduction to Recommender Systems
Introduction to Recommender SystemsIntroduction to Recommender Systems
Introduction to Recommender Systems
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in production
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
 
SFrame
SFrameSFrame
SFrame
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with Dato
 


Orchestrating Collective Intelligence

  • 4. Bringing visibility to the world’s hardest-to-see places. 130 cities, 30 countries.
  • 6. “I have been constantly surprised at how little quantitative information can be brought to bear on fundamental policy questions [...] This experience illustrates the need for flexibility in data collection, especially when policymakers consider extending new policies or need to evaluate them in real time for other reasons. Ideally, some sort of ‘rapid response’ data gathering capacity.” — Alan Krueger, “Stress Testing Economic Data”
  • 7. “The collection of statistics needs to be modernized; it is time to use the new technologies to start collecting data. …particularly important in developing countries where the prevalence of mobile phones now offers an unprecedented opportunity to measure the economy.” — Diane Coyle, “GDP”
  • 10. “However, at this moment in survey research, uncertainty reigns. Participation rates in household surveys are declining throughout the developed world. Surveys seeking high response rates are experiencing crippling cost inflation. Traditional sampling frames that have been serviceable for decades are fraying at the edges.” — Robert Groves, “Three Eras of Survey Research”
  • 26. PLATFORM
      Stages: survey, campaign, allocation, quality control, analytics. Actors: end user, data contributor.
      - User poses a question that is best answered via actual, on-the-ground observation at scale.
      - Question is translated into an internal “specification” of the data points needed to answer the question: type, location, frequency, coverage, etc.
      - Inventory of data points is automatically allocated to the data contributor pool, taking into account budget, agent profiles and geography. Data points are dynamically priced.
      - Contributors collect data in the field using Android phones… which are sent back to the Premise network.
      - QC is a mix of automated (outlier detection; machine learning; computer vision) and manual (directed sampling using oDesk) checks.
      - Automated capabilities to explore data and expose trends or patterns; hypothesize new features to explain variation; suggest specification refinement; improve automated verification.
  • 30. Average wait times are ~10 minutes longer in Maracaibo than in Caracas. Police are present ~80% of the time in Maracaibo, but only 30-40% of the time in Caracas.
  • 33. PLATFORM (diagram as on slide 26), highlighted stage: allocation
  • 34. PLATFORM (diagram as on slide 26), highlighted stage: analytics
  • 35. PLATFORM (diagram as on slide 26), highlighted stage: quality control
  • 37. TASKS
  • 47. TASK ALLOCATION (figure labels: locations, measurables, survey period 1; user 1, user 2, user 3; allocation period: 1)
  • 48. TASK ALLOCATION (figure labels: locations, measurables, survey period 1; user 1, user 2, user 3; allocation period: 1, 2)
  • 49. TASK ALLOCATION (figure labels: locations, measurables, survey period 1; user 1, user 2, user 3; allocation period: 1, 2, 3)
  • 50. TASK COMPLETION RATE MODEL
      - Figure labels: payout, pTCR, “uptake risk”, linear functional model
      - Model features: user-history, task-history / location-history, task-user, location-user
      - Issues: data sparsity in marginal vs conditional, uptake counterfactuals (non-iid sampling), path-dependence / lock-in
  • 59. Exploration vs Survey Consistency
      - Campaign layers: separate discovery and survey
      - Iteratively refine attribute and geospatial targeting
      - Monitor correlation in item responses and appearance of new attributes
      - Monitor residual endogeneity
  • 62. Coalitions vs Referrals
      - Referrals are necessary to reach most remote areas
      - However we need to be able to partition the Premise graph into independent subnetworks, e.g. for re-evaluation, experimentation and sample stratification.
  • 63. CONTRIBUTOR AFFINITY MODEL
      - Model features: direct referral, account features, upload location, visit histories, geographic area, response correlation
      - Issues: bootstrapping affinity scores for new users, optimal scheduler is antagonistic for coalition discovery
      - Sampling from Large Graphs [Leskovec & Faloutsos; 2006]
      - Figure label: weight
  • 64. RECAP
      - Orchestrating collective intelligence
      - Optimizing task allocation via dynamic scheduling and incentives
      - Exploration and discovery while maintaining survey consistency
      - Fraud and coalition formation in networks
  • 65. QUESTIONS? instagram/premisedata (all images in this talk) joe@premise.com | @josephreisinger
  • 66. PROOF, AUTO QC, MANUAL QC, REVALIDATION
  • 67. “The problem of changing statistics is that you lose the ability to compare across time. The longer the time-series, the harder it is to change it, but you want to be able to compare. How do you replace GDP? And if you do, you lose the past sixty years of relevance. This has been a problem for centuries—take the Spanish silver trade. Anything you measure will become increasingly irrelevant over time.” — Hans Rosling [Zachary Karabell, The Leading Indicators]
  • 70. “You need to focus on quality. You’ll be better off with a small but carefully structured sample rather than a large sloppy sample.” — Hal Varian, Google
  • 71. “Big Data is bullshit” — Harper Reed
  • 72. Big Data, n.: the belief that any sufficiently large pile of shit contains a pony with probability approaching one —@grimmelm
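
Slide 62 asks for a way to partition the Premise graph into independent subnetworks. The talk does not say how this is done, so the following is only a minimal sketch under one assumption: that "independent" can be approximated by connected components of the referral graph, computed here with a small union-find. The function name `referral_components` and the sample contributor names are hypothetical and used purely for illustration.

```python
from collections import defaultdict


def referral_components(referrals):
    """Group contributors into connected components of the referral graph.

    `referrals` is an iterable of (referrer, referred) pairs. This is an
    illustrative assumption about the data shape, not Premise's schema.
    """
    parent = {}

    def find(user):
        # Register unseen users as their own root, then walk to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(user, user)
        while parent[user] != user:
            parent[user] = parent[parent[user]]
            user = parent[user]
        return user

    # Merge the two components joined by each referral edge.
    for referrer, referred in referrals:
        root_a, root_b = find(referrer), find(referred)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect members under their root and return them in a stable order.
    groups = defaultdict(set)
    for user in list(parent):
        groups[find(user)].add(user)
    return sorted((sorted(members) for members in groups.values()),
                  key=lambda members: members[0])


# Hypothetical referral edges: two separate referral chains.
referrals = [("ana", "ben"), ("ben", "caro"), ("dev", "eli")]
components = referral_components(referrals)
```

Each returned group shares no referral edge with any other group, so groups could be assigned to different experiment arms or strata. Referral edges alone would miss coalitions formed outside the referral chain, which is presumably why slide 63 lists additional affinity features such as upload location and response correlation.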