Anecdotes about real-life usage of analytics. The research was done via Google search, so no claims of accuracy are made; please treat these as directional insights into the applications and benefits.
An Open Spatial Systems Framework for Place-Based Decision-Making (Raed Mansour)
Marynia Kolak, a PhD candidate at Arizona State University's GeoDa Center, presented on April 15, 2016, to the Chicago GIS in Public Health group at the Chicago Department of Public Health (CDPH). She presented the “Healthy Access, Health Regions” project, a collaboration between CDPH and the GeoDa Center at Arizona State. See the abstract below:
The “Healthy Access, Health Regions” project is a collaboration among the GeoDa Center, the Chicago Department of Public Health, and others to build a customized open-source web application for data integration, exploratory analysis, and decision-making. It seeks to push GIS to the frontiers of spatial data science, where space serves as the place for integrating research design and methodology, data infrastructure, and learning.
This project integrates data on the fly and works toward dynamic visualization and analysis in a spatial big-data infrastructure. Remotely managed resource and health-provider data are streamed into the application for analysis. Functions are encoded to evaluate service areas and explore socioeconomic and community health outcome data. Another component integrates an implementation of the max-p algorithm to develop data-driven regions for exploration and analysis. The next phase of development will better integrate dynamic analytics and simulation and enhance the user experience design. The application seeks not only to test the feasibility of data integration and analysis support, but also to serve as a collaboratively developed, community-driven structure.
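For context on the max-p step: the max-p-regions problem aggregates small areas into the maximum number of spatially contiguous regions such that each region clears a minimum threshold on some attribute (population, sample size, etc.). Below is a minimal greedy sketch of the idea in Python; it is not the project's implementation, and the adjacency, populations, and threshold are all hypothetical (production implementations, e.g. in PySAL, use much stronger heuristics).

```python
# Toy sketch of max-p-style regionalization: grow contiguous regions until
# each meets a minimum population threshold. Illustrative only.

THRESHOLD = 100  # hypothetical minimum population per region

# Hypothetical area adjacency and populations (e.g., census tracts).
adjacency = {
    "A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"},
    "D": {"B", "C", "E"}, "E": {"D"},
}
population = {"A": 40, "B": 70, "C": 30, "D": 55, "E": 90}

def grow_regions(adjacency, population, threshold):
    """Greedily grow contiguous regions until each passes the threshold."""
    unassigned = set(adjacency)
    regions = []
    while unassigned:
        seed = max(unassigned, key=population.get)  # start from largest area
        region, total = {seed}, population[seed]
        unassigned.remove(seed)
        # Expand with unassigned neighbors until the threshold is met.
        while total < threshold:
            frontier = {n for a in region for n in adjacency[a]} & unassigned
            if not frontier:
                break  # enclave: a real implementation merges it into a neighbor
            pick = max(frontier, key=population.get)
            region.add(pick)
            unassigned.remove(pick)
            total += population[pick]
        regions.append((region, total))
    return regions

for areas, total in grow_regions(adjacency, population, THRESHOLD):
    print(sorted(areas), total)
```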
Discussions on:
Dr. S. Gokula Krishnan, Associate Professor @NSM
Definition of Conflict
Transitions in Conflict Thought
Conflict Process
Conflict Management Techniques
Negotiation
Bargaining Strategies
The Negotiation Process
Reference:
Stephen P. Robbins, Timothy A. Judge & Neharika Vohra, Organizational Behaviour, 15th ed., pp. 477-502
Discussions on:
Dr. S. Gokula Krishnan, Associate Professor @NSM
Definition of Power
Bases of Power
Dependence: The Key to Power
Power Tactics
Politics: Power in Action
Causes and Consequences of Political Behavior
Reference:
Stephen P. Robbins, Timothy A. Judge & Neharika Vohra, Organizational Behaviour, 15th ed., pp. 439-466
Discussions on:
Disciplines Contributing to OB
Psychology
Sociology
Anthropology
Social Psychology
Economics & Political Science
Case Incident 2
Article 1
Reference:
Stephen P. Robbins, Timothy A. Judge & Neharika Vohra, Organizational Behaviour, 15th ed., pp. 14-16
Predictive Model and Record Description with Segmented Sensitivity Analysis (...) (Greg Makowski)
Describing a predictive data mining model can provide a competitive advantage when solving business problems with a model. The SSA approach can also provide reasons for the forecast for each record. This can help drive investigations into fields and interactions during a data mining project, as well as identify "data drift" between the original training data and the current scoring data. I am working on an open-source version of SSA, first in R.
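As a rough illustration of record-level sensitivity (the underlying idea, not Makowski's actual SSA algorithm): perturb each feature of one scored record and rank features by how far the model's prediction moves. Everything below (model, data, perturbation size) is an invented stand-in.

```python
# Sketch: per-record "reasons for the forecast" via one-at-a-time perturbation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

def record_sensitivity(model, X_train, record, delta=0.5):
    """Rank features by prediction shift when perturbed by delta * std."""
    base = model.predict_proba(record.reshape(1, -1))[0, 1]
    shifts = {}
    for j in range(record.size):
        perturbed = record.copy()
        perturbed[j] += delta * X_train[:, j].std()
        shifts[j] = model.predict_proba(perturbed.reshape(1, -1))[0, 1] - base
    return sorted(shifts.items(), key=lambda kv: -abs(kv[1]))

# Most influential features for one record's forecast.
for feature, shift in record_sensitivity(model, X, X[0])[:3]:
    print(f"feature {feature}: prediction shift {shift:+.3f}")
```

Comparing the distribution of such shifts between training-time and scoring-time data is one simple way to surface the "data drift" the abstract mentions.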
This project aims to help incoming students find suitable accommodation using the K-Means and DBSCAN clustering algorithms. The analysis is based on students' preferences for amenities, budget, and proximity to their preferred location. The data consists of accommodation details from various neighborhoods of the city.
The study utilized exploratory data analysis techniques, such as descriptive statistics, univariate visualization, and multivariate visualization, to gain insights into the dataset. K-Means and DBSCAN clustering algorithms were applied to classify the accommodation into different clusters based on the preferences of the students. The results showed that both algorithms successfully classified the accommodation into clusters, with K-Means providing a more structured clustering and DBSCAN being more flexible and able to detect outliers and noise.
Keywords: Exploratory Data Analysis, K-Means, DBSCAN, Machine Learning, Data Visualization, Data Cleaning, Student Accommodation, Geolocation, Geographic Information Systems, Evaluation.
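A minimal sketch of the comparison the abstract describes, assuming scikit-learn and invented accommodation features (rent, distance, amenity score); the study's real data and parameters are not reproduced here:

```python
# Sketch: compare K-Means (fixed k, structured partitions) with DBSCAN
# (density-based, flags outliers as label -1) on toy accommodation data.
import numpy as np
from sklearn.cluster import KMeans, DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Hypothetical features per listing: [monthly rent, km to campus, amenity score]
listings = np.vstack([
    rng.normal([500, 1.0, 7], [50, 0.3, 1], size=(40, 3)),  # near, mid-priced
    rng.normal([300, 5.0, 4], [40, 0.8, 1], size=(40, 3)),  # far, cheap
    [[2000, 0.2, 10]],                                      # luxury outlier
])
X = StandardScaler().fit_transform(listings)

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(X)

print("K-Means cluster sizes:", np.bincount(kmeans_labels))
print("DBSCAN outliers (label -1):", int(np.sum(dbscan_labels == -1)))
```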
Prognosis - An Approach to Predictive Analytics - Impetus White Paper (Impetus Technologies)
For Impetus’ White Papers archive, visit http://www.impetus.com/whitepaper
The paper discusses an implementation of behavioral targeting for the ad world. This is a statistical machine learning algorithm that helps select the most relevant ads to display to a web user based on their historical data.
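One common way to frame behavioral targeting of this kind is click-through-rate prediction: learn from a user's historical interactions, then rank candidate ads by predicted click probability. A generic, hedged sketch (the white paper's actual algorithm is not shown; all features and data below are invented):

```python
# Sketch: rank candidate ads by predicted click probability for one user.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Hypothetical historical data: [user interest match, ad quality, recency]
X_hist = rng.random((1000, 3))
clicks = X_hist @ np.array([2.0, 1.0, 0.5]) + rng.normal(0, 0.5, 1000) > 1.8

model = LogisticRegression().fit(X_hist, clicks)

candidate_ads = rng.random((5, 3))           # five ads to choose among
ctr = model.predict_proba(candidate_ads)[:, 1]
print("show ad:", int(np.argmax(ctr)), "predicted CTR:", round(float(ctr.max()), 3))
```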
Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execu... (Saama)
Nikhil Gopinath, Senior Solutions Engineer for the Life Sciences at Saama, spoke at EyeforPharma's Clinical Trial Innovation Summit event in February 2017. These slides are from his "Leverage Big Data Analytics to Enhance Clinical Trials from Planning to Execution" presentation.
Csit65111: ASSOCIATIVE REGRESSIVE DECISION RULE MINING FOR ASSOCIATIVE REGRESSI... (cscpconf)
Opinion mining, also known as sentiment analysis, involves customer satisfaction patterns, sentiments, and attitudes toward entities, products, services, and their attributes. With the rapid development of the Internet, potential customers provide a large volume of product/service reviews. These high volumes of customer reviews were processed through taxonomy-aware processing, but it was difficult to identify the best reviews. In this paper, an Associative Regression Decision Rule Mining (ARDRM) technique is developed to predict patterns for service providers and to improve customer satisfaction based on review comments. Associative Regression based Decision Rule Mining performs two steps to improve customer satisfaction. First, a Machine Learning Bayes Sentiment Classifier (MLBSC) is used to assign class labels to each service review. Then, the regression factor of the opinion words and the class labels are checked for associations between words using various probabilistic rules. Based on these rules, the effect of opinions and sentiments on customer reviews is analyzed to arrive at the specific set of services preferred by customers, together with their review comments. The associative regressive decision rules help the service provider make decisions about improving customer satisfaction. The experimental results reveal that the ARDRM technique improves performance in terms of true positive rate, associative regression factor, regressive decision rule generation time, and review detection accuracy for similar patterns.
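The first step of the pipeline described above is a Bayes-style sentiment classifier over review text. A minimal sketch of that step using scikit-learn's multinomial naive Bayes (the paper's MLBSC details are not reproduced; the tiny corpus is invented):

```python
# Sketch: assign sentiment class labels to service reviews with naive Bayes,
# the kind of first step the ARDRM pipeline describes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reviews = [
    "fast delivery and friendly support",
    "terrible service, long waiting time",
    "great value, satisfied with the product",
    "rude staff and broken item",
]
labels = ["positive", "negative", "positive", "negative"]

classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(reviews, labels)

print(classifier.predict(["support was friendly and fast"]))  # likely ['positive']
```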
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact... (csandit)
A Big Data Telco Solution by Dr. Laura Wynter (wkwsci-research)
Presented during the WKWSCI Symposium 2014
21 March 2014
Marina Bay Sands Expo and Convention Centre
Organized by the Wee Kim Wee School of Communication and Information at Nanyang Technological University
We describe a novel approach and a software framework for mobile crowdsensing applications with mobile agents that enables energy-efficient, robust, and scalable campaign execution. The framework is Web-enabled for integration with existing pervasive computing systems.
Building Predictive Analytics on Big Data Platforms (Olha Hrytsay)
SoftServe Innovation Conference in Austin, Texas 2013
Building Predictive Analytics on Big Data Platforms presented by Olha Hrytsay (BI Consultant) and Serhiy Shelpuk (Lead Data Scientist)
We are providing training on IEEE 2016-17 projects for Ph.D. scholars, M.Tech, B.E., MCA, BCA, and Diploma students of all branches for their academic projects.
For more details, call us or WhatsApp us @ 7676768124 or 9545252155
Email your base papers to "adritsolutions@gmail.co.in"
We are providing IEEE projects on
1) Cloud Computing, Data Mining, Big Data projects using Java
2) Image Processing and Video Processing (MATLAB), Signal Processing
3) NS2 (Wireless Sensor, MANET, VANET)
4) Android apps
5) Java, JEE, J2EE, J2ME
6) Mechanical Design projects
7) Embedded Systems and IoT projects
8) VLSI - Verilog projects (ModelSim and Xilinx using FPGA)
For more details, please visit us at
Adrit Solutions
Near Maruthi Mandir
#42/5, 18th Cross, 21st Main
Vijaynagar
Bangalore.
Machine Learning in 2016: Live Q&A with Carlos Guestrin (Turi, Inc.)
Live webinar session with Carlos Guestrin, Dato CEO and Amazon Professor of Machine Learning at the University of Washington. Carlos reviewed 2015 highlights, previewed the Dato roadmap, and answered real-time questions from participants about use cases, algorithms, and resources.
Tutorial for Machine Learning 101 (an all-day tutorial at Strata + Hadoop World, New York City, 2015)
The course is designed to introduce machine learning via real applications, like building a recommender and performing image analysis using deep learning.
In this talk we cover deployment of machine learning models.
Overview of Machine Learning and Feature Engineering (Turi, Inc.)
Machine Learning 101 Tutorial at Strata NYC, Sep 2015
Overview of machine learning models and features. Visualization of feature space and feature engineering methods.
Scalable tabular (SFrame, SArray) and graph (SGraph) data structures built for out-of-core data analysis.
The SFrame package provides the complete implementation of:
SFrame
SArray
SGraph
The C++ SDK surface area (gl_sframe, gl_sarray, gl_sgraph)
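A small usage sketch, assuming the open-source sframe Python package; the API is shown from memory and the data is made up, so treat it as indicative rather than authoritative:

```python
# Sketch: out-of-core tabular work with SFrame; data here is invented.
import sframe

sf = sframe.SFrame({"user": [1, 2, 3], "rating": [4.0, 3.5, 5.0]})
print(sf[sf["rating"] >= 4.0])   # filter without loading all rows into RAM
print(sf["rating"].mean())       # columnar aggregate on an SArray
```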
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in the Software Delivery Lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerabilities and security breaches. This needs to be achieved with existing toolchains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs (Alex Pruden)
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
A tale of scale & speed: How the US Navy is enabling software delivery from l... (sonjaschweigert1)
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Essentials of Automations: The Art of Triggers and Actions in FME (Safe Software)
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
My and Rik Marselis's slides from the 30.5.2024 DASA Connect conference. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI-powered automation technology capabilities of UiPath. Also, hosted by our local partner Marc Ellis, you will enjoy a half-day packed with industry insights and networking with automation peers.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 Discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
Climate Impact of Software Testing at Nordic Testing Days (Kari Kakkonen)
My slides at Nordic Testing Days, 6.6.2024.
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be extended with sustainability and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Elevating Tactical DDD Patterns Through Object Calisthenics (Dorra BARTAGUIZ)
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
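As one concrete illustration of the overlap (my example, not the speaker's): the Object Calisthenics rule "wrap all primitives and strings" naturally produces the DDD value object pattern, with invariants attached to the type instead of scattered through the code. A sketch in Python:

```python
# Sketch: wrapping a bare float in a value object, so the domain rule
# ("money is non-negative, same-currency arithmetic only") lives in one place.
from dataclasses import dataclass

@dataclass(frozen=True)  # immutability, as value objects require
class Money:
    amount: float
    currency: str

    def __post_init__(self):
        if self.amount < 0:
            raise ValueError("Money cannot be negative")

    def add(self, other: "Money") -> "Money":
        if other.currency != self.currency:
            raise ValueError("Cannot add different currencies")
        return Money(self.amount + other.amount, self.currency)

price = Money(19.50, "EUR")
total = price.add(Money(5.25, "EUR"))
print(total)  # Money(amount=24.75, currency='EUR')
```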
6. “I have been constantly surprised at how little quantitative information can be brought to bear on fundamental policy questions [...] This experience illustrates the need for flexibility in data collection, especially when policymakers consider extending new policies or need to evaluate them in real time for other reasons. Ideally, some sort of ‘rapid response’ data gathering capacity.”
— Alan Krueger, “Stress Testing Economic Data”
7. “The collection of statistics needs to be modernized; it is time to use the new technologies to start collecting data. …particularly important in developing countries where the prevalence of mobile phones now offers an unprecedented opportunity to measure the economy.”
— Diane Coyle, “GDP”
10. “However, at this moment in survey research, uncertainty reigns. Participation rates in household surveys are declining throughout the developed world. Surveys seeking high response rates are experiencing crippling cost inflation. Traditional sampling frames that have been serviceable for decades are fraying at the edges.”
— Robert Groves, “Three Eras of Survey Research”
26. [Platform diagram] The Premise platform connects an end user and a pool of data contributors through five stages: survey campaign, allocation, collection, quality control, and analytics.
- Survey campaign: the user poses a question that is best answered via actual, on-the-ground observation at scale. The question is translated into an internal “specification” of the data points needed to answer it: type, location, frequency, coverage, etc.
- Allocation: the inventory of data points is automatically allocated to the data contributor pool, taking into account budget, agent profiles, and geography. Data points are dynamically priced.
- Collection: contributors collect data in the field using Android phones, which send it back to the Premise network.
- Quality control: QC is a mix of automated checks (outlier detection; machine learning; computer vision) and manual checks (directed sampling using oDesk); see the outlier-check sketch below.
- Analytics: automated capabilities to explore data and expose trends or patterns, hypothesize new features to explain variation, suggest specification refinements, and improve automated verification.
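A minimal sketch of one such automated QC check, assuming submitted price observations and a simple robust z-score rule (thresholds and data are invented, not Premise's actual pipeline):

```python
# Sketch: flag submitted observations whose price deviates strongly from the
# median for the same item/location, one automated QC check among several.
import numpy as np

def robust_z(values):
    """Median/MAD-based z-scores, less sensitive to the outliers themselves."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = np.median(np.abs(values - med)) or 1e-9  # avoid division by zero
    return 0.6745 * (values - med) / mad

submitted_prices = [2.10, 2.15, 2.05, 2.20, 9.99, 2.12]  # one suspicious entry
flags = np.abs(robust_z(submitted_prices)) > 3.5         # common cutoff
print([p for p, f in zip(submitted_prices, flags) if f])  # -> [9.99]
```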
30. Average wait times are roughly 10 minutes longer in Maracaibo than in Caracas. Police are present ~80% of the time in Maracaibo, but only 30-40% of the time in Caracas.
[Slides 32-35 repeat the platform diagram from slide 26, highlighting each stage in turn: the full pipeline, allocation, analytics, and quality control.]
59. Exploration vs Survey Consistency
- Campaign layers: separate discovery and survey
- Iteratively refine attribute and geospatial targeting
- Monitor correlation in item responses and the appearance of new attributes (see the sketch after this list)
- Monitor residual endogeneity
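One way to operationalize "monitor correlation in item responses" (my sketch, not Premise's code): track the pairwise correlation matrix of item responses per collection window and alert when it drifts. The data and threshold below are invented.

```python
# Sketch: alert when the correlation structure of item responses drifts
# between two collection windows, a simple survey-consistency monitor.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
cols = ["bread_price", "milk_price", "wait_time"]
window_a = pd.DataFrame(rng.normal(size=(200, 3)), columns=cols)
window_b = window_a * 1.02 + rng.normal(scale=0.1, size=(200, 3))  # mild drift

corr_a, corr_b = window_a.corr(), window_b.corr()
drift = (corr_a - corr_b).abs().to_numpy().max()
print(f"max pairwise correlation drift: {drift:.3f}")
if drift > 0.2:  # invented alert threshold
    print("responses may be drifting; review targeting/spec")
```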
62. Coalitions vs Referrals
- Referrals are necessary to reach the most remote areas
- However, we need to be able to partition the Premise graph into independent subnetworks, e.g. for re-evaluation, experimentation, and sample stratification (a minimal partitioning sketch follows below)
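A minimal sketch of that partitioning step, assuming a referral graph held in networkx and community detection as the splitting criterion (one plausible choice; the deck does not say which method Premise uses):

```python
# Sketch: partition a contributor referral graph into subnetworks that can be
# re-evaluated or assigned to experiments independently.
import networkx as nx

# Hypothetical referral edges: (referrer, referred)
referrals = [("a", "b"), ("a", "c"), ("b", "d"), ("x", "y"), ("y", "z")]
G = nx.Graph(referrals)

# Louvain community detection as one way to get independent subnetworks.
subnetworks = nx.community.louvain_communities(G, seed=0)
for i, members in enumerate(subnetworks):
    print(f"subnetwork {i}: {sorted(members)}")
```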
63. CONTRIBUTOR AFFINITY MODEL
Model features: direct referral, account features, upload location, visit histories, geographic area, response correlation.
Issues: bootstrapping affinity scores for new users; the optimal scheduler is antagonistic to coalition discovery.
Sampling from Large Graphs [Leskovec & Faloutsos, 2006]
64. RECAP
- Orchestrating collective intelligence
- Optimizing task allocation via dynamic scheduling and incentives
- Exploration and discovery while maintaining survey consistency
- Fraud and coalition formation in networks
67. “The problem of changing statistics is that you lose the ability to compare across time. The longer the time-series, the harder it is to change it, but you want to be able to compare. How do you replace GDP? And if you do, you lose the past sixty years of relevance. This has been a problem for centuries—take the Spanish silver trade. Anything you measure will become increasingly irrelevant over time.”
— Hans Rosling
[Zachary Karabell, The Leading Indicators]
70. “You need to focus on quality. You’ll be better off with a small but carefully structured sample rather than a large sloppy sample.”
— Hal Varian, Google