Reactive crowdsourcing

REACTIVE CROWDSOURCING
Alessandro Bozzon (a, b)
Marco Brambilla (a)
Stefano Ceri (a)
Andrea Mauri (a)
(a) Politecnico di Milano, Dipartimento di Elettronica, Informazione e BioIngegneria
(b) Delft University of Technology, Department of Software and Computer Technology

Crowd Control is tough…
• Several aspects make crowd engineering complicated
• Task design, planning, assignment
• Workers discovery, assessment, engagement
Wednesday, May 15 Reactive Crowdsourcing 2
http://xkcd.com/1060/

• Goal: taming the crowd
• Cost
• Time
• Quality
• Motivation!
• Need for higher level abstractions and tools
• CONTROL as first-class citizen

Reactive Crowdsourcing
• A conceptual framework for modeling crowdsourcing
computations and control requirements
• Task Design
• Reactive Control Design
• Active Rule programming framework
• Declarative rule language
• A reactive execution environment for requirement
enforcement and reactive execution
• Based on the CrowdSearcher approach

Why Active Rules?
• Crowdsourcing control typically focuses on task data
• Execution results, agreement on truth value, worker performance
• An active rule approach can provide
• Ease of Use: control is easily expressible
• Simple control data structures
• Familiar formalism
• Power: support for arbitrarily complex controls
• Extensibility mechanisms
• Automation: most active rules can be system-generated
• Well-defined semantics
• Flexibility: simple control variants have localized impact on the rule set
• Control isolation

The CrowdSearcher Approach
• Human-Enhanced data management with social networks
and Q&A systems as crowdsourcing platforms
• Example: search task (WWW2012)
[Figure: CrowdSearcher architecture, built up over several slides. Components: Query Interface, Search Execution Engine, Human Interaction Management, SE Access Interface, Local Source Access Interface, Human Access Interface, Data Management System (Human Access Interface, Remote Data Access, Local Data Access), Social Networks, Q&A, Crowdsource platforms. Flow labels: Query, Query Answer, Task, Human-Enhanced Data, Results.]

The Design Process
• A simple abstract model
• A task receives a list of input objects
• Performers execute one or more operations upon them
• The task produces a list of crowd-manipulated objects
• A simple task design and deployment process, based on specific data structures
• created using model-driven transformations
• driven by the task specification

• Three steps: Task Specification, Task Planning, Control Specification
• Task Specification: task operations, objects, and performers → Dimension Tables
• Task Planning: work distribution → Execution Table for task monitoring
• Control Specification: task control policies → Control Mart

Task Specification
• Task Configuration
• Operation Types: Choice, Like, Score, Tag, Classify, Order, …
• Operation Parameters: e.g. classification classes
• Task table: tID, opType, categories (example row: t1, Classify, Rep/Dem)
• Object Specification
• Input Objects Schema: typed attributes
• Output Attributes (according to task type)
• Politician table: oID, lastName, photo, classifiedParty (example row: o1, Obama, http://…., ?????)
• Performer Specification
• Execution platform(s)
• Qualifications, etc.
• Performer table: pID, name, platform (example row: p1, Alessandro, Facebook)
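The three dimension tables can be pictured as plain records. The sketch below is only an illustration of the slide content, not the CrowdSearcher schema: the Python class layout is an assumption, and the photo URL is a placeholder because the slide elides the real one.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Task:
    tID: str
    opType: str                    # operation type, e.g. "Classify"
    categories: Tuple[str, ...]    # operation parameters: the classification classes


@dataclass
class Politician:
    oID: str
    lastName: str
    photo: str                               # typed input attribute (a URL)
    classifiedParty: Optional[str] = None    # output attribute, filled in by the crowd


@dataclass
class Performer:
    pID: str
    name: str
    platform: str                  # execution platform of this performer


# Example rows from the slides; the URL is a placeholder, not the original one.
task = Task("t1", "Classify", ("Rep", "Dem"))
politician = Politician("o1", "Obama", "http://example.org/photo.jpg")
performer = Performer("p1", "Alessandro", "Facebook")
```

The output attribute starts empty (the "?????" on the slide) and is only set once the control rules decide on an answer.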

Task Planning_1/2
• Splitting: organize the task in MicroTasks, and allocate input objects
• μTaskObjectExecution → designed for execution monitoring
• Track performers' responses
• μTaskObjectExecution table: μtID, oID, pID, classifiedParty, startTs, endTs, platform
• Example row after splitting: μtID = mt1, oID = O1, platform = Facebook; the other fields are shown as …

Task Planning_2/2
• Assignment: assign performers to MicroTasks on platforms
• Pull: dynamic assignment (First come - First served / Choice of the performer)
• Push: static assignment (Performers’ priority / Performer matching)
• Example row after splitting and assignment: μtID = mt1, oID = O1, pID = P1, classifiedParty = Republican, startTs = 00:00:01, endTs = 00:00:10, platform = Facebook
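Splitting and the two assignment styles can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the fixed number of objects per MicroTask, and the first-free-MicroTask pull policy are all hypothetical choices, not the planner used by CrowdSearcher.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MicroTaskObjectExecution:
    utID: str                            # μtID on the slides
    oID: str
    platform: str
    pID: Optional[str] = None            # set by assignment
    startTs: Optional[str] = None
    endTs: Optional[str] = None
    classifiedParty: Optional[str] = None  # the performer's response


def split(object_ids, objects_per_microtask, platform):
    """Splitting: group objects into MicroTasks, one row per (MicroTask, object)."""
    rows = []
    for i in range(0, len(object_ids), objects_per_microtask):
        utid = f"mt{i // objects_per_microtask + 1}"
        for oid in object_ids[i:i + objects_per_microtask]:
            rows.append(MicroTaskObjectExecution(utid, oid, platform))
    return rows


def assign_push(rows, performer_of_microtask):
    """Push: static assignment, decided by the planner before execution."""
    for row in rows:
        row.pID = performer_of_microtask[row.utID]


def assign_pull(rows, pID):
    """Pull, first come first served: the arriving performer takes the
    first MicroTask that nobody has taken yet. Returns its id, or None."""
    free = next((r.utID for r in rows if r.pID is None), None)
    for row in rows:
        if row.utID == free:
            row.pID = pID
    return free


rows = split(["o1", "o2", "o3"], 2, "Facebook")
first = assign_pull(rows, "p1")
second = assign_pull(rows, "p2")
third = assign_pull(rows, "p3")
```

In the pull case the execution table is filled as performers arrive; in the push case every row has a performer before the task starts.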

Control Specification
• Status Variable: tracking task and performers status
• Task.status: Created/Planned/Closed
• Performer.status: Trusted/Spammer
• Object: tracking object responses
• ObjectControl table: oID, #eval, #rep, #dem, #curAnswer
• Performer: tracking performer behavior (e.g. spammers)
• PerformerControl table: pID, #eval, #right, #wrong
• Task: tracking task status: closing @completion, re-plan
• TaskControl table: tID, #compObj, #compExec
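The control mart adds one counter table per dimension. The sketch below shows one way to hold those counters and update the object-level ones when a response arrives; the field names with an `n_` prefix stand for the `#` attributes on the slides, and the rule that `curAnswer` holds the currently leading party is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectControl:
    oID: str
    n_eval: int = 0                  # #eval
    n_rep: int = 0                   # #rep
    n_dem: int = 0                   # #dem
    curAnswer: Optional[str] = None  # #curAnswer


@dataclass
class PerformerControl:
    pID: str
    n_eval: int = 0                  # #eval
    n_right: int = 0                 # #right
    n_wrong: int = 0                 # #wrong


@dataclass
class TaskControl:
    tID: str
    compObj: int = 0                 # #compObj: completed objects
    compExec: int = 0                # #compExec: completed executions


def record_answer(oc, party):
    """Update the object counters for one response.
    Assumption: curAnswer tracks the leading party and is kept on ties."""
    oc.n_eval += 1
    if party == "Republican":
        oc.n_rep += 1
    elif party == "Democrat":
        oc.n_dem += 1
    if oc.n_rep != oc.n_dem:
        oc.curAnswer = "Republican" if oc.n_rep > oc.n_dem else "Democrat"


oc = ObjectControl("o1")
for answer in ["Republican", "Democrat", "Republican"]:
    record_answer(oc, answer)
```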

Active Rules Language
• Active rules are expressed on the previous data structures
• Event-Condition-Action paradigm
• Events: data updates / timer
• ROW-level granularity
• OLD → before state of a row
• NEW → after state of a row
e: UPDATE FOR μTaskObjectExecution[ClassifiedParty]
• Condition: a predicate that must be satisfied (e.g. conditions on control mart attributes)
c: NEW.ClassifiedParty == 'Republican'
• Actions: updates on data structures (e.g. change attribute value, create new instances), special functions (e.g. replan)
a: SET ObjectControl[oID == NEW.oID].#Eval += 1

Rule Example
e: UPDATE FOR ObjectControl
c: (NEW.Rep == 2) or (NEW.Dem == 2)
a: SET Politician[oid==NEW.oid].classifiedParty = NEW.CurAnswer,
SET TaskControl[tID==NEW.tID].compObj += 1
[Figure: the Event, Condition, and Action of this rule highlighted in turn on the control mart diagram.]
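The way such rules cascade can be sketched with a very small Event-Condition-Action interpreter. Everything in it is an assumption for illustration (the `Rule` and `Engine` classes, the dictionary-based tables, immediate depth-first triggering); it is not the CrowdSearcher rule engine. The first rule also updates `rep` and `curAnswer`, which the slide's action does not show, so that the second rule has something to react to.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    table: str                                # e: UPDATE FOR <table>
    condition: Callable[[dict, dict], bool]   # c: predicate over (OLD, NEW)
    action: Callable[["Engine", dict], None]  # a: updates issued with NEW


class Engine:
    def __init__(self, tables, rules):
        self.tables = tables   # table name -> {row key -> row dict}
        self.rules = rules

    def update(self, table, key, **changes):
        row = self.tables[table][key]
        old = dict(row)        # OLD: row state before the update
        row.update(changes)    # row now holds the NEW state
        for rule in self.rules:
            if rule.table == table and rule.condition(old, row):
                rule.action(self, row)   # actions may trigger further rules


def count_republican(engine, new):
    oc = engine.tables["ObjectControl"][new["oID"]]
    # Assumption: besides #eval, also maintain #rep and #curAnswer.
    engine.update("ObjectControl", new["oID"], eval=oc["eval"] + 1,
                  rep=oc["rep"] + 1, curAnswer="Republican")


def close_object(engine, new):
    engine.update("Politician", new["oID"], classifiedParty=new["curAnswer"])
    tc = engine.tables["TaskControl"][new["tID"]]
    engine.update("TaskControl", new["tID"], compObj=tc["compObj"] + 1)


rules = [
    Rule("μTaskObjectExecution",
         lambda old, new: new["classifiedParty"] == "Republican",
         count_republican),
    Rule("ObjectControl",
         lambda old, new: new["rep"] == 2 or new["dem"] == 2,
         close_object),
]

tables = {
    "μTaskObjectExecution": {
        ("mt1", "o1", "p1"): {"oID": "o1", "classifiedParty": None},
        ("mt2", "o1", "p2"): {"oID": "o1", "classifiedParty": None},
    },
    "ObjectControl": {"o1": {"oID": "o1", "tID": "t1", "eval": 0,
                             "rep": 0, "dem": 0, "curAnswer": None}},
    "Politician": {"o1": {"classifiedParty": None}},
    "TaskControl": {"t1": {"compObj": 0}},
}

engine = Engine(tables, rules)
engine.update("μTaskObjectExecution", ("mt1", "o1", "p1"), classifiedParty="Republican")
engine.update("μTaskObjectExecution", ("mt2", "o1", "p2"), classifiedParty="Republican")
```

After the second response the object reaches two Republican votes, so the second rule writes the result into Politician and increments the task's completed-object counter.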

Rule Programming Best Practice
• We define three classes of rules
• Control rules: modifying the control tables;
• Result rules: modifying the dimension tables (object, performer, task);
• Top-to-bottom, left-to-right, evaluation
• Guaranteed termination
• Execution rules: modifying the execution table, either directly or through re-planning
• Termination must be proven (Rule precedence graph has cycles)
[Figure: μTaskObjectExecution, the control tables (PerformerControl, ObjectControl, TaskControl), and the dimension tables (Politician, Performer, Task), with the tables touched by each rule class highlighted.]
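Why execution rules need a termination argument can be checked mechanically. The sketch below builds a graph with an edge from rule A to rule B whenever A writes a table whose updates trigger B, then looks for a cycle with a depth-first search. The rule names and the encoding of a rule as (triggering table, written tables) are hypothetical; an acyclic graph is a sufficient, not a necessary, condition for termination.

```python
def triggering_graph(rules):
    """rules: name -> (table whose updates trigger the rule, set of tables it writes).
    Returns name -> list of rules that it may trigger."""
    return {
        a: [b for b, (event_b, _) in rules.items() if event_b in writes_a]
        for a, (_, writes_a) in rules.items()
    }


def has_cycle(graph):
    """Depth-first search with three colors; a grey successor means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GREY
        for m in graph[n]:
            if color[m] == GREY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


# Control and result rules only: updates flow from execution to control to dimension tables.
layered = {
    "control_rule": ("μTaskObjectExecution", {"ObjectControl"}),
    "result_rule": ("ObjectControl", {"Politician"}),
}

# Adding an execution rule that re-plans writes back into the execution table.
with_replan = dict(layered)
with_replan["execution_rule"] = ("ObjectControl", {"μTaskObjectExecution"})

layered_cyclic = has_cycle(triggering_graph(layered))
replan_cyclic = has_cycle(triggering_graph(with_replan))
```

With only control and result rules the graph is acyclic; the re-planning rule closes a loop through the execution table, which is the case where termination has to be argued separately.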

Experimental Evaluation
• GOAL: demonstrate the flexibility and expressive power
of reactive crowdsourcing
• 3 experiments, focused on Italian politicians
• Parties: Human Computation → affiliation classification
• Law: Game With a Purpose → guess the convicted politician
• Order: Pure Game → hot or not
• 1 week (November 2012)
• 284 distinct performers
• Recruited through public mailing lists and social network announcements
• 3500 Micro Tasks

Politician Affiliation
• Given the picture and name of a politician, specify his/her political
affiliation
• No time limit
• Performers are encouraged to look up the answer online
• 2 sets of rules
• Majority Evaluation
• Spammer Detection

Results – Majority Evaluation_1/3
30 objects; object redundancy = 9
Final object classification as simple majority after 7 evaluations

Results - Majority Evaluation_2/3
Final object classification as total majority after 3 evaluations
Otherwise, re-plan of 4 additional evaluations. Then simple majority at 7
[Plot: series Majority @7 and Early Majority @3 R @7; axes: % of completed objects (0 to 1.0), precision (0 to 0.8), #executions (0 to 90).]
Plot annotations: -27% executions, -18% precision

Results - Majority Evaluation_3/3
Final object classification as total majority after 3 evaluations
Otherwise, simple majority at 5 or at 7 (with replan)
[Plot: series Majority @7, Early Majority @3 R @7, and Majority @3 R @5 R @7; axes: % of completed objects (0 to 1.0), precision (0 to 0.8), #executions (0 to 90).]
Plot annotations: -23% executions, +26% precision, +50% precision
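The three majority policies differ only in the evaluation counts at which a decision is attempted. The function below is a sketch of that idea under explicit assumptions: "total majority" is read as all answers agreeing, "simple majority" as one answer holding more than half of the answers, and the `decide` name and its return values are hypothetical.

```python
from collections import Counter


def decide(answers, checkpoints=(3, 5, 7)):
    """Return ('close', answer), ('replan', extra_evaluations) or ('wait', None).

    With several checkpoints, the first one requires a total majority and the
    later ones a simple majority. With a single checkpoint, a simple majority
    is enough.
    """
    n = len(answers)
    if n not in checkpoints:
        return ("wait", None)
    top, votes = Counter(answers).most_common(1)[0]
    is_early_check = n == checkpoints[0] and len(checkpoints) > 1
    needed = n if is_early_check else n // 2 + 1
    if votes >= needed:
        return ("close", top)
    later = [c for c in checkpoints if c > n]
    if later:
        return ("replan", later[0] - n)   # ask for evaluations up to the next checkpoint
    return ("close", None)                # last checkpoint reached without a majority


unanimous = decide(["R", "R", "R"])
split_at_3 = decide(["R", "R", "D"])
settled_at_5 = decide(["R", "R", "D", "R", "D"])
early_then_7 = decide(["R", "R", "D"], checkpoints=(3, 7))
plain_7 = decide(["R"] * 4 + ["D"] * 3, checkpoints=(7,))
```

Here `checkpoints=(7,)` corresponds to Majority @7, `(3, 7)` to Early Majority @3 R @7, and `(3, 5, 7)` to Majority @3 R @5 R @7.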

Results - Spammer Detection_1/2
New rule for spammer detection without ground truth
Performer correctness on final majority. Spammer if > 50% wrong classifications
[Plot: series Majority @3 R @5 R @7 and Majority @3 R @5 R @7 - Spammer Detection; axes: % of completed objects (0 to 1.0), precision (0 to 0.8), #executions (0 to 90).]
Plot annotations: +46% executions, +1.5% precision
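The spammer rule compares each performer's answers with the final majority instead of a ground truth. A minimal sketch, assuming only closed objects are judged and using the hypothetical name `performer_status`:

```python
def performer_status(answers, final_majority):
    """answers: oID -> this performer's answer.
    final_majority: oID -> agreed answer, for objects that are already closed.
    Returns (right, wrong, status)."""
    judged = [oid for oid in answers if oid in final_majority]
    wrong = sum(1 for oid in judged if answers[oid] != final_majority[oid])
    right = len(judged) - wrong
    # Spammer if more than 50% of the judged classifications are wrong.
    is_spammer = bool(judged) and wrong / len(judged) > 0.5
    return right, wrong, "Spammer" if is_spammer else "Trusted"


majority = {"o1": "R", "o2": "R", "o3": "R"}
mostly_wrong = performer_status({"o1": "R", "o2": "D", "o3": "D"}, majority)
half_wrong = performer_status({"o1": "R", "o2": "D"}, majority)
not_judged = performer_status({"o9": "D"}, majority)
```

The `right` and `wrong` counts correspond to the #right and #wrong attributes of the performer control table; a performer with no judged answers stays Trusted in this sketch.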

A Short CrowdSearcher Demo

Summary
• Results
• An integrated framework for crowdsourcing task design and control
• Well-structured control rules with some guarantees of termination
• Support for cross-platform crowd interoperability
• A working prototype → crowdsearcher.search-computing.org
• Forthcoming
• Exploitation of interoperability
• Expertise finding
• Dynamic planning
• Integration with other social networks and human computation platforms

QUESTIONS?
