2. DEFINITIONS
Crowdsourcing: The practice of obtaining information or
input into a task or project by enlisting the services of a large
number of people, either paid or unpaid, typically via the
Internet.
Human-based computation: A computer science technique in which a machine performs its function by outsourcing certain steps to humans, usually as microwork.
3. MARIE-JEAN-ANTOINE-NICOLAS DE CARITAT,
MARQUIS DE CONDORCET (1743-1794)
- French philosopher of the Enlightenment and advocate of public education and women's rights (among many other things)
- Éléments du calcul des probabilités, et son application aux jeux de hasard, à la loterie et aux jugements des hommes (Elements of the calculus of probabilities, and its application to games of chance, the lottery, and human judgments)
- “Jury theorem”
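The jury theorem is, in a sense, the mathematical seed of crowdsourcing: if each voter is independently correct with probability p > 1/2, the probability that the majority verdict is correct grows towards 1 as the jury grows. A minimal sketch of the computation (the function name and example values are mine, for illustration):

```python
from math import comb

def majority_correct(n: int, p: float) -> float:
    """Probability that a majority of n independent voters,
    each correct with probability p, reaches the right verdict."""
    # Sum the binomial probabilities of more than n/2 correct votes.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# With p = 0.6: 1 voter -> 0.60, 11 voters -> ~0.75, 101 voters -> ~0.98
for n in (1, 11, 101):
    print(n, round(majority_correct(n, 0.6), 2))
```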
4. SIR FRANCIS GALTON
(1822-1911)
Expert in pretty much everything
- Statistician, sociologist, psychologist, anthropologist,
eugenicist, tropical explorer, geographer,
meteorologist, psychometrician, and cake-cutter
- Created the statistical concept of correlation.
- Introduced the use of questionnaires and surveys for collecting data on human communities
- As the initiator of scientific meteorology, devised the
first weather map
- (Was the first to apply statistical methods to the
study of human differences and inheritance of
intelligence)
10. HOW WOULD YOU SOLVE THIS?
Greg Little, Lydia B. Chilton, Robert C. Miller, and Max Goldman
TurKit: Tools for Iterative Tasks on Mechanical Turk
HCOMP 2009
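TurKit's core idea is to write human computation as ordinary imperative code, with human tasks as blocking function calls, e.g. the classic improve-and-vote loop for iterative writing. A rough sketch of the pattern (the ask_worker helper is hypothetical, standing in for posting a task and waiting for the answer):

```python
def ask_worker(prompt: str) -> str:
    """Hypothetical stand-in: post a task to the crowd platform,
    block until a worker answers, and return the answer."""
    raise NotImplementedError

def iterative_improve(text: str, rounds: int = 5) -> str:
    """TurKit-style improve-and-vote loop: one worker proposes an
    improved version, others vote on whether to keep it."""
    for _ in range(rounds):
        improved = ask_worker(f"Improve this text:\n{text}")
        votes = [ask_worker(f"Which is better?\nA: {text}\nB: {improved}")
                 for _ in range(3)]
        if votes.count("B") >= 2:  # majority prefers the new version
            text = improved
    return text
```

TurKit itself adds crash-and-rerun semantics on top of this pattern: completed human calls are memoized, so the script can be killed and re-executed without re-paying for work already done.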
14. REST API TO PEOPLE
- Create task
- Run batch*
- Monitor
- Results
- Pay
A platform for human computation. But how do we program it? How do we limit recourse to (expensive) humans? How do we make their work more efficient?
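A sketch of what that lifecycle could look like against such an API; the endpoint paths and payload fields below are illustrative assumptions, not any real platform's interface:

```python
import requests

BASE = "https://crowd.example.org/api"  # hypothetical platform endpoint

# Create task: what to ask, what to pay, how much redundancy
task = requests.post(f"{BASE}/tasks", json={
    "title": "Is this paper a study on adults 75 and older?",
    "reward_usd": 0.05,
    "votes_per_item": 3,
}).json()

# Run batch: submit the items to be judged
items = [{"id": 1, "text": "Abstract of paper 1 ..."}]
requests.post(f"{BASE}/tasks/{task['id']}/batches", json={"items": items})

# Monitor / collect results
results = requests.get(f"{BASE}/tasks/{task['id']}/results").json()

# Pay: approve each completed assignment
for r in results:
    requests.post(f"{BASE}/assignments/{r['assignment_id']}/approve")
```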
16. SYSTEMATIC LITERATURE REVIEWS (SLRs)
Process
Prevalence of antepartum hemorrhage in women with placenta previa: a systematic review and meta-analysis. Dazhi Fan, Song Wu, Li Liu, Qing Xia, Wen Wang, Xiaoling Guo & Zhengping Liu. Scientific Reports, volume 7, Article number: 40320 (2017).
1. Study on adults 75 and older
2. Involves the use of interaction technology
3. Is an “intervention” (alternatively: RCT)
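Screening boils down to a conjunction of inclusion predicates like the three above: a paper stays in the review only if every criterion holds, and a single confident "no" suffices to exclude it, which is what makes cheap per-predicate crowd questions attractive. A minimal sketch:

```python
# Each predicate asks: does the paper satisfy this inclusion criterion?
PREDICATES = [
    "Is the study on adults 75 and older?",
    "Does it involve the use of interaction technology?",
    "Is it an intervention study (e.g., an RCT)?",
]

def include(answers: dict) -> bool:
    """A paper is included only if all predicates hold; any single 'no'
    excludes it. answers maps each predicate question to a bool."""
    return all(answers[q] for q in PREDICATES)
```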
17. USEFUL BUT PAINFUL…
- Millions of papers published every year
- About half of them are never cited (not even by the authors)
- Incomplete (40-70% of relevant papers missed!)
- From idea to submission: typically 9 to 36 months
- Queries repeated multiple times (6-30 months apart, sometimes 60)
- ~1/3 abandoned
Perrine Créquit, Ludovic Trinquart, Amélie Yavchitz, and Philippe Ravaud. 2016. Wasted research when systematic reviews fail to provide a complete and up-to-date evidence synthesis: the example of lung cancer. BMC Medicine 14, 1 (2016), 8.
20. CAN WE DO BETTER? CAN MACHINE LEARNING HELP?
• Help in screening (keep the same search+filter process but improve it)
• Help in finding (a different process), or live SLR
[Diagram: crowdsourcing feeds model training, which yields trained ML models]
21. ON RCTs
Wallace et al. Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach. JAMIA, 2017.
Auto-excluding items with a predicted probability of being an RCT of ≤ 0.1 gave a specificity of 99.8% and an overall recall of 98%.
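The pattern here is threshold-based triage: the model auto-excludes only the items it scores as almost surely not RCTs, and everything else goes to the crowd, trading a little specificity for near-perfect recall. A sketch of the routing rule (threshold from the slide; the scikit-learn-style classifier is an assumption):

```python
EXCLUDE_BELOW = 0.1  # predicted probability of being an RCT, from the slide

def triage(items, features, model):
    """Auto-exclude items the model scores as almost surely not RCTs;
    route everything else to crowd screening. Assumes class 1 = 'RCT'."""
    probs = model.predict_proba(features)[:, 1]
    auto_excluded = [it for it, p in zip(items, probs) if p <= EXCLUDE_BELOW]
    to_crowd = [it for it, p in zip(items, probs) if p > EXCLUDE_BELOW]
    return auto_excluded, to_crowd
```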
22. 3 OPTIONS SO FAR
• Expert analysis: the typical approach today (painful, slow, and expensive even if you don't notice it)
• Crowdsourcing: works well (speed, diversity, quality…), but at a cost
• For scientists and experts, it is hard to use
• Machine learning and classification: label, train, classify
• Works great only in some cases: a fairly “easy” problem, a very large pool
24. APPLICABILITY
• Finite pool, uniqueness of the problem: not enough items to train
• Can't get ML to the precision we need
• Or we can, but it takes time; in the meantime we lean on the crowd heavily at first, then progressively less (e.g., crisis situations)
25. ML, THEN CROWD WHEN IN DOUBT
[Pipeline: get training data → train algorithms → trained ML models → apply: machine first, then (maybe) crowd]
Works with weak algorithms for classification problems (as long as the confidence estimate is accurate).
William Callaghan et al. MechanicalHeart: A Human-Machine Framework for the Classification of Phonocardiograms. CSCW 2018.
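A minimal sketch of the "machine first, then (maybe) crowd" rule: accept the machine's label when its confidence clears a threshold, otherwise pay for crowd votes. The threshold value and the ask_crowd helper are illustrative assumptions:

```python
CONFIDENCE_THRESHOLD = 0.9  # illustrative; must be tuned per task

def classify(features, model, ask_crowd):
    """Machine first, then (maybe) crowd. Assumes a scikit-learn-style
    classifier with a reasonably calibrated predict_proba."""
    probs = model.predict_proba([features])[0]
    if probs.max() >= CONFIDENCE_THRESHOLD:
        return int(probs.argmax())  # trust the (cheap) machine
    return ask_crowd(features)      # recourse to (expensive) humans
```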
26. When the crowd is more confident than the machine in the classification of a given instance, the crowd is most often correct.
Works well only if we take the machine's input when it is very confident.
A “sprinkle” of ML helps.
William Callaghan et al. MechanicalHeart: A Human-Machine Framework for the Classification of Phonocardiograms. CSCW 2018.
27. ML AS AN ASSISTANT THAT BIASES OUR THINKING
[Pipeline: get training data → train algorithms → trained ML models → apply: machine sets a prior, crowd votes update it]
P(class | votes) = P(votes | class) · P(class) / P(votes)
The impact is on redundancy; the crowd is always asked.
Krivosheev et al. Combining Crowd and Machines for Multi-predicate Item Screening. CSCW 2018.
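Concretely, the formula above is applied with the machine's score as the prior P(class) and the crowd votes as evidence. Assuming workers answer correctly and independently with a known accuracy (a simplification; the paper's model is richer), a minimal posterior update looks like:

```python
def posterior(machine_prior: float, votes: list, worker_accuracy: float = 0.8) -> float:
    """P(class | votes) via Bayes: the machine sets the prior,
    each crowd vote multiplies in its likelihood.
    votes: True for an 'in class' vote, False otherwise."""
    p_in, p_out = machine_prior, 1.0 - machine_prior
    for v in votes:
        # A vote matches the true class with probability worker_accuracy
        p_in *= worker_accuracy if v else (1 - worker_accuracy)
        p_out *= (1 - worker_accuracy) if v else worker_accuracy
    return p_in / (p_in + p_out)  # normalization plays the role of P(votes)

# e.g. a machine prior of 0.2 and two positive votes:
# 0.2*0.8*0.8 / (0.2*0.8*0.8 + 0.8*0.2*0.2) = 0.128 / 0.160 = 0.8
```

A confident machine prior means fewer crowd votes are needed to cross a decision threshold, which is exactly where the redundancy savings come from.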
28.
- Works with weak algorithms for classification problems
- A “sprinkle” of crowd makes it right
29. EMBED CROWDS INSIDE MACHINE LEARNING ARCHITECTURES
- Explore feature spaces that are largely unreachable by automatic extraction
- Train models that use human-understandable features
Cheng and Bernstein. Flock: Hybrid Crowd-Machine Learning Classifiers. CSCW 2015.
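A sketch of the Flock pattern: the crowd nominates human-understandable features and fills in their values per item, and an ordinary classifier is trained on that matrix. The crowd-collection helper and the example features are hypothetical; the classifier is standard scikit-learn:

```python
from sklearn.linear_model import LogisticRegression

# Crowd-nominated, human-understandable features (illustrative examples)
FEATURES = ["mentions concrete details", "overly emotional tone",
            "reads like an advertisement"]

def crowd_feature_value(item, feature) -> int:
    """Hypothetical: ask workers whether the feature holds (1) or not (0)."""
    raise NotImplementedError

def train_hybrid(items, labels):
    # Feature matrix filled in by humans; learning stays fully automatic
    X = [[crowd_feature_value(it, f) for f in FEATURES] for it in items]
    return LogisticRegression().fit(X, labels)  # interpretable by design
```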
31. A 0.1 improvement in ROC AUC.
The hybrid part here is in the features; classification is automatic.
Outliers are important.
Cheng and Bernstein. Flock: Hybrid Crowd-Machine Learning Classifiers. CSCW 2015.
32. CROWD HELPS MACHINES HELP CROWD
• Bias the crowd to obtain better and faster (cheaper) responses
34. Ramirez et al. Influencing workers: The case of human-machine collaboration (in progress).
37. GENERAL FINITE POOL PROBLEM
• No clear idea of how well ML can do
• No clear idea of how well the crowd can do (not to mention task design)
• Limited items and limited budget: how to spend it?
• Kind of a meta-active learning problem, where in addition we have to learn how to learn
38. SMALL STEPS: ACTIVE HYBRID LEARNING
• Given a set of hotel descriptions, find hotels that are kid-friendly and that are near Macquarie
• We are given an ML algorithm, and a crowd or hybrid classifier
• It is a learning vs. exploitation trade-off.
39. ACTIVE HYBRID LEARNING
A restricted version of the general problem:
1. Manage the trade-off between labelling items to learn vs. labelling items to classify
2. Actively learn whether to favour ML or the crowd, and then perform active sampling
This is a MAB or RL problem (a minimal sketch follows).
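One way to cast point 2 as a bandit, sketched below: treat the labelling strategies (say, ML vs. crowd) as arms and pick by estimated payoff, with ε-greedy exploration standing in for "learning how to learn". The reward definition and ε value are illustrative assumptions:

```python
import random

class EpsilonGreedy:
    """Minimal multi-armed bandit over labelling strategies,
    e.g. arms = ['ml', 'crowd'], played within a fixed budget."""
    def __init__(self, arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {a: 0 for a in arms}
        self.values = {a: 0.0 for a in arms}  # running mean reward per arm

    def choose(self):
        if random.random() < self.epsilon:            # explore
            return random.choice(list(self.counts))
        return max(self.values, key=self.values.get)  # exploit

    def update(self, arm, reward):
        # Reward could be, e.g., classification accuracy gained per dollar
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```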
45. PROCESS
- Open call
- Training materials (on SLRs in general, and on SLRs on related topics)
- Screening task (acts also as a selection filter)
- Paper assignment and full-paper screening (also acts as a filter)
- Paper reading and “guided” paper summarization (with redundancy and metadata extraction)
- Peer “grading” (positive, like-style)
- Definition of dimensions for analysis (separate subgroups)
- Selection of group leaders (also based on volunteering)
- Brainstorming in a video call with the PI and group leaders, each presenting dimensions
- Second iteration
- Revisiting summaries of papers based on the dimensions and filling in the tables
- Cross-checking the tables
46. ASSISTED TASK DESIGN
- How to define a task
- How to train
- How (much) to test
- Pricing
- Stopping
- Optimizing task assignment to workers
- Finding task design errors early
- => Assist in the design of creative work
48. SUMMING UP…
• Combining human and machine computation has incredible potential for solving a
variety of tasks
• Get results immediately, while improving ML
• Crisis situations
• Novel versions of old problems (from SLRs to fake news to criminal activities)
• Continuously check and improve areas where ML is weak, even with human-suggested
features
• None of this is actually restricted to the “crowd”: it works with experts as well
• Move towards systems that do not require expertise, meaning that the average knowledge worker can use them