Exploiting Worker Correlation for
Label Aggregation in Crowdsourcing
Yuan Li, Benjamin Rubinstein, Trevor Cohn
Crowdsourcing
Templates provided by FigureEight (formerly CrowdFlower)
We focus on aggregating discrete labels
A matrix of worker labels (? marks a missing label). Infer the truth…

Item  Worker 1  Worker 2  Worker 3  Truth
1     T         T         ?         ?
2     T         F         ?         ?
3     T         ?         T         ?
4     T         ?         F         ?
5     ?         F         F         ?
Problem statement
Assume there are $W$ workers who classify $N$ items into $K$ categories.
Let $z_i$ be the latent true label of item $i$, $y_{ij}$ the label that worker $j$ assigns to item $i$, and $W_i$ the set of workers who have labelled item $i$.
Goal: infer the true labels $Z = \{z_i\}_{i=1}^{N}$ from the observed worker labels $Y = \{\{y_{ij}\}_{j \in W_i}\}_{i=1}^{N}$.
Outline
Introduction
Preliminaries
• Probabilistic models
Our proposed method
• Enhanced Bayesian Classifier Combination (EBCC)
Results
• Synthetic data
• Real-world data
Discussion
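Crowdsourcing
A matrix of worker labels (? marks a missing label). Infer the truth…

Item  Worker 1  Worker 2  Worker 3  Truth
1     T         T         ?         ?
2     T         F         ?         ?
3     T         ?         T         ?
4     T         ?         F         ?
5     ?         F         F         ?

Classifier combination
A matrix of classifier predictions (no missing entries). Infer the truth…

Item  Classifier 1  Classifier 2  Classifier 3  Truth
1     T             T             F             ?
2     T             F             T             ?
3     T             T             T             ?
4     T             F             F             ?
5     F             F             F             ?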
All probabilistic aggregation models define
$$p(Y, Z) = \prod_i p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i)\, p(z_i)$$
$p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i)$ is parameterized by $V$
$V$ captures worker reliability
$$p(Y, Z \mid V) = \prod_i p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i, V)\, p(z_i)$$
Independent BCC (iBCC)
Models mainly differ in how $p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i)$ is parameterized by $V$
iBCC (Kim & Ghahramani, 2012) assumes conditional independence between the $y_{ij}$'s:
$$p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i) = \prod_j p(y_{ij} \mid z_i)$$
where $v_{jkl} = p(y_{ij} = l \mid z_i = k)$
• Easy to marginalize out unobserved $y_{ij}$
• The number of parameters in $V$ is $O(WK^2)$
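The "easy to marginalize" point can be illustrated with a minimal sketch: under conditional independence, an unobserved $y_{ij}$ sums to 1 and simply drops out of the product. The function name, the array layout and the numbers below are assumptions made for illustration. Treating $V$ and the class prior as known is also a simplification, since iBCC itself places priors on them.

```python
import numpy as np

def ibcc_posterior(labels, V, prior):
    """Posterior p(z_i | observed labels) for one item, with V treated as known.

    labels: dict {worker index j: observed label l}; unlabelled workers are absent.
    V:      array (W, K, K) with V[j, k, l] = p(y_ij = l | z_i = k).
    prior:  array (K,) with prior[k] = p(z_i = k).
    """
    log_post = np.log(prior)
    for j, l in labels.items():
        # Only observed labels contribute a factor; missing ones marginalize to 1.
        log_post = log_post + np.log(V[j, :, l])
    log_post -= log_post.max()          # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical confusion matrices for W = 3 workers and K = 2 classes.
V = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.3, 0.7]],
    [[0.6, 0.4], [0.4, 0.6]],
])
prior = np.array([0.5, 0.5])

# Worker 0 says 0, worker 2 says 1, worker 1 did not label the item.
posterior = ibcc_posterior({0: 0, 2: 1}, V, prior)
```

Here the unnormalised scores are 0.5 · 0.9 · 0.4 = 0.18 for class 0 and 0.5 · 0.2 · 0.6 = 0.06 for class 1, so the posterior is (0.75, 0.25).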
Dependent BCC (dBCC)
dBCC (Kim & Ghahramani, 2012) uses a Markov field for $p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i)$ to capture the correlation between the $y_{ij}$'s:
$$p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i, V, U) = \frac{1}{C(V, U, z_i)} \exp\left( \sum_{1 \le j < j' \le W} u_{jj'y_{ij}y_{ij'}} + \sum_{j=1}^{W} v_{jz_iy_{ij}} \right)$$
However, it is intractable to marginalize out unobserved $y_{ij}$, and the number of parameters in $V$ and $U$ is $O(W^2K^2)$
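To see where the cost comes from, the sketch below evaluates the dBCC distribution by brute force. It enumerates all $K^W$ label configurations to obtain the normaliser $C(V, U, z_i)$, which is only feasible for tiny $W$. The array layouts for `U` and `V` and the random parameter values are assumptions for illustration, not the original implementation.

```python
import itertools
import numpy as np

def dbcc_joint(k, U, V):
    """Return all K**W label configurations and their probabilities given z_i = k.

    U: array (W, W, K, K); U[j, jp, a, b] is the pairwise term u_{j j' a b} (only j < jp is used).
    V: array (W, K, K);    V[j, k, a] is the per-worker term v_{j k a}.
    """
    W, K = V.shape[0], V.shape[1]
    configs = list(itertools.product(range(K), repeat=W))
    scores = []
    for y in configs:
        pair = sum(U[j, jp, y[j], y[jp]] for j in range(W) for jp in range(j + 1, W))
        single = sum(V[j, k, y[j]] for j in range(W))
        scores.append(pair + single)
    weights = np.exp(np.array(scores))
    # weights.sum() plays the role of the normaliser C(V, U, z_i).
    return configs, weights / weights.sum()

rng = np.random.default_rng(0)
W, K = 4, 2
U = rng.normal(size=(W, W, K, K))   # hypothetical pairwise parameters
V = rng.normal(size=(W, K, K))      # hypothetical per-worker parameters
configs, probs = dbcc_joint(0, U, V)
```

With all pairwise terms set to zero the distribution factorises over workers, which recovers the conditional independence structure of iBCC.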
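Worker A & B have labelled 20 items…

Class 0 (10 items):
A  B  Truth
0  0  0
0  0  0
0  0  0
0  0  0
0  0  0
0  0  0
0  0  0
0  0  0
1  0  0
0  1  0

Class 1 (10 items):
A  B  Truth
1  1  1
1  1  1
1  1  1
1  1  1
1  0  1
0  1  1
0  0  1
0  0  1
0  0  1
0  0  1

In the matrices below, rows index A's label (0, 1) and columns index B's label (0, 1).

Class 0, conditional independence assumption:
$$\begin{pmatrix} 0.8 & 0.1 \\ 0.1 & 0 \end{pmatrix} \approx \begin{pmatrix} 0.9 \\ 0.1 \end{pmatrix} \otimes \begin{pmatrix} 0.9 \\ 0.1 \end{pmatrix} = \begin{pmatrix} 0.81 & 0.09 \\ 0.09 & 0.01 \end{pmatrix}$$

Class 1, conditional independence assumption:
$$\begin{pmatrix} 0.4 & 0.1 \\ 0.1 & 0.4 \end{pmatrix} \not\approx \begin{pmatrix} 0.5 \\ 0.5 \end{pmatrix} \otimes \begin{pmatrix} 0.5 \\ 0.5 \end{pmatrix} = \begin{pmatrix} 0.25 & 0.25 \\ 0.25 & 0.25 \end{pmatrix}$$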
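The two comparisons can be checked numerically with a short sketch: build the empirical joint distribution of (A's label, B's label) from the 10 items of each class, then compare it with the outer product of its marginals, which is what conditional independence predicts. The helper name is an assumption for illustration.

```python
import numpy as np

def joint_and_rank1(pairs):
    """Empirical joint of (A's label, B's label) and the outer product of its marginals."""
    joint = np.zeros((2, 2))
    for a, b in pairs:
        joint[a, b] += 1
    joint /= joint.sum()
    rank1 = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    return joint, rank1

# (A, B) label pairs copied from the two tables of 10 items each.
class0 = [(0, 0)] * 8 + [(1, 0), (0, 1)]
class1 = [(1, 1)] * 4 + [(1, 0), (0, 1)] + [(0, 0)] * 4

joint0, rank1_0 = joint_and_rank1(class0)
joint1, rank1_1 = joint_and_rank1(class1)

# Largest entry-wise gap between the joint and its independent approximation.
err0 = np.abs(joint0 - rank1_0).max()
err1 = np.abs(joint1 - rank1_1).max()
```

The largest gap is 0.01 for class 0 and 0.15 for class 1, which matches the ≈ and ≉ on the slide.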
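Use 2 rank-1 matrices…
$$\begin{pmatrix} 0.4 & 0.1 \\ 0.1 & 0.4 \end{pmatrix} \not\approx \begin{pmatrix} 0.5 \\ 0.5 \end{pmatrix} \otimes \begin{pmatrix} 0.5 \\ 0.5 \end{pmatrix} = \begin{pmatrix} 0.25 & 0.25 \\ 0.25 & 0.25 \end{pmatrix}$$
$$\begin{pmatrix} 0.4 & 0.1 \\ 0.1 & 0.4 \end{pmatrix} \approx \frac{1}{2} \begin{pmatrix} 0.1 \\ 0.9 \end{pmatrix} \otimes \begin{pmatrix} 0.1 \\ 0.9 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 0.9 \\ 0.1 \end{pmatrix} \otimes \begin{pmatrix} 0.9 \\ 0.1 \end{pmatrix} = \begin{pmatrix} 0.41 & 0.09 \\ 0.09 & 0.41 \end{pmatrix}$$
There are 2 subtypes for class-1 items
• First term: half of the time, A & B both have 90% accuracy. They are labelling easy items.
• Second term: half of the time, A & B both have 10% accuracy. They are labelling hard items.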
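As a quick numerical check of the two-component approximation, the sketch below forms the equal-weight mixture of the "easy" and "hard" rank-1 matrices and compares it with the class-1 joint distribution. The variable names are chosen for illustration only.

```python
import numpy as np

# Class-1 joint distribution of (A's label, B's label) from the example.
target = np.array([[0.4, 0.1],
                   [0.1, 0.4]])

# Per-subtype label distributions, ordered as (p(label 0), p(label 1)).
easy = np.array([0.1, 0.9])   # 90% accuracy on class-1 items
hard = np.array([0.9, 0.1])   # 10% accuracy on class-1 items

# Equal-weight mixture of two rank-1 matrices.
mixture = 0.5 * np.outer(easy, easy) + 0.5 * np.outer(hard, hard)

# Single rank-1 matrix built from the marginals, for comparison.
single = np.outer(target.sum(axis=1), target.sum(axis=0))

err_mixture = np.abs(target - mixture).max()
err_single = np.abs(target - single).max()
```

The mixture has rank 2 and is within 0.01 of the target in every entry, whereas the single rank-1 matrix is off by 0.15.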
Worker A & B have labelled 20 items…

Items with truth 0:
A  B  Truth
0  0  0
0  0  0
0  0  0
0  0  0
0  0  0
0  0  0
0  0  0
0  0  0
1  0  0
0  1  0

Items with truth 1:
A  B  Truth  Subtype
1  1  1      Easy
1  1  1      Easy
1  1  1      Easy
1  1  1      Easy
1  0  1      ½ E + ½ H
0  1  1      ½ E + ½ H
0  0  1      Hard
0  0  1      Hard
0  0  1      Hard
0  0  1      Hard

Confusion matrix on class-level:
Truth  Worker label 0  Worker label 1
0      0.9             0.1
1      0.5             0.5

Confusion matrix on subtype-level:
Truth  Subtype  Worker label 0  Worker label 1
0      -        0.9             0.1
1      Easy     0.1             0.9
1      Hard     0.9             0.1
In general, tensor rank decomposition can be used for modelling $p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i)$
$$p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i = k) = \sum_m \pi_{km} \cdot \vec{v}_{1km} \otimes \vec{v}_{2km} \otimes \cdots \otimes \vec{v}_{Wkm}$$
Or in a probabilistic way
$$p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i) = \sum_m p(g_i = m \mid z_i) \cdot \prod_j p(y_{ij} \mid z_i, g_i = m)$$
$m = 1 \dots M$, where $M$ is the number of subtypes per class
$g_i$ is the subtype of item $i$ under its class $z_i$
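A minimal sketch of the mixture form above: for each class, multiply the subtype weight by the per-worker label probabilities of the observed labels, then sum over subtypes. The function name, the array layouts and the choice to encode class 0 as a single active subtype are assumptions for illustration. The numbers reuse the two-worker example.

```python
import numpy as np

def ebcc_class_likelihood(labels, pi, V):
    """p(observed labels | z_i = k) for every class k, summing over subtypes m.

    labels: dict {worker index j: observed label l}; unlabelled workers are absent.
    pi:     array (K, M) with pi[k, m] = p(g_i = m | z_i = k).
    V:      array (W, K, M, K) with V[j, k, m, l] = p(y_ij = l | z_i = k, g_i = m).
    """
    lik = pi.copy()
    for j, l in labels.items():
        lik = lik * V[j, :, :, l]   # factor for worker j under each (class, subtype)
    return lik.sum(axis=1)          # marginalise the subtype g_i

# Class 0 uses one subtype; class 1 has an easy and a hard subtype with equal weight.
pi = np.array([[1.0, 0.0],
               [0.5, 0.5]])
per_worker = np.array([
    [[0.9, 0.1], [0.9, 0.1]],   # class 0: both subtype slots share one distribution
    [[0.1, 0.9], [0.9, 0.1]],   # class 1: easy, hard
])
V = np.stack([per_worker, per_worker])   # workers A and B behave identically

lik_both_say_1 = ebcc_class_likelihood({0: 1, 1: 1}, pi, V)
lik_only_a_says_1 = ebcc_class_likelihood({0: 1}, pi, V)
```

When both workers answer 1 the class likelihoods are (0.01, 0.41), matching the corresponding entries of the matrices in the example. With only A's label observed they are (0.1, 0.5), so a missing label is marginalized as easily as in iBCC.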
Enhanced BCC
More details about EBCC
• Inference uses mean-field variational Bayes
• Run inference multiple times, then select the solution with the highest evidence lower bound (ELBO)
• Use Dir(0.1) as the prior for $\pi_k$ to encourage the model to use fewer subtypes
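The effect of a small Dirichlet concentration can be illustrated by sampling from the prior alone. This is only a sketch of the prior's behaviour, not of the variational inference. The comparison value 10.0, the threshold 0.05 and the sample count are arbitrary choices for illustration. $M = 10$ matches the setting reported in the results.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 10  # number of subtypes per class

# Draws of the subtype weights pi_k under a sparse and a dense symmetric Dirichlet prior.
sparse = rng.dirichlet(np.full(M, 0.1), size=1000)
dense = rng.dirichlet(np.full(M, 10.0), size=1000)

def mean_active(samples, threshold=0.05):
    """Average number of subtypes whose weight exceeds the threshold."""
    return (samples > threshold).sum(axis=1).mean()

sparse_active = mean_active(sparse)
dense_active = mean_active(dense)
```

Draws from Dir(0.1) concentrate their mass on a few subtypes, so `sparse_active` comes out well below `dense_active`.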
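Different ways to model $p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i)$
independent BCC (iBCC)
$$p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i) = \prod_j p(y_{ij} \mid z_i)$$
dependent BCC (dBCC)
$$p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i, V, U) = \frac{1}{C(V, U, z_i)} \exp\left( \sum_{1 \le j < j' \le W} u_{jj'y_{ij}y_{ij'}} + \sum_{j=1}^{W} v_{jz_iy_{ij}} \right)$$
Our proposed enhanced BCC (EBCC)
$$p(y_{i1}, y_{i2}, \dots, y_{iW} \mid z_i) = \sum_m p(g_i = m \mid z_i) \cdot \prod_j p(y_{ij} \mid z_i, g_i = m)$$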
Synthetic data
• Binary task, 2 subtypes per class
• 5 workers, subtype-level worker accuracies shown in the table
• All workers have labelled all items
• Subtypes evenly distributed
• Worker labels randomly generated according to their confusion matrices

Truth  Subtype  W1   W2   W3   W4   W5   %
0      0        0.9  0.9  0.7  0.7  0.7  25%
0      1        0.1  0.1  0.7  0.7  0.7  25%
1      0        0.9  0.1  0.7  0.7  0.7  25%
1      1        0.1  0.9  0.7  0.7  0.7  25%
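The setup in the table can be sketched as a small generator. The number of items and the seed are hypothetical, since the slides do not state them. Drawing truth and subtype uniformly at random gives the 25% split only in expectation. "Accuracy" is read as the probability that a worker's label equals the truth.

```python
import numpy as np

# ACCURACY[truth, subtype, worker], copied from the table.
ACCURACY = np.array([
    [[0.9, 0.9, 0.7, 0.7, 0.7],    # truth 0, subtype 0
     [0.1, 0.1, 0.7, 0.7, 0.7]],   # truth 0, subtype 1
    [[0.9, 0.1, 0.7, 0.7, 0.7],    # truth 1, subtype 0
     [0.1, 0.9, 0.7, 0.7, 0.7]],   # truth 1, subtype 1
])

def generate(n_items, seed=0):
    """Sample truths, subtypes and a full (n_items, 5) matrix of binary worker labels."""
    rng = np.random.default_rng(seed)
    truth = rng.integers(0, 2, size=n_items)
    subtype = rng.integers(0, 2, size=n_items)
    acc = ACCURACY[truth, subtype]                  # shape (n_items, 5)
    correct = rng.random(acc.shape) < acc           # does each worker answer correctly?
    labels = np.where(correct, truth[:, None], 1 - truth[:, None])
    return truth, subtype, labels

truth, subtype, labels = generate(20000)            # 20000 items is an arbitrary choice
w3_accuracy = (labels[:, 2] == truth).mean()
```

Workers W1 and W2 are right or wrong together on class-0 items and disagree on most class-1 items, which is the kind of correlation a single confusion matrix per worker cannot express.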
Results on 17 real-world datasets
17 datasets
The union of three crowdsourcing dataset collections:
• Venanzi et al. (2015, AAAI) (8 datasets)
• Zheng et al. (2017, VLDB) (7 datasets)
• Zhang et al. (2014, NIPS) (5 datasets)
3 datasets are shared between the last two collections, so 8 + 7 + 5 - 3 = 17.

10 benchmarks
• MV (majority voting)
• ZenCrowd (Demartini et al., 2012, WWW)
• GLAD (Whitehill et al., 2009, NIPS)
• DS (Dawid & Skene, 1979)
• Minimax (Zhou et al., 2012, JMLR)
• iBCC (Kim & Ghahramani, 2012, AISTATS)
• CBCC (Venanzi et al., 2014, WWW)
• LFC (Raykar et al., 2010, JMLR)
• CATD (Li et al., 2014, VLDB)
• CRH (Aydin et al., 2014, AAAI)
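For reference, the simplest benchmark in the list, majority voting, can be sketched in a few lines. Encoding missing labels as -1 and breaking ties towards the smallest class index are assumptions for illustration. The example matrix is the 5 × 3 worker-label matrix from the introduction with T = 1 and F = 0.

```python
import numpy as np

def majority_vote(labels, n_classes):
    """Predict one label per item from an (N, W) integer matrix; -1 marks a missing label.

    Ties are broken towards the smallest class index (a property of argmax).
    """
    counts = np.stack([(labels == k).sum(axis=1) for k in range(n_classes)], axis=1)
    return counts.argmax(axis=1)

worker_labels = np.array([
    [1, 1, -1],
    [1, 0, -1],
    [1, -1, 1],
    [1, -1, 0],
    [-1, 0, 0],
])
predicted = majority_vote(worker_labels, n_classes=2)
```

Items 2 and 4 receive one vote for each class, so the result for them depends entirely on the tie-breaking rule. Probabilistic models address this by weighting workers by their estimated reliability.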
Results on 17 real-world datasets
• EBCC (M = 10) has the highest mean accuracy, 84.5%
• The best existing method, iBCC-MF, reaches 83.4%
• Confusion-matrix-based models (DS, iBCC, CBCC, LFC) perform similarly, with mean accuracy in the range [82.9%, 83.4%]
• They are followed by "1-coin" models, namely CATD (82.8%), GLAD (82.3%) and ZenCrowd (ZC, 82.2%)
• Minimax and CRH fail catastrophically on a few datasets
Discussion
Limitations
• EBCC performs worse than other methods on very noisy datasets
• Our mean-field VB implementation may get stuck in local optima
Thank you
Poster today @ Pacific Ballroom #240
