Crowd Teaching with Imperfect Labels
Yao Zhou1, Arun Reddy Nelakurthi2, Ross Maciejewski3, Wei Fan4, Jingrui He1
1University of Illinois at Urbana Champaign, 2Samsung Research America,
3Arizona State University, 4Tencent America
- 2 -
The Surge of Crowdsourcing
q Need label information for training (semi-)supervised ML models
q Huge demand exists for fine-grained label information in real-world applications
o Fine-grained segmentation and localization in CV problems.
o Downstream finetuning tasks in NLP problems.
o AI-assisted medical image and signal diagnosis problems.
Jacob Devlin, et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. EMNLP 2018
https://appen.com/blog/computer-vision-vs-machine-vision/
Computer Vision Natural Language Processing
- 3 -
Fine-Grained Crowdsourcing
q What we have and what we need:
o Not annotating data from scratch.
o Coarsely labeled data are available, but not perfect (they contain incorrect labels).
o Objectives:
• Leverage label information from amateur workers to improve label quality.
• Teach crowd workers with imperfect labels and improve their labeling expertise.
q Existing solutions
o Conventional crowdsourcing models, see [1, 2, 3]
o Crowd teaching models, see [4, 5, 6]
- 5 -
Fine-Grained Crowdsourcing
q Issues with existing solutions
o Conventional crowdsourcing models focus only on label quality:
• Downweight the weak workers & trust the good workers.
• Motivate the workers to convey their knowledge by designing good incentive systems.
Yao Zhou, et al., MultiC2: an Optimization Framework for Learning from Task and Worker Dual Heterogeneity. SDM 2016
Nihar B. Shah, et al., Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing. NIPS 2015.
Important fact being ignored: human beings are good at learning a concept & transferring the learned concept to similar learning tasks.
- 7 -
Fine-Grained Crowdsourcing
q Issues with existing solutions
o Crowd teaching models focus only on the labeling expertise of crowd workers:
• Require a perfectly labeled teaching set (e.g., hypothesis transition teaching models).
• Require a perfect target concept (e.g., iterative teaching models).
• Lack explanations for the teaching samples.
Crowd teaching with perfect teaching set | Iterative crowd teaching with perfect target concept
Adish Singla, et al., Near-Optimally Teaching the Crowd to Classify. ICML 2014
Yao Zhou, et al., Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners. KDD 2018
- 8 -
Fine-Grained Crowdsourcing
q Research question
o Can we simultaneously improve both the label quality and the workers' labeling expertise in a unified framework?
- 9 -
Roadmap
q Introduction
o Problem setting
o Overview of the framework
q The proposed framework
o Learner model
o Explanation difficulty
o Teacher model
q Extensions
o Imperfect teaching with surrogate cost
o Curriculum learning with label influence
q Experiments
o Quantitative evaluation
o Qualitative evaluation
q Conclusion
- 11 -
Problem Setting
q Given:
o An imperfectly labeled data set (a mixture of correct and incorrect labels).
o An unlabeled data set.
o A prediction model (e.g., a classification model).
q Objective I: produce a higher-quality labeled data set that contains
o Items from the labeled set that are re-labeled and verified by workers.
o Items from the labeled set whose labels stay untouched.
o Items from the unlabeled set that receive new labels from workers.
q Objective II: use the imperfect prediction model as the teacher to teach the workers to label, with personalized teaching sequences and visual explanations.
- 12 -
Overview of the Framework
q Interactions between teacher and learner
o First, the teacher recommends and shows an item to the learner, who provides an initial label.
o Second, the teacher shows its probabilistic prediction and visual explanation; the learner may update her label and choose a trusted explanation.
o Third, a masked explanation is shown to the learner to gauge her confidence; only high-confidence labels are recorded.
- 14 -
Adaptive Teaching and Learning
q Learner model
o The learners use a gradient-based learning procedure to improve their concepts iteratively.
o Each learner has an exponentially decayed retrievability of memory.
o The memory momentum can be rewritten in closed form when its initial value is set to zero.
Notation on the slide: the learner's concept at the t-th teaching iteration, the explanation difficulty coefficient, the learner's memory momentum at the t-th iteration, the memory decay rate, the index of the teaching item in the t-th iteration, and the learning loss of the learner.
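The update equations themselves were images on the slide and are lost here. As a rough sketch, a gradient-based learner with exponentially decayed memory momentum might look like the following (the logistic loss, the decay form, and all names are assumptions, not the paper's exact formulation):

```python
import numpy as np

def learner_step(w, x, y, eta=0.1, gamma=0.5, m=None):
    """One teaching iteration of a gradient-based learner.

    w      : learner's current concept (weight vector)
    x, y   : teaching item and its label, y in {-1, +1}
    eta    : learning rate
    gamma  : memory decay rate (older gradients fade exponentially)
    m      : memory momentum carried over from earlier iterations
    """
    if m is None:
        m = np.zeros_like(w)
    # gradient of the logistic loss on the shown item
    grad = -y * x / (1.0 + np.exp(y * np.dot(w, x)))
    # exponentially decayed memory: momentum mixes old and new gradients
    m = gamma * m + (1.0 - gamma) * grad
    return w - eta * m, m
```

Repeatedly calling `learner_step` with teacher-chosen items moves the learner's concept while older teaching items gradually lose influence, which is the behavior the memory momentum is meant to capture.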
- 15 -
Adaptive Teaching and Learning
q Explanation difficulty
o An item (image or text) with a smaller attention region (a small area of pixels or a few key words) is easier to interpret.
o Explanation difficulty is defined as the entropy of the generated explanation.
o The explanation re-scaling coefficient is defined so that it is greater than 1 whenever an item has a non-uniform visual explanation.
Example: an image is explained as a "domestic" cat by highlighting the facial area.
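Concretely, these two quantities can be instantiated as below. The exact formulas in the paper may differ; in particular, the ratio-to-uniform form of the re-scaling coefficient is an assumption chosen to match the slide's ">= 1, non-uniform implies > 1" behavior:

```python
import numpy as np

def explanation_difficulty(saliency):
    """Entropy of a saliency map treated as a probability distribution."""
    p = np.asarray(saliency, dtype=float).ravel()
    p = p / p.sum()
    p = p[p > 0]                      # convention: 0 * log 0 = 0
    return float(-np.sum(p * np.log(p)))

def rescaling_coefficient(saliency):
    """Assumed form: uniform (maximum) entropy divided by the map's
    entropy, so the coefficient equals 1 for a uniform explanation
    and exceeds 1 for a non-uniform one."""
    n = np.asarray(saliency).size
    return np.log(n) / explanation_difficulty(saliency)
```

A saliency map concentrated on a small facial region has low entropy, so its coefficient is large, marking it as easy to interpret.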
- 16 -
q Teacher model
o The teacher maximizes the learner's speed of convergence by minimizing the distance between the learner's current concept and the teacher's empirical target concept over two consecutive iterations.
o The empirical target concept is obtained by minimizing the teacher's cost function on the imperfect data.
Adaptive Teaching and Learning
- 17 -
q Teacher model
o In each iteration, the teacher recommends a teaching item together with its explanation. Teaching then becomes a pool-based search problem.
o The candidate pool includes both the labeled and the unlabeled sets; thus, a worker can either re-label an item or give a new label to an item.
Adaptive Teaching and Learning
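The pool-based search can be sketched as simulating the learner's next step for every candidate and keeping the one that lands closest to the empirical target concept. This is a simplification: the memory momentum and explanation terms of the full objective are omitted, and the logistic learner is an assumption:

```python
import numpy as np

def recommend_item(w_learner, w_target, pool, eta=0.1):
    """Pool-based teaching: simulate a one-step learner update for each
    candidate (x, y) and return the index whose resulting concept is
    closest to the teacher's empirical target concept."""
    best_idx, best_dist = None, float("inf")
    for i, (x, y) in enumerate(pool):
        grad = -y * x / (1.0 + np.exp(y * np.dot(w_learner, x)))
        w_next = w_learner - eta * grad
        dist = np.linalg.norm(w_next - w_target)
        if dist < best_dist:
            best_idx, best_dist = i, dist
    return best_idx
```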
- 19 -
q Make the teaching model better.
o Is the teacher good enough to teach if it uses the empirical target concept learned from the imperfect data set?
ü Yes, it is, if a surrogate cost function is used instead of the original cost function.
o Since the data set is imperfect, the aggregated labels of items will usually include errors. We define the class-conditional error rate of the labels and use it to construct the surrogate cost function.
Adaptive Teaching and Learning
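The surrogate cost follows the method of unbiased estimators from Natarajan et al. (cited on the next slide): each loss evaluation on a possibly noisy label is replaced by a weighted combination of the loss on the observed label and on its flip. A sketch with the logistic loss (function names are illustrative):

```python
import math

def logistic_loss(margin):
    """Plain logistic loss log(1 + exp(-margin))."""
    return math.log(1.0 + math.exp(-margin))

def surrogate_loss(score, y, rho_pos, rho_neg):
    """Unbiased surrogate of the logistic loss under class-conditional
    label noise (method of unbiased estimators, Natarajan et al. 2013).
    rho_pos: flip probability for true +1; rho_neg: for true -1."""
    rho_y = rho_pos if y == 1 else rho_neg
    rho_flip = rho_neg if y == 1 else rho_pos
    return ((1.0 - rho_flip) * logistic_loss(y * score)
            - rho_y * logistic_loss(-y * score)) / (1.0 - rho_pos - rho_neg)
```

Averaging the surrogate over the noise distribution recovers the clean loss, which is exactly why minimizing it on imperfect labels still targets the true concept.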
- 20 -
q Make the teaching model better.
o How good is this empirical target concept if the surrogate cost function is used?
§ The empirical risk is bounded, with high probability, by the risk under the unknown ground-truth concept.
o Can this bound become tighter?
ü Yes, it can, if a condition on the error rate of learner-provided labels and the learners' confidence in the true class label is satisfied.
Nagarajan Natarajan, et al. Learning with Noisy Labels. NIPS 2013.
Adaptive Teaching and Learning
- 21 -
q Make the teaching model better.
o The error rate of labeled items will decrease as learners provide confident labels; the empirical risk then gradually gets a smaller upper bound and approaches the risk under the true target concept.
Adaptive Teaching and Learning
- 22 -
Crowd Teaching with Imperfect Labels
q Make the teaching model better (continued)
o Is the teaching sequence good enough to fit the natural learning trend of human learners?
Not really! Not all items have equal influence on the prediction model.
o Teaching should follow the principle of curriculum learning, i.e., the teaching sequence should range from easy to difficult.
§ Easy items have small influence on the model since they are usually data points with a large margin in feature space.
§ Difficult items have large influence because changing their labels would have a large impact on the behavior of the prediction model.
- 23 -
q Make the teaching model better (continued)
o The teaching sequence should have increasing influence, where influence is defined as the change in the model's prediction w.r.t. a label perturbation.
§ An item with its label: z = (x, y).
§ The same item with its label perturbed: z′ = (x, −y).
o The empirical risk minimizer after replacing a small mass ε of the original item z with z′ gives the perturbed parameters.
o The influence of upweighting item z with small mass ε is defined via influence functions.
Pang Wei Koh, Percy Liang. Understanding Black-box Predictions via Influence Functions. ICML 2017
Adaptive Teaching and Learning
- 24 -
q Make the teaching model better (continued)
o The parameter change from perturbing label y is given as the difference between the influence of upweighting z′ and the influence of upweighting z.
o With some simplification (a first-order approximation and the chain rule), we obtain the perturbed loss on z_test caused by changing label y.
Adaptive Teaching and Learning
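Following the cited Koh & Liang recipe, the perturbed test loss can be approximated by pushing the gradient difference through the inverse Hessian. A minimal logistic-regression sketch (the L2 regularizer and all names are assumptions):

```python
import numpy as np

def grad_logistic(w, x, y):
    """Gradient of log(1 + exp(-y w.x)) w.r.t. w."""
    return -y * x / (1.0 + np.exp(y * np.dot(w, x)))

def hessian_emp_risk(w, X, lam=0.01):
    """Hessian of the L2-regularized logistic empirical risk at w."""
    H = lam * np.eye(len(w))
    for x in X:
        s = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
        H += s * (1.0 - s) * np.outer(x, x) / len(X)
    return H

def label_perturbation_influence(w, X, z, z_test, lam=0.01):
    """Approximate change of the loss at z_test when z's label is
    flipped: -grad_test^T H^{-1} (grad(z') - grad(z))."""
    (x, y), (x_t, y_t) = z, z_test
    H_inv = np.linalg.inv(hessian_emp_risk(w, X, lam))
    dgrad = grad_logistic(w, x, -y) - grad_logistic(w, x, y)
    return float(-grad_logistic(w, x_t, y_t) @ H_inv @ dgrad)
```

For large models one would use Hessian-vector products instead of an explicit inverse; the explicit form is kept here only for clarity.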
- 25 -
q Make the teaching model better (continued)
o The influence of a label perturbation has two cases:
§ Positive influence (perturb y from -1 to +1).
§ Negative influence (perturb y from +1 to -1).
o The influence score of any item z_test is the absolute value of its influence.
o The overall goal of teaching becomes selecting a sequence of items that are both effective for concept learning and have increasing influence. Each selected item is scored by combining the teaching score from the objective in Eq. (11) with the influence score.
Adaptive Teaching and Learning
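A toy version of this selection rule is sketched below. The actual combination of the Eq. (11) teaching score and the influence score is not reproduced here; the non-decreasing-influence filter and the fallback are assumptions:

```python
def pick_next(candidates, last_influence):
    """candidates: list of (teaching_score, influence_score) pairs.
    Keep the influence curriculum non-decreasing, then maximize the
    teaching score among the feasible candidates."""
    feasible = [c for c in candidates if c[1] >= last_influence]
    if not feasible:
        feasible = candidates  # fall back rather than stall the teaching
    return max(feasible, key=lambda c: c[0])
```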
- 26 -
Discussion and Extensions
q A worker (learner) has confidence on only one category of items
o Suppose one worker is confident only on the positive class, i.e., she only gives confident positive labels.
o In this case, c+1 > 0 and c-1 = 0, and Theorem 4.1 can be easily satisfied. Therefore, the updated label set still leads to a better prediction model.
q Teaching with starving prevention.
o If repeated labeling is allowed, the overall teaching score can stay high for certain items; the low-score items will be starved and never recommended.
o The influence intensity is updated using the entropy of item xi's label set:
§ A low-entropy label set (e.g., all collected labels agree) downgrades the influence score faster than a high-entropy label set (e.g., the collected labels conflict).
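One way to realize this decay is sketched below; only the entropy-of-the-label-set idea comes from the slide, while the exponential form and the base are assumptions:

```python
import math
from collections import Counter

def label_set_entropy(labels):
    """Shannon entropy of an item's collected label multiset."""
    counts = Counter(labels)
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in counts.values())

def decayed_influence(influence, labels, base=0.5):
    """Assumed decay rule: agreeing (low-entropy) label sets shrink the
    influence score quickly, conflicting (high-entropy) ones barely."""
    h_max = math.log(2)               # binary labels
    h = label_set_entropy(labels)
    return influence * base ** (len(labels) * (1.0 - h / h_max))
```

An item labeled +1 four times in a row is strongly suppressed, while an item with an evenly split label set keeps its influence and stays eligible for recommendation.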
- 28 -
Experiments
q Data sets
o Cat images (domestic vs. wild)
o Canidae images (domestic vs. wild)
o Text documents (comp. vs. sci.)
q Features
o Image features are taken from a fine-tuned ResNet34's penultimate layer.
o Text features are TF-IDF vectors after standard document preprocessing.
q Explanations
o Image explanations are saliency maps generated by Grad-CAM.
o Text explanations are highlighted words generated by LIME.
q Class-conditional error
o Labels are randomly flipped with a fixed error rate.
q Human learners
o 61 trials; each learner is assigned one teaching algorithm using round-robin.
Ramprasaath Selvaraju, et al., Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. ICCV 2017
Marco Ribeiro, et al., Why Should I Trust You? Explaining the Predictions of Any Classifier. KDD 2016
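The label-noise injection used in the experiments can be reproduced in a few lines (the seed and the helper name are, of course, illustrative):

```python
import random

def flip_labels(labels, error_rate, seed=0):
    """Inject class-conditional noise: independently flip each binary
    label (in {-1, +1}) with the given fixed error rate."""
    rng = random.Random(seed)
    return [-y if rng.random() < error_rate else y for y in labels]
```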
- 29 -
q Teaching interface (image)
Experiments
- 30 -
q Teaching interface (text)
Experiments
- 31 -
q Quantitative results
o Teaching gain: labeling accuracy after teaching minus labeling accuracy before teaching.
JEDI: interactive teaching w/o explanation.
VADER-lite: removes confidence gauging in the 3rd step of VADER.
VADER: our proposed three-step teaching model.
Yao Zhou, et al., Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners, KDD 2018
Experiments
- 32 -
q Quantitative results
o Label retrieval rate: the fraction of items with incorrect labels that have been corrected after teaching.
Teacher: the initial prediction model.
Worker Initial: the workers before receiving any teaching.
JEDI: interactive teaching w/o explanation.
VADER-lite: removes confidence gauging in the 3rd step of VADER.
VADER: our proposed three-step teaching model.
The improvement over VADER-lite is due to confidence gauging.
Experiments
- 33 -
q Quantitative results
o Model performance: retrain the model after teaching and compare the retrained prediction model (using the MMCE-aggregated labels) with the teacher's performance.
Cat and Text show improved model performance due to their high label retrieval rates.
Canidae shows barely any improvement in model performance because of its low label retrieval rate; the label perturbation influence is not high enough.
Dengyong Zhou, John Platt, Sumit Basu, and Yi Mao. Learning from the Wisdom of Crowds by Minimax Entropy. NIPS, 2012.
Experiments
- 34 -
q Qualitative results
o Results of the influence score. The top row shows high-influence Canidae images and the bottom row shows low-influence Canidae images. Each image is described by a tuple of its true category and an error indicator (1: label error, 0: no label error).
Experiments
- 35 -
Conclusion
q VADER model
ü Interactive learning and teaching.
ü Simultaneously improves label quality of data & labeling expertise of workers.
ü Does not require a perfect teaching set or a perfect target concept.
ü Provides human-interpretable explanations.
ü Establishes theoretical connections between teaching and explanation.
- 36 -
1. Dengyong Zhou, John Platt, Sumit Basu, Yi Mao. Learning from the Wisdom of Crowds by Minimax Entropy. NIPS, 2012.
2. Vikas Raykar, Shipeng Yu, Linda H. Zhao, Gerardo Valadez, Charles Florin, Luca Bogoni, Linda Moy. Learning From Crowds. JMLR 2010.
3. Yao Zhou, Jingrui He. Crowdsourcing via Tensor Augmentation and Completion. IJCAI 2016.
4. Adish Singla, Ilija Bogunovic, Gábor Bartók, Amin Karbasi, Andreas Krause. Near-Optimally Teaching the Crowd to Classify. ICML 2014.
5. Edward Johns, Oisin Mac Aodha, Gabriel J. Brostow. Becoming the Expert - Interactive Multi-Class Machine Teaching. CVPR 2015.
6. Yao Zhou, Arun Reddy Nelakurthi, Jingrui He. Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners. KDD 2018.
7. Yao Zhou, Lei Ying, Jingrui He. MultiC2: an Optimization Framework for Learning from Task and Worker Dual Heterogeneity. SDM 2016.
8. Nihar B. Shah, Dengyong Zhou. Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing. NIPS 2015.
9. Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep Ravikumar, Ambuj Tewari. Learning with Noisy Labels. NIPS 2013.
10. Pang Wei Koh, Percy Liang. Understanding Black-box Predictions via Influence Functions. ICML 2017.
11. Ramprasaath Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. ICCV 2017.
12. Marco Ribeiro, Sameer Singh, Carlos Guestrin. Why Should I Trust You? Explaining the Predictions of Any Classifier. KDD 2016.
References
- 37 -
Thank You!
Email me via yaozhou3@Illinois.edu if you have any questions!

More Related Content

Similar to Crowd Teaching with Imperfect Labels

Jovielyn_R._Gerante_BSED3_ENGLISH.docx.pdf
Jovielyn_R._Gerante_BSED3_ENGLISH.docx.pdfJovielyn_R._Gerante_BSED3_ENGLISH.docx.pdf
Jovielyn_R._Gerante_BSED3_ENGLISH.docx.pdf
MichaelClydeGurdiel
 
Increasing Grade 9 Students TLE MPS using a Blended Approach with Manychat
Increasing Grade 9 Students TLE MPS using a Blended Approach with ManychatIncreasing Grade 9 Students TLE MPS using a Blended Approach with Manychat
Increasing Grade 9 Students TLE MPS using a Blended Approach with Manychat
ijtsrd
 
Performance Assessment
Performance AssessmentPerformance Assessment
Performance Assessment
Marsha Ratzel
 
Ch. 11 designing and conducting formative evaluations
Ch. 11 designing and conducting formative evaluationsCh. 11 designing and conducting formative evaluations
Ch. 11 designing and conducting formative evaluations
EzraGray1
 

Similar to Crowd Teaching with Imperfect Labels (20)

Providing the Spark for CCSS
Providing the Spark for CCSSProviding the Spark for CCSS
Providing the Spark for CCSS
 
C@n do nov 2013 priorities
C@n do nov 2013 prioritiesC@n do nov 2013 priorities
C@n do nov 2013 priorities
 
Jovielyn_R._Gerante_BSED3_ENGLISH.docx.pdf
Jovielyn_R._Gerante_BSED3_ENGLISH.docx.pdfJovielyn_R._Gerante_BSED3_ENGLISH.docx.pdf
Jovielyn_R._Gerante_BSED3_ENGLISH.docx.pdf
 
A cluster-based analysis to diagnose students’ learning achievements
A cluster-based analysis to diagnose students’ learning achievementsA cluster-based analysis to diagnose students’ learning achievements
A cluster-based analysis to diagnose students’ learning achievements
 
Documenting Your Teaching Efforts in a Way that Counts
Documenting Your Teaching Efforts in a Way that CountsDocumenting Your Teaching Efforts in a Way that Counts
Documenting Your Teaching Efforts in a Way that Counts
 
How to Track your students with End of the Year Data by Noble Newman
How to Track your students with End of the Year Data by Noble NewmanHow to Track your students with End of the Year Data by Noble Newman
How to Track your students with End of the Year Data by Noble Newman
 
Increasing Grade 9 Students TLE MPS using a Blended Approach with Manychat
Increasing Grade 9 Students TLE MPS using a Blended Approach with ManychatIncreasing Grade 9 Students TLE MPS using a Blended Approach with Manychat
Increasing Grade 9 Students TLE MPS using a Blended Approach with Manychat
 
Ar cgi don
Ar cgi donAr cgi don
Ar cgi don
 
National Professional standards for Teachers in Pakistan
National Professional standards for Teachers in PakistanNational Professional standards for Teachers in Pakistan
National Professional standards for Teachers in Pakistan
 
Wsl005 006
Wsl005 006Wsl005 006
Wsl005 006
 
Proposal project paper
Proposal project paperProposal project paper
Proposal project paper
 
Planning
PlanningPlanning
Planning
 
ID Unit Report 3
ID Unit Report 3ID Unit Report 3
ID Unit Report 3
 
3 1-wesam
3 1-wesam3 1-wesam
3 1-wesam
 
HE Academy Leadership Seminar 21 Jan 2014 - Open Online Learning
HE Academy Leadership Seminar 21 Jan 2014 - Open Online LearningHE Academy Leadership Seminar 21 Jan 2014 - Open Online Learning
HE Academy Leadership Seminar 21 Jan 2014 - Open Online Learning
 
Academics Qpi
Academics QpiAcademics Qpi
Academics Qpi
 
Performance Assessment
Performance AssessmentPerformance Assessment
Performance Assessment
 
Defining Adaptive Learning Technology
Defining Adaptive Learning TechnologyDefining Adaptive Learning Technology
Defining Adaptive Learning Technology
 
Targeted resume study
Targeted resume studyTargeted resume study
Targeted resume study
 
Ch. 11 designing and conducting formative evaluations
Ch. 11 designing and conducting formative evaluationsCh. 11 designing and conducting formative evaluations
Ch. 11 designing and conducting formative evaluations
 

Recently uploaded

Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.
Cherry
 
PODOCARPUS...........................pptx
PODOCARPUS...........................pptxPODOCARPUS...........................pptx
PODOCARPUS...........................pptx
Cherry
 
Major groups of bacteria: Spirochetes, Chlamydia, Rickettsia, nanobes, mycopl...
Major groups of bacteria: Spirochetes, Chlamydia, Rickettsia, nanobes, mycopl...Major groups of bacteria: Spirochetes, Chlamydia, Rickettsia, nanobes, mycopl...
Major groups of bacteria: Spirochetes, Chlamydia, Rickettsia, nanobes, mycopl...
Cherry
 
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
Cherry
 
Pteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecyclePteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecycle
Cherry
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 

Recently uploaded (20)

Information science research with large language models: between science and ...
Information science research with large language models: between science and ...Information science research with large language models: between science and ...
Information science research with large language models: between science and ...
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.
 
CONTRIBUTION OF PANCHANAN MAHESHWARI.pptx
CONTRIBUTION OF PANCHANAN MAHESHWARI.pptxCONTRIBUTION OF PANCHANAN MAHESHWARI.pptx
CONTRIBUTION OF PANCHANAN MAHESHWARI.pptx
 
GBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationGBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolation
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdf
 
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptx
 
Method of Quantifying interactions and its types
Method of Quantifying interactions and its typesMethod of Quantifying interactions and its types
Method of Quantifying interactions and its types
 
GBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) MetabolismGBSN - Biochemistry (Unit 3) Metabolism
GBSN - Biochemistry (Unit 3) Metabolism
 
PODOCARPUS...........................pptx
PODOCARPUS...........................pptxPODOCARPUS...........................pptx
PODOCARPUS...........................pptx
 
Major groups of bacteria: Spirochetes, Chlamydia, Rickettsia, nanobes, mycopl...
Major groups of bacteria: Spirochetes, Chlamydia, Rickettsia, nanobes, mycopl...Major groups of bacteria: Spirochetes, Chlamydia, Rickettsia, nanobes, mycopl...
Major groups of bacteria: Spirochetes, Chlamydia, Rickettsia, nanobes, mycopl...
 
Concept of gene and Complementation test.pdf
Concept of gene and Complementation test.pdfConcept of gene and Complementation test.pdf
Concept of gene and Complementation test.pdf
 
Plasmid: types, structure and functions.
Plasmid: types, structure and functions.Plasmid: types, structure and functions.
Plasmid: types, structure and functions.
 
Fourth quarter science 9-Kinetic-and-Potential-Energy.pptx
Fourth quarter science 9-Kinetic-and-Potential-Energy.pptxFourth quarter science 9-Kinetic-and-Potential-Energy.pptx
Fourth quarter science 9-Kinetic-and-Potential-Energy.pptx
 
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
 
Site specific recombination and transposition.........pdf
Site specific recombination and transposition.........pdfSite specific recombination and transposition.........pdf
Site specific recombination and transposition.........pdf
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Pteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecyclePteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecycle
 
Cot curve, melting temperature, unique and repetitive DNA
Cot curve, melting temperature, unique and repetitive DNACot curve, melting temperature, unique and repetitive DNA
Cot curve, melting temperature, unique and repetitive DNA
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 

Crowd Teaching with Imperfect Labels

  • 1. Crowd Teaching with Imperfect Labels Yao Zhou1, Arun Reddy Nelakurthi2, Ross Maciejewski3, Wei Fan4, Jingrui He1 1University of Illinois at Urbana Champaign, 2Samsung Research America, 3Arizona State University, 4Tencent America
  • 2. - 2 - The Surge of Crowdsourcing q Need label information for training (semi-)supervised ML models q Huge demand exists for fine-grained label information in real-world applications o Fine-grained segmentation and localization in CV problems. o Downstream finetuning tasks in NLP problems. o AI assisted medical image and signal diagnosis problems. Jacob Devlin, et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. EMNLP 2018 https://appen.com/blog/computer-vision-vs-machine-vision/ Computer Vision Natural Language Processing
  • 3. - 3 - Fine-Grained Crowdsourcing q What we have and what we need: o Not annotating data from scratch. o Coarsely labeled data are available, but not perfect (contains incorrect labels). o Objectives: • Leverage label information from amateur workers to improve label quality. • Teach crowd workers with imperfect labels and improve their labeling expertise. q Existing solutions o Conventional crowdsourcing models, see [1, 2, 3] o Crowd teaching models, see [4, 5, 6]
  • 4. - 4 - Fine-Grained Crowdsourcing q What we have and what we need: o Not annotating data from scratch. o Coarsely labeled data are available, but not perfect (contains incorrect labels). o Objectives: • Leverage label information from amateur workers to improve label quality. • Teach crowd workers with imperfect labels and improve their labeling expertise. q Existing solutions o Conventional crowdsourcing models, see [1, 2, 3] o Crowd teaching models, see [4, 5, 6]
  • 5. - 5 - Fine-Grained Crowdsourcing q Issues with existing solutions o Conventional crowdsourcing models only focusing on label quality: • Downweight the weak workers & trust the good workers. • Motivate the workers to convey their knowledge by designing good incentive systems. Yao Zhou, et al., MultiC2: an Optimization Framework for Learning from Task and Worker Dual Heterogeneity. SDM 2016 Nihar B. Shah, et al., Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing. NIPS 2015. Important fact being ignored: Human beings are good at learning a concept & transferring the learned concept into similar learning tasks.
  • 6. - 6 - Fine-Grained Crowdsourcing q What we have and what we need: o Not annotating data from scratch. o Coarsely labeled data are available, but not perfect (contains incorrect labels). o Objectives: • Leverage label information from amateur workers to improve label quality. • Teach crowd workers with imperfect labels and improve their labeling expertise. q Existing solutions o Conventional crowdsourcing models, see [1, 2, 3] o Crowd teaching models, see [4, 5, 6]
  • 7. - 7 - Fine-Grained Crowdsourcing q Issues with existing solutions o Crowd teaching models only focus on the labeling expertise of crowd workers: • Require a perfectly labeled teaching set ! (e.g., hypothesis transition teaching models). • Require a perfect target concept "* (e.g., iterative teaching models). • Lacks of explanation for teaching samples. Iterative crowd teaching with perfect target conceptCrowd teaching with perfect teaching set Adish Singla, et al., Near-Optimally Teaching the Crowd to Classify. ICML 2014 Yao Zhou, et al., Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners, KDD 2018
  • 8. - 8 - Fine-Grained Crowdsourcing q What we have and what we need: o Not annotating data from scratch. o Coarsely labeled data are available, but not perfect (contains incorrect labels). o Objectives: • Leverage label information from amateur workers to improve label quality. • Teach crowd workers with imperfect labels and improve their labeling expertise. q Existing solutions o Conventional crowdsourcing models, see [1, 2, 3] o Crowd teaching models, see [4, 5, 6] q Research question o Can we simultaneously improve both in a unified framework ?
  • 9. - 9 - Roadmap q Introduction o Problem setting o Overview of the framework q The proposed framework o Learner model o Explanation difficulty o Teacher model q Extensions o Imperfect teaching with surrogate cost o Curriculum learning with label influence q Experiments o Quantitative evaluation o Qualitative evaluation q Conclusion
  • 10. - 10 - Roadmap q Introduction o Problem setting o Overview of the framework q The proposed framework o Learner model o Explanation difficulty o Teacher model q Extensions o Imperfect teaching with surrogate cost o Curriculum learning with label influence q Experiments o Quantitative evaluation o Qualitative evaluation q Conclusion
  • 11. - 11 - Problem Setting q Given: o An imperfectly labeled data set (a mixture of correct and incorrect labels). o An unlabeled data set. o A prediction model (e.g., a classification model). q Objective I: provide a higher-quality labeled data set that has o Items originally from the labeled set, but re-labeled and verified by workers. o Items originally from the labeled set that stay untouched. o Items originally from the unlabeled set that get new labels from workers. q Objective II: use the imperfect prediction model as the teacher to teach the workers to label, with personalized teaching sequences and visual explanations.
  • 12. - 12 - Overview of the Framework q Interactions between teacher and student o First, the teacher recommends and shows an item to the learner. The learner provides an initial label. o Second, the teacher shows its probabilistic prediction and visual explanation. The learner updates her label and chooses a trusted explanation. o Third, a masked explanation is provided to the learner based on her confidence. Only the high-confidence label is recorded.
  • 13. - 13 - Roadmap q Introduction o Problem setting o Overview of the framework q The proposed framework o Learner model o Explanation difficulty o Teacher model q Extensions o Imperfect teaching with surrogate cost o Curriculum learning with label influence q Experiments o Quantitative evaluation o Qualitative evaluation q Conclusion
  • 14. - 14 - Adaptive Teaching and Learning q Learner model o The learners use a gradient-based learning procedure to improve their concepts iteratively: o Each learner has an exponentially decayed retrievability of memory: o The memory momentum can be rewritten as follows if its initial value is set to zero. (Equation legend: learner's concept at the t-th teaching iteration; explanation difficulty coefficient; memory momentum of the learner at the t-th teaching iteration; memory decay rate; index of the teaching item in the t-th iteration; the learner's learning loss.)
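The learner update above (a gradient step combined with exponentially decayed memory momentum) can be sketched as follows. This is a minimal illustration assuming a logistic loss; the function and parameter names (`eta`, `beta`, `gamma`) are ours, not the paper's:

```python
import numpy as np

def learner_update(w, x, y, memory, eta=0.5, beta=0.9, gamma=1.0):
    """One teaching iteration of a memory-decayed learner (illustrative sketch).

    w: current concept (weights); x: teaching item features; y: its label in {-1, +1};
    memory: momentum accumulated from past items; eta: learning rate;
    beta: memory decay rate; gamma: explanation-difficulty coefficient."""
    margin = y * np.dot(w, x)
    grad = -y * x / (1.0 + np.exp(margin))   # gradient of log(1 + exp(-y w.x))
    memory = beta * memory + gamma * grad    # exponentially decayed memory momentum
    w = w - eta * memory                     # gradient step along the momentum
    return w, memory
```

Starting the memory at zero makes the momentum unroll into an exponentially weighted sum of past gradients, which matches the rewritten form described on the slide.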
  • 15. - 15 - Adaptive Teaching and Learning q Explanation difficulty o An item (image or text) with a smaller attention region (a small area of pixels or a few key words) is easier to interpret. o Explanation difficulty is defined as the entropy of the generated explanation: o The explanation re-scaling coefficient is defined as: (Figure: an image explained as a “domestic” cat by highlighting the facial area.) The re-scaling coefficient is > 1 if an item has a non-uniform visual explanation.
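As a sketch of this idea, the entropy of a normalized saliency map can serve as the explanation difficulty. One plausible re-scaling coefficient (an assumption on our part; the paper's exact form may differ) is the ratio of the maximum entropy to the actual entropy, which exceeds 1 exactly when the explanation is non-uniform:

```python
import numpy as np

def explanation_difficulty(saliency):
    """Shannon entropy of a normalized explanation map (hedged sketch)."""
    p = saliency.ravel().astype(float)
    p = p / p.sum()          # normalize to a probability distribution
    p = p[p > 0]             # drop zero entries (0 * log 0 := 0)
    return -(p * np.log(p)).sum()

def rescaling_coefficient(saliency):
    """Assumed form: max possible entropy over actual entropy.
    Equals 1 for a uniform map, > 1 for a concentrated one."""
    return np.log(saliency.size) / explanation_difficulty(saliency)
```

A uniform map yields a coefficient of 1, while a map concentrated on a small region (e.g., the cat's facial area) yields a coefficient well above 1.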
  • 16. - 16 - q Teacher model o The teacher aims at maximizing the learner's speed of convergence by minimizing, over two consecutive iterations, the distance of the learner's current concept from the teacher's empirical target concept: (Legend: the empirical target concept from the teacher; the teacher's cost function.) Adaptive Teaching and Learning
  • 17. - 17 - q Teacher model o In each iteration, the teacher aims to recommend a teaching item together with its explanation. Teaching then becomes a pool-based search problem: o The candidate pool includes both the labeled and the unlabeled sets; thus, it allows the worker to re-label an item or add a new label to it. Adaptive Teaching and Learning
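The pool-based search can be sketched as a greedy selection, assuming a logistic-loss learner: for each candidate, simulate the learner's gradient step and pick the item whose step moves the learner's concept closest to the teacher's empirical target concept. Names and the exact loss are illustrative:

```python
import numpy as np

def pick_teaching_item(pool, w_learner, w_target, eta=0.5):
    """Greedy pool-based teaching sketch: simulate one gradient step per
    candidate (x, y) and pick the item minimizing the distance of the
    learner's updated concept to the empirical target concept."""
    best_idx, best_dist = None, np.inf
    for i, (x, y) in enumerate(pool):
        grad = -y * x / (1.0 + np.exp(y * np.dot(w_learner, x)))  # logistic gradient
        w_next = w_learner - eta * grad                           # simulated learner step
        dist = np.linalg.norm(w_next - w_target)
        if dist < best_dist:
            best_idx, best_dist = i, dist
    return best_idx
```

Because the pool mixes labeled and unlabeled items, the same loop covers both re-labeling and fresh labeling in this sketch.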
  • 18. - 18 - Roadmap q Introduction o Problem setting o Overview of the framework q The proposed framework o Learner model o Explanation difficulty o Teacher model q Extensions o Imperfect teaching with surrogate cost o Curriculum learning with label influence q Experiments o Quantitative evaluation o Qualitative evaluation q Conclusion
  • 19. - 19 - q Make the teaching model better. o Is the teacher good enough to teach if it uses the empirical target concept learned from the imperfect data set? ü Yes, it is, if the surrogate cost function is used instead of the original cost function. o Since the data set is imperfect, the aggregated labels of its items will usually include errors. We define the class-conditional error rate of labels as: For short, we have: Adaptive Teaching and Learning
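The surrogate cost referenced here follows the unbiased-estimator construction of Natarajan et al. (NIPS 2013) for class-conditional label noise. A sketch for binary labels, with variable names that are ours:

```python
def surrogate_loss(loss, rho_pos, rho_neg):
    """Unbiased surrogate under class-conditional noise (Natarajan et al., NIPS 2013).

    loss(t, y): clean loss of prediction t for label y in {-1, +1}.
    rho_pos = P(flip | y = +1), rho_neg = P(flip | y = -1), with rho_pos + rho_neg < 1.
    Returns a function whose expectation over the noisy label equals the clean loss."""
    denom = 1.0 - rho_pos - rho_neg
    def tilde(t, y):
        rho_y = rho_pos if y == 1 else rho_neg    # flip rate of the observed class
        rho_ny = rho_neg if y == 1 else rho_pos   # flip rate of the opposite class
        return ((1.0 - rho_ny) * loss(t, y) - rho_y * loss(t, -y)) / denom
    return tilde
```

The defining property is unbiasedness: averaging the surrogate over the noisy-label distribution recovers the clean loss, which is what makes training on the imperfect label set sound.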
  • 20. - 20 - q Make the teaching model better. o How good is this empirical target concept if the surrogate cost function is used? § The empirical risk is bounded by the risk that uses the unknown ground-truth concept, with probability at least 1 − δ. o Can this bound become tighter? ü Yes, it can, if the following condition is satisfied. (Legend: error rate of learner-provided labels; learner's confidence in the true class label.) Nagarajan Natarajan, et al. Learning with Noisy Labels. NIPS 2013. Adaptive Teaching and Learning
  • 21. - 21 - q Make the teaching model better. o How good is this empirical target concept if the surrogate cost function is used? § The empirical risk is bounded by the risk that uses the unknown ground-truth concept, with probability at least 1 − δ. o Can this bound become tighter? ü Yes, it can, if the following condition is satisfied. The error rate of labeled items will decrease if learners provide confident labels; the empirical risk will then gradually have a smaller upper bound and get close to the risk that uses the true target concept. Adaptive Teaching and Learning
  • 22. - 22 - Crowd Teaching with Imperfect Labels q Make the teaching model better (continued) o Is the teaching sequence good enough to fit the natural learning trend of human learners? Not really! Not all items have equal influence on the prediction model. o Teaching should follow the principle of curriculum learning, i.e., the teaching sequence ranges from easy to difficult. § Easy items have small influence on the model since they are usually data points with a large margin in the feature space. § Difficult items have large influence because changing their labels would have a large impact on the behavior of the prediction model.
  • 23. - 23 - q Make the teaching model better (continued) o The teaching sequence should have increasing influences, where influence is defined as the change of the model's prediction w.r.t. a label perturbation. § An item with label y is z = (x, y). § The item with the perturbed label is z̃ = (x, ỹ). § For simplicity, we write the perturbation as z → z̃. o The empirical risk minimizer after replacing a small mass ε of the original item z with z̃ is: o The influence of upweighting item z with small mass ε is defined as: Pang Wei Koh, Percy Liang. Understanding Black-box Predictions via Influence Functions. ICML 2017. Adaptive Teaching and Learning
  • 24. - 24 - q Make the teaching model better (continued) o Then, the parameter change of perturbing label y is given as the difference of influences between upweighting z̃ and upweighting z: o With some simplification (first-order approximation, chain rule), we have the perturbed loss on z_test of changing label y as: Adaptive Teaching and Learning
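Following Koh & Liang (ICML 2017), the first-order effect of a label flip on the test loss can be sketched as -∇ℓ(z_test)ᵀ H⁻¹ (∇ℓ(x, -y) - ∇ℓ(x, y)). Below is an illustrative setup with regularized logistic regression, which is our assumption rather than necessarily the paper's exact model:

```python
import numpy as np

def grad_logistic(w, x, y):
    """Gradient of log(1 + exp(-y w.x)) with respect to w."""
    return -y * x / (1.0 + np.exp(y * np.dot(w, x)))

def hessian_logistic(w, X, lam=1e-2):
    """Hessian of the L2-regularized logistic empirical risk at w
    (independent of the labels for logistic loss)."""
    s = 1.0 / (1.0 + np.exp(-X @ w))
    d = s * (1.0 - s)
    return (X.T * d) @ X / len(X) + lam * np.eye(X.shape[1])

def label_flip_influence(w, X, x, y, x_test, y_test, lam=1e-2):
    """First-order influence of flipping the label of training item (x, y)
    on the loss at (x_test, y_test): difference of the upweighting
    influences of z-tilde = (x, -y) and z = (x, y)."""
    H = hessian_logistic(w, X, lam)
    dg = grad_logistic(w, x, -y) - grad_logistic(w, x, y)
    return -grad_logistic(w, x_test, y_test) @ np.linalg.solve(H, dg)
```

Flipping a confidently correct training label should increase the loss on a test point aligned with it, so the influence comes out positive in that case.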
  • 25. - 25 - q Make the teaching model better (continued) o The influence of a label perturbation has two cases: § Positive influence (perturb y from -1 to +1). § Negative influence (perturb y from +1 to -1). o Then, the influence score of any item z_test is calculated using the absolute value: o The overall goal of teaching becomes selecting a sequence of items that are both effective toward concept learning and have increasing influences. (Legend: teaching item index; teaching score from the objective in Eq. (11); influence score.) Adaptive Teaching and Learning
  • 26. - 26 - Discussion and Extensions q A worker (learner) only has confidence on one category of items o If a worker is only confident on the positive class (i.e., she only gives confident positive labels), then c+1 > 0 and c-1 = 0, and Theorem 4.1 can be easily satisfied. Therefore, the updated label set still leads to a better prediction model. q Teaching with starving prevention o If repeated labeling is allowed, the overall teaching score could stay high for certain items; the low-score items would then be starved and never recommended. o The influence intensity is updated as the entropy of item x_i's label set. § A low-entropy label set will downgrade the influence score faster than a high-entropy label set.
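The starving-prevention idea can be sketched with a plain Shannon entropy over an item's collected label multiset; this is a hedged reading of the "influence intensity" update, and the paper's exact decay schedule is not reproduced here:

```python
from collections import Counter
from math import log

def label_set_entropy(labels):
    """Shannon entropy of an item's collected label multiset.

    A unanimous label set (e.g., three +1 votes) has entropy 0, so in this
    sketch its influence score would be downgraded fastest; a split label
    set has high entropy and decays slowly."""
    n = len(labels)
    counts = Counter(labels).values()
    return -sum(c / n * log(c / n) for c in counts)
```

Items whose collected labels already agree are thus de-prioritized quickly, freeing teaching iterations for items that still need attention.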
  • 27. - 27 - Roadmap q Introduction o Problem setting o Overview of the framework q The proposed framework o Learner model o Explanation difficulty o Teacher model q Extensions o Imperfect teaching with surrogate cost o Curriculum learning with label influence q Experiments o Quantitative evaluation o Qualitative evaluation q Conclusion
  • 28. - 28 - Experiments q Data sets o Cat images (domestic vs. wild) o Canidae images (domestic vs. wild) o Text documents (comp. vs. sci.) q Features o Image features are from a fine-tuned ResNet34's penultimate layer. o Text features are TF-IDF with standard document preprocessing. q Explanations o Image explanations are saliency maps generated by Grad-CAM. o Text explanations are highlighted words generated by LIME. q Class-conditional error o Labels are randomly flipped with a fixed error rate. q Human learners o 61 trials; each learner is assigned one teaching algorithm in a round-robin fashion. Ramprasaath Selvaraju, et al., Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. ICCV 2017. Marco Ribeiro, et al., Why Should I Trust You? Explaining the Predictions of Any Classifier. KDD 2016.
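The class-conditional error injection used in these experiments can be sketched as follows; this is our reading of "randomly flip labels with a fixed error rate", and the actual rates used in the paper are not reproduced here:

```python
import numpy as np

def flip_labels(y, rho_pos, rho_neg, seed=0):
    """Inject class-conditional label noise into binary labels in {-1, +1}.

    Flips +1 labels with probability rho_pos and -1 labels with probability
    rho_neg (illustrative parameter names; seeded for reproducibility)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y).copy()
    u = rng.random(y.shape)
    flip = np.where(y == 1, u < rho_pos, u < rho_neg)
    return np.where(flip, -y, y)
```

Keeping the two flip rates separate matches the class-conditional error model used in the surrogate-cost analysis earlier in the deck.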
  • 29. - 29 - q Teaching interface (image) Experiments
  • 30. - 30 - q Teaching interface (text) Experiments
  • 31. - 31 - q Quantitative results o Teaching gain: labeling accuracy (after teaching) – labeling accuracy (before teaching). JEDI: interactive teaching w/o explanation. VADER-lite: removes confidence gauging in the 3rd step of VADER. VADER: our proposed three-step teaching model. Yao Zhou, et al., Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners. KDD 2018. Experiments
  • 32. - 32 - q Quantitative results o Label retrieval rate: the fraction of items with incorrect labels that have been corrected after teaching. Teacher: the initial prediction model. Worker Initial: the workers before receiving any teaching. JEDI: interactive teaching w/o explanation. VADER-lite: removes confidence gauging in the 3rd step of VADER. VADER: our proposed three-step teaching model. Improvement due to confidence gauging. Experiments
  • 33. - 33 - q Quantitative results o Model performance: retrain the model after teaching, and compare the retrained prediction model (using the MMCE-aggregated labels) with the teacher's performance. Cat and Text show improved model performance due to their high label retrieval rates. Canidae shows barely any improvement in model performance because of its low label retrieval rate: the label-perturbation influence is not high enough. Dengyong Zhou, John Platt, Sumit Basu, and Yi Mao. Learning from the Wisdom of Crowds by Minimax Entropy. NIPS 2012. Experiments
  • 34. - 34 - q Qualitative results o Results of the influence score. The top row shows high-influence Canidae images and the bottom row shows low-influence Canidae images. Each image is described by a tuple composed of its true category and an error indicator (1: label error, 0: no label error). Experiments
  • 35. - 35 - Conclusion q VADER model ü Interactive learning and teaching. ü Simultaneously improves label quality of data & labeling expertise of workers. ü Does not require perfect teaching set and perfect target concept. ü Has human interpretable explanations. ü Theoretical connections between teaching and explanation.
  • 36. - 36 - 1. Dengyong Zhou, John Platt, Sumit Basu, Yi Mao. Learning from the Wisdom of Crowds by Minimax Entropy. NIPS 2012. 2. Vikas Raykar, Shipeng Yu, Linda H. Zhao, Gerardo Valadez, Charles Florin, Luca Bogoni, Linda Moy. Learning From Crowds. JMLR 2010. 3. Yao Zhou, Jingrui He. Crowdsourcing via Tensor Augmentation and Completion. IJCAI 2016. 4. Adish Singla, Ilija Bogunovic, Gábor Bartók, Amin Karbasi, Andreas Krause. Near-Optimally Teaching the Crowd to Classify. ICML 2014. 5. Edward Johns, Oisin Mac Aodha, Gabriel J. Brostow. Becoming the Expert: Interactive Multi-Class Machine Teaching. CVPR 2015. 6. Yao Zhou, Arun Reddy Nelakurthi, Jingrui He. Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners. KDD 2018. 7. Yao Zhou, Lei Ying, Jingrui He. MultiC2: An Optimization Framework for Learning from Task and Worker Dual Heterogeneity. SDM 2016. 8. Nihar B. Shah, Dengyong Zhou. Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing. NIPS 2015. 9. Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep Ravikumar, Ambuj Tewari. Learning with Noisy Labels. NIPS 2013. 10. Pang Wei Koh, Percy Liang. Understanding Black-box Predictions via Influence Functions. ICML 2017. 11. Ramprasaath Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. ICCV 2017. 12. Marco Ribeiro, Sameer Singh, Carlos Guestrin. Why Should I Trust You? Explaining the Predictions of Any Classifier. KDD 2016. References
  • 37. - 37 Thank You! Email me via yaozhou3@Illinois.edu if you have any questions!