Noisy label(er)s
An overview, summarizing
https://archive.nyu.edu/jspui/bitstream/2451/29799/2/CeDER-10-03.pdf (Ipeirotis P., Provost F. et al., 2010-09)
and other papers.
R. Kiriukhin
Rationale, setting and questions.
Rationale: (#CostSensitiveLearning)
● Cost = Data+Features+Labels(ground truth)
Setting: (not “Active Learning” where “cost(labels)>cost(obtaining new data)”)
● Cost of labeling: Low (Mechanical Turk)
● Labels quality: Low (noisy, not ground truth)
● Cost of preparing new data, getting new unlabeled samples: High
Questions:
● Can repeated labeling help?
● What is the expected improvement?
● What is the best way to do it?
Dirty labels, are they even a problem?
“Mushrooms” dataset:
● Garbage in = garbage out
● Max achievable performance = F(quality)
● d(Performance)/d(DataSize) = G(quality)
(Figure: learning curves for different label qualities; the two levers shown are active learning and improving q.)
Approach for improving q: majority voting.
Majority voting approach:
● Collect more than one label per sample from different labelers (j), hence “repeated labels”.
● Each labeler has a quality: Pr(yij = yi).
● Apply majority voting (giving an “integral quality” q).
Assumptions:
● Pr(yij = yi | xi) = Pr(yij = yi) = pj: every sample yields the same probability of mistake for a given labeler; there are no “easy” and “hard” cases.
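The voting step above is simple to state in code. A minimal sketch (the names are illustrative, not from the paper):

```python
from collections import Counter

def majority_vote(labels):
    """Aggregate the repeated labels of one example into a single 'hard' label.
    Ties are broken arbitrarily; the setting assumes an odd number of labelers."""
    return Counter(labels).most_common(1)[0][0]

# Three noisy labelers voted on the same example:
print(majority_vote([1, 0, 1]))  # → 1
```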
Approach for improving q: soft labels.
“Soft labels” approach:
● <same as for majority voting>
● multiplied examples (ME):
○ for every distinct yi in L(x) = {y1, …, yn}, inject a copy of x labeled yi into the training set with weight w(x, yi) = |{y ∈ L(x) : y = yi}| / n
Assumptions:
● <same as for majority voting>
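The ME expansion can be sketched in a few lines; this is an illustrative implementation of the weighting rule above, not the paper's code:

```python
from collections import Counter

def multiplied_examples(x, labels):
    """Soft-label ('multiplied examples', ME) expansion: one weighted copy
    of x per distinct label, weight = that label's share of the multiset L(x)."""
    n = len(labels)
    return [(x, y, cnt / n) for y, cnt in Counter(labels).items()]

# L(x) = {+, +, -} → (x, +) with weight 2/3 and (x, -) with weight 1/3
print(multiplied_examples("x1", ["+", "+", "-"]))
```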
● More labelers ⇒ higher quality q, provided pj > 0.5.
● The marginal improvement in q is not linear in the number of labelers:
○ For p = 0.9 there is little benefit in going from 3 to 11 labelers.
○ For p = 0.7, going from 1 to 3 labelers improves q by about 0.1, which yields a 0.1–0.2 improvement in ROC-AUC (moving from the q = 0.7 curve to the q = 0.8 curve).
● If labeler qualities differ (pj ≠ pi), it is sometimes better to use the single best labeler.
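These numbers follow from the binomial formula for the integral quality of a majority vote over n labelers of equal quality p. A stdlib-only sketch to reproduce them:

```python
from math import comb

def integral_quality(p, n):
    """Probability that a majority vote of n labelers, each independently
    correct with probability p, yields the correct label (odd n, no ties)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

for p in (0.7, 0.9):
    for n in (1, 3, 11):
        print(f"p={p}, n={n}: q={integral_quality(p, n):.3f}")
```

For p = 0.7, going from 1 to 3 labelers lifts q from 0.700 to 0.784 (≈ +0.1, as the slide states); for p = 0.9, going from 3 to 11 labelers adds less than 0.03.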
When does it make sense?
● When should repeated labeling be chosen for modeling?
● How cheap (relatively speaking) does labeling have to be?
● For a given cost setting, is repeated-labeling much better or only marginally better?
● Can selectively choosing data points to label improve performance?
Empirical analysis to answer these questions:
● 8 datasets with k-fold AUC > 0.7 and a binary response
● 30% held out as a test set
● 70% as a pool for unlabeled and labeled data
● label “noising” in the labeled data is modeled according to pj
Decision: label/acquire
● Design choices:
○ Choice of the next sample to (re)label
○ Use “hard” label with majority voting or “soft” labels approach
● Basic strategies:
○ single labeling (SL):
■ get more examples with a single label each
○ fixed round-robin (FRR):
■ keep adding labels to a fixed number of examples, until exhausting our labeling budget
○ generalized round-robin (GRR):
■ give the next label to the example with the fewest labels
● SL vs FRR:
○ under noisy labels, fixed round-robin repeated labeling (FRR) can outperform single labeling once there are enough training examples, i.e., after the learning curves are no longer steep
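The round-robin allocation itself is a one-line selection rule. A minimal sketch of GRR's "fewest labels first" budget spending (illustrative names, not the paper's code):

```python
def grr_allocate(examples, budget):
    """Generalized round-robin: spend the labeling budget one label at a
    time, always labeling the example with the fewest labels so far
    (ties broken by insertion order)."""
    counts = {x: 0 for x in examples}
    for _ in range(budget):
        x = min(counts, key=counts.get)
        counts[x] += 1
    return counts

print(grr_allocate(["a", "b", "c"], budget=7))  # → {'a': 3, 'b': 2, 'c': 2}
```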
With cost introduced, the choice is:
● acquire a new training example for cost CU + CL (CU for the unlabeled portion, CL for the label)
● get another label for an existing example for cost CL
Units for the x axis = data acquisition cost CD:
● ρ = CU / CL
● k = labels per sample for GRR
● Tr = number of training examples; NL = k · Tr = total number of labels
● CD = CU · Tr + CL · NL = ρ · CL · Tr + k · CL · Tr ∝ ρ + k
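To make the ∝ (ρ + k) relation concrete: with Tr training examples and k labels each, the total cost is CL · Tr · (ρ + k). A small sketch (parameter names are assumptions for illustration):

```python
def data_acquisition_cost(c_l, rho, tr, k):
    """C_D = C_U * Tr + C_L * N_L, with C_U = rho * C_L and N_L = k * Tr,
    which simplifies to C_L * Tr * (rho + k)."""
    return c_l * tr * (rho + k)

# Cheap labels (rho = 10): tripling the labels per example (k: 1 -> 3)
# raises the total cost by only ~18%.
print(data_acquisition_cost(1, 10, 100, 1))  # → 1100
print(data_acquisition_cost(1, 10, 100, 3))  # → 1300
```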
SL vs GRR (majority voting):
● As the cost ratio ρ increases, the improvement of GRR over SL also increases. So when labels really are cheap, a repeated-labeling strategy such as GRR gives a significant advantage.
ME vs MV:
● the uncertainty-preserving repeated labeling (ME) outperforms MV in all cases, to greater or lesser degrees
● when labeling quality is substantially higher (e.g., p = 0.8), repeated labeling is still increasingly preferable to single labeling as ρ increases; however, there is no longer an advantage for ME over MV
Decision: which sample to relabel
With ENTROPY selection, most of the labeling resources are wasted: the procedure labels a small set of examples very many times.
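The failure mode is easy to see: the entropy of an example's label multiset depends only on the label proportions, not on how many labels were already collected, so a persistently split example keeps looking "maximally uncertain" forever. A sketch:

```python
from collections import Counter
from math import log2

def label_entropy(labels):
    """Shannon entropy of the multiset of labels collected for one example."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# Entropy ignores how much evidence was already gathered:
print(label_entropy(["+", "-"]))               # → 1.0
print(label_entropy(["+"] * 50 + ["-"] * 50))  # → 1.0  (still "maximally uncertain")
```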
Bayesian estimate of label uncertainty (LU):
● Bayesian estimate of the probability that the majority label ym is incorrect
○ uniform prior on the true positive rate
○ with p positive and n negative labels, the posterior is Beta(p + 1, n + 1)
● Uncertainty = CDF of this Beta distribution at the decision threshold, given by the regularized incomplete beta function
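Since the Beta parameters here are integers, the regularized incomplete beta at the 0.5 threshold reduces to a binomial sum, so LU needs only the stdlib. A sketch under the reading that p counts positive and n counts negative labels:

```python
from math import comb

def beta_cdf_half(a, b):
    """Regularized incomplete beta I_x(a, b) at x = 0.5 for integer a, b,
    via the identity I_x(a, b) = sum_{j=a}^{a+b-1} C(a+b-1, j) x^j (1-x)^(a+b-1-j)."""
    m = a + b - 1
    return sum(comb(m, j) * 0.5**m for j in range(a, m + 1))

def label_uncertainty(pos, neg):
    """With a uniform prior, the posterior over the true positive rate is
    Beta(pos + 1, neg + 1); uncertainty = posterior mass on the minority
    side of the 0.5 decision threshold."""
    cdf = beta_cdf_half(pos + 1, neg + 1)
    return min(cdf, 1 - cdf)

print(label_uncertainty(2, 1))   # 3 labels, split 2:1
print(label_uncertainty(10, 5))  # same 2:1 ratio, more evidence → lower uncertainty
```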
NLU:
● Bayesian estimate of the label itself, given p and n
● NLU = 1 − Pr(+ | p, n)
NLU vs LU for separating correctly and incorrectly labeled samples
Model Uncertainty (MU):
● ignores the current multiset of labels; instead, it learns a set of models, each of which predicts the probability of class membership
Label and model uncertainty: combine the two scores.
MUCV
● MU with the model H trained on k folds (cross-validation)
MUO
● (O)racle: MU with H trained on all data with perfect labels
Decision: which sample to relabel - influence on model quality
● Overall, combining label and model uncertainty (LMU and NLMU) produces the best approaches.
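The combination itself is a per-example score. As I recall it, the paper multiplies the two uncertainties (a geometric mean), so an example must score high on both to be relabeled first; treat the exact form here as an assumption:

```python
from math import sqrt

def lmu_score(label_unc, model_unc):
    """Combine label and model uncertainty via the geometric mean; an
    example is relabeled first only if both scores are high."""
    return sqrt(label_unc * model_unc)

# High label uncertainty but low model uncertainty is damped:
print(lmu_score(0.4, 0.05))
print(lmu_score(0.3, 0.3))  # → 0.3
```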
soft labels + selective relabeling?
● “soft-labeling is a strategy to consider in environments with high noise and when using basic round-robin labeling strategies. When selective labeling is employed, the benefits of using soft-labeling apparently diminish, and so far we do not have the evidence to recommend using soft-labeling.”
weighted sampling?
● “The three selective repeated-labeling strategies with deterministic selection order perform significantly better than the ones with weighted sampling”
Related
> Rizos G., Schuller B. W. (2020). Average Jane, Where Art Thou? Recent Avenues in Efficient Machine Learning Under Subjectivity Uncertainty.
● A summary of approaches for learning optimally when actual ground truth may not be available.
> Fredriksson T. et al. (2020). Data Labeling: An Empirical Investigation into Industrial Challenges and Mitigation Strategies.
● Investigates the challenges companies experience when annotating and labeling their data.
> Raykar V. C. et al. (2010). Learning from Crowds.
● The proposed algorithm evaluates the different experts and also gives an estimate of the actual hidden labels.
> Whitehill J. et al. (2009). Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise.
● A probabilistic model that simultaneously infers the label of each image, the expertise of each labeler, and the difficulty of each image.
> Zhou D. et al. (2015). Regularized Minimax Conditional Entropy for Crowdsourcing.
● A minimax conditional entropy principle for inferring ground truth from noisy crowdsourced labels.
> Zhao L., Sukthankar G., Sukthankar R. (2011). Incremental Relabeling for Active Learning with Noisy Crowdsourced Annotations.
● Most active learning strategies are myopic and sensitive to label noise, which leads to poorly trained classifiers; proposes an active learning method specifically designed to be robust to such noise.
> Dawid A. P., Skene A. M. (1979). Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm.
● A model that allows individual error rates to be estimated for polytomous facets even when the patient's “true” response is not available; the EM algorithm provides a slow but sure way of obtaining maximum likelihood estimates of the parameters of interest.
> Chen X. et al. (2013). Pairwise Ranking Aggregation in a Crowdsourced Setting.
● In contrast to traditional ranking aggregation methods, the approach learns and folds in the quality of each annotator's contributions.
Thank you!
