Progressive Identification of True Labels for
Partial-Label Learning
Presenter: 송헌
Fundamental team: 김동희, 김지연, 김창연, 이근배, 이재윤
Lv, Jiaqi, et al. ICML. 2020.
2
Problem setting
In partial-label learning (PLL), each training instance is associated with
a set of candidate labels among which exactly one is true.
The goal of PLL is to reduce the overhead of
finding the exact label among ambiguous candidates.
3
Related works
Most existing PLL methods are coupled to specific optimization algorithms,
which makes them difficult to apply to DNNs.
D2CNN* is the only work that used DNNs with stochastic optimizers.
However, it restricted the networks to specific architectures.
Complementary-label learning** specifies, for each example, a class that it does not belong to.
Hence, it can be regarded as an extreme PLL case with $c - 1$ candidate labels.
*Yao, et al. "Deep discriminative CNN with temporal ensembling for ambiguously-labeled image classification." AAAI. 2020.
**Ishida, Takashi, et al. "Complementary-label learning for arbitrary losses and models." ICML. 2019.
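The link between complementary labels and candidate sets can be made concrete with a small sketch. The function name `complementary_to_candidates` and the 1-based label convention are assumptions made for illustration; they do not come from the slides.

```python
def complementary_to_candidates(complementary_label, num_classes):
    """Return the candidate label set implied by one complementary label."""
    if not 1 <= complementary_label <= num_classes:
        raise ValueError("label must lie in 1..num_classes")
    # Every class except the excluded one remains a candidate.
    return {k for k in range(1, num_classes + 1) if k != complementary_label}


candidates = complementary_to_candidates(complementary_label=3, num_classes=5)
print(sorted(candidates))  # [1, 2, 4, 5]
```

With `num_classes = c`, the returned set always has `c - 1` elements, which is the extreme case mentioned above.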
4
Contributions
The authors propose a classifier-consistent risk estimator for PLL and analyze it theoretically:
they show that the classifier learned from partially labeled data converges to
the optimal one learned from ordinarily labeled data.
They also propose a PLL method that is agnostic to the model, loss, and optimizer.
5
Ordinary Multi-class Classification
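Let $\mathcal{X} \subseteq \mathbb{R}^d$ be the instance space and $\mathcal{Y} = \{1, 2, \ldots, c\}$ be the label space.
Let $p(x, y)$ be the underlying joint density of the random variables $(X, Y) \in \mathcal{X} \times \mathcal{Y}$.
The goal is to learn a classifier $\boldsymbol{g}: \mathcal{X} \to \mathbb{R}^c$ that minimizes the estimator of risk:
$$\mathcal{R}(\boldsymbol{g}) = \mathbb{E}_{(X,Y) \sim p(x,y)}\left[\ell\left(\boldsymbol{g}(X), \boldsymbol{e}^{Y}\right)\right]$$
where $\boldsymbol{e}^{\mathcal{Y}} = \{\boldsymbol{e}^{i} : i \in \mathcal{Y}\}$ denotes the set of standard canonical vectors.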
6
Partial-Label Learning
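Let the candidate label set $S$ be an element of the power set $\mathcal{P}(\mathcal{Y})$ of the label space that contains the true label $Y$.
The true label itself is not observed, so we need to train a classifier with partially labeled examples $(X, S)$.
The PLL risk estimator is defined over $p(x, s)$:
$$\mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g}) = \mathbb{E}_{(X,S) \sim p(x,s)}\left[\ell_{\mathrm{PLL}}\left(\boldsymbol{g}(X), S\right)\right]$$
where $\ell_{\mathrm{PLL}}: \mathbb{R}^c \times \mathcal{P}(\mathcal{Y}) \to \mathbb{R}$.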
7
Classifier-Consistent Risk Estimator
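To make $\mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g})$ estimable, an intuitive way is through a surrogate loss.
The authors assume that only the true label contributes to retrieving the classifier.
Accordingly, they define the PLL loss as the minimal loss over the candidate label set:
$$\ell_{\mathrm{PLL}}(\boldsymbol{g}(X), S) = \min_{i \in S} \ell\left(\boldsymbol{g}(X), \boldsymbol{e}^{i}\right)$$
This leads to a new risk estimator:
$$\mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g}) = \mathbb{E}_{(X,S) \sim p(x,s)}\left[\min_{i \in S} \ell\left(\boldsymbol{g}(X), \boldsymbol{e}^{i}\right)\right]$$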
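A minimal sketch of the min-over-candidates loss, assuming the base loss is cross-entropy on a softmax output. The helper names (`softmax`, `cross_entropy`, `pll_min_loss`) and the 0-based class indices are illustrative choices, not the authors' code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Ordinary CE loss for a single 0-based class index."""
    return -math.log(probs[label])


def pll_min_loss(logits, candidates):
    """PLL loss: the smallest ordinary loss over the candidate label set."""
    probs = softmax(logits)
    return min(cross_entropy(probs, i) for i in candidates)


logits = [2.0, 0.5, -1.0]
loss = pll_min_loss(logits, candidates={0, 1})
```

With cross-entropy, the minimum is attained by the candidate that the model currently considers most probable, so here the loss equals the ordinary loss for class 0.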
8
Lemmas
The ambiguity degree is defined as
$$\gamma = \sup_{(X,Y) \sim p(x,y),\; S \sim p(s \mid x, y),\; \bar{Y} \in \mathcal{Y},\; \bar{Y} \neq Y} \Pr\left(\bar{Y} \in S\right)$$
$\gamma$ is the maximum probability that a negative label $\bar{Y}$ co-occurs with the true label $Y$.
The small ambiguity degree condition ($\gamma < 1$) implies that, except for the true label,
no other label is included in the candidate label set with probability 1.
Moreover, if $\ell$ is the CE or MSE loss, the ordinary optimal classifier $\boldsymbol{g}^*$ satisfies
$$g_i^*(X) = p(Y = i \mid X)$$
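On synthetic data where the true labels are known, the ambiguity degree can be approximated by counting co-occurrences. The sketch below is a coarse class-level proxy: it assumes that candidate generation depends only on the true label, whereas the definition takes the supremum over instances as well. The function name is hypothetical.

```python
from collections import defaultdict


def empirical_ambiguity_degree(true_labels, candidate_sets):
    """Largest observed rate at which a negative label joins a true label."""
    class_counts = defaultdict(int)  # number of examples per true label
    co_counts = defaultdict(int)     # (true label, negative label) co-occurrences
    for y, s in zip(true_labels, candidate_sets):
        class_counts[y] += 1
        for y_bar in s:
            if y_bar != y:
                co_counts[(y, y_bar)] += 1
    if not co_counts:
        return 0.0
    return max(count / class_counts[y] for (y, _), count in co_counts.items())


true_labels = [0, 0, 1, 1]
candidate_sets = [{0, 1}, {0, 2}, {1}, {1, 2}]
gamma_hat = empirical_ambiguity_degree(true_labels, candidate_sets)
print(gamma_hat)  # 0.5
```

A value of 1.0 would mean that some negative label accompanies a given true label in every observed example, which violates the small ambiguity degree condition on that sample.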
9
Connection
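Under the deterministic scenario,
if the small ambiguity degree condition is satisfied
and the CE or MSE loss is used, then
the PLL optimal classifier $\boldsymbol{g}^*_{\mathrm{PLL}}$ of $\mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g})$ is equivalent to
the ordinary optimal classifier $\boldsymbol{g}^*$ of $\mathcal{R}(\boldsymbol{g})$:
$$\boldsymbol{g}^*_{\mathrm{PLL}} = \boldsymbol{g}^*$$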
10
Estimation Error Bound
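Let $\widehat{\mathcal{R}}_{\mathrm{PLL}}$ be the empirical counterpart of $\mathcal{R}_{\mathrm{PLL}}$, and let
$\hat{\boldsymbol{g}}_{\mathrm{PLL}} = \operatorname{argmin}_{\boldsymbol{g}} \widehat{\mathcal{R}}_{\mathrm{PLL}}(\boldsymbol{g})$ be the empirical risk minimizer.
Let $\mathcal{G}_y$ be a class of real functions, and denote by $\mathfrak{R}_n(\mathcal{G}_y)$ the
Rademacher complexity of $\mathcal{G}_y$ over $p(x)$ with sample size $n$.
Then, for any $\delta > 0$, we have with probability at least $1 - \delta$,
$$\mathcal{R}_{\mathrm{PLL}}(\hat{\boldsymbol{g}}_{\mathrm{PLL}}) - \mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g}^*_{\mathrm{PLL}}) \le 4\sqrt{2}\, c\, L_\ell \sum_{y=1}^{c} \mathfrak{R}_n(\mathcal{G}_y) + 2M \sqrt{\frac{\log(2/\delta)}{2n}}$$
Therefore, $\mathcal{R}_{\mathrm{PLL}}(\hat{\boldsymbol{g}}_{\mathrm{PLL}}) \to \mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g}^*_{\mathrm{PLL}})$
as the number of training data $n \to \infty$, provided the Rademacher complexity terms vanish.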
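To get a feel for how fast the bound tightens, the sketch below evaluates only its last term, 2M sqrt(log(2/δ) / (2n)), for a few sample sizes. The parameter `loss_bound` stands for M, read here as an upper bound on the loss; that reading is an assumption, since the slide does not define the symbol. The Rademacher complexity term is left out because it depends on the model class.

```python
import math


def confidence_term(loss_bound, delta, n):
    """Last term of the bound: 2 * M * sqrt(log(2 / delta) / (2 * n))."""
    return 2.0 * loss_bound * math.sqrt(math.log(2.0 / delta) / (2.0 * n))


for n in (100, 10_000, 1_000_000):
    print(n, confidence_term(loss_bound=1.0, delta=0.05, n=n))
```

The term shrinks at rate 1/sqrt(n): multiplying the sample size by 100 divides it by 10.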
11
Proposed Method
However, the min operator in $\ell_{\mathrm{PLL}}(\boldsymbol{g}(X), S)$ makes optimization difficult:
if a wrong label $i$ is selected in the beginning,
the optimization will focus on the wrong label till the end.
The authors first require that $\ell$ can be decomposed over the labels:
$$\ell\left(\boldsymbol{g}(X), \boldsymbol{e}^{Y}\right) = \sum_{i=1}^{c} \ell\left(g_i(X), e_i^{Y}\right)$$
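Then, the authors relax the min operator with dynamic weights:
$$\widehat{\mathcal{R}}_{\mathrm{PLL}} = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} w_{ij}\, \ell\left(g_j(x_i), e_j^{s_i}\right)$$
where $e_j^{s_i}$ is the $j$-th coordinate of $\boldsymbol{e}^{s_i}$ and $\boldsymbol{e}^{s_i} = \sum_{k \in s_i} \boldsymbol{e}^{k}$.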
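The weighted empirical risk can be sketched as follows for the cross-entropy case, where the per-label term is assumed to be `-log g_j(x_i)` for candidate labels and zero otherwise. The function names and the toy numbers are illustrative and not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_pll_risk(batch_logits, batch_weights):
    """Average over the batch of sum_j w_ij * (-log g_j(x_i)).

    Weights are expected to be zero outside each candidate set, so
    non-candidate labels drop out of the inner sum.
    """
    total = 0.0
    for logits, weights in zip(batch_logits, batch_weights):
        probs = softmax(logits)
        total += sum(-w * math.log(p) for w, p in zip(weights, probs) if w > 0.0)
    return total / len(batch_logits)


batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]
# First example: uniform weights over candidates {0, 1}; second: single candidate {2}.
batch_weights = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
risk = weighted_pll_risk(batch_logits, batch_weights)
```

When every weight vector is one-hot on the true label, this reduces to the ordinary cross-entropy risk.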
12
Proposed Method
Ideally, the true label has weight 1 and every other label has weight 0.
Since the weights are latent, the minimizer of $\widehat{\mathcal{R}}_{\mathrm{PLL}}$ cannot be found directly.
Inspired by the EM algorithm, the authors put more weight on more likely labels:
$$w_{ij} = \begin{cases} \dfrac{g_j(x_i)}{\sum_{k \in s_i} g_k(x_i)}, & j \in s_i \\ 0, & \text{otherwise} \end{cases}$$
If the small ambiguity degree condition is satisfied, models tend to remember the
true labels in the initial epochs, which guides the model towards a discriminative
classifier that gives relatively low losses to the more likely true labels.
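The weight update amounts to renormalizing the model's class probabilities over the candidate set. A minimal sketch, assuming `probs` is a softmax output and labels are 0-based; the function name is hypothetical:

```python
def update_weights(probs, candidates):
    """Renormalize the model's class probabilities over the candidate set."""
    mass = sum(probs[k] for k in candidates)
    return [probs[j] / mass if j in candidates else 0.0 for j in range(len(probs))]


probs = [0.5, 0.375, 0.125]  # assumed model output g(x_i)
weights = update_weights(probs, candidates={1, 2})
print(weights)  # [0.0, 0.75, 0.25]
```

Class 0 has the highest probability but is not a candidate, so it receives zero weight, and the remaining mass is shared in proportion to the model's confidence.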
13
Proposed Method
While they follow the EM algorithm, they merge the E-step and M-step.
The weights can be updated at any epoch, so local convergence within each epoch is not necessary.
Therefore, the method avoids the overfitting issues of EM methods.
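The merged update can be sketched as a single loop in which each gradient step is followed immediately by a weight refresh. Everything below is an illustrative assumption rather than the authors' implementation: a linear softmax model stands in for the network, the loss is weighted cross-entropy, the weights are refreshed after every single-example SGD step, and the toy data and names such as `train_proden_sketch` are made up.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, x):
    """Class probabilities of a linear softmax model (stand-in for g)."""
    return softmax([sum(w_d * x_d for w_d, x_d in zip(row, x)) for row in W])


def train_proden_sketch(xs, candidate_sets, num_classes, epochs=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    dim = len(xs[0])
    W = [[0.0] * dim for _ in range(num_classes)]
    # Start from uniform weights over each candidate set.
    weights = [[1.0 / len(s) if j in s else 0.0 for j in range(num_classes)]
               for s in candidate_sets]
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, s = xs[i], candidate_sets[i]
            # Model step: one SGD update on the weighted CE loss.
            # For softmax + CE with weights summing to 1, dL/dz_k = p_k - w_k.
            probs = predict(W, x)
            for k in range(num_classes):
                grad = probs[k] - weights[i][k]
                for d in range(dim):
                    W[k][d] -= lr * grad * x[d]
            # Weight step, done right away with the updated model:
            # renormalize the new probabilities over the candidate set.
            probs = predict(W, x)
            mass = sum(probs[k] for k in s)
            weights[i] = [probs[j] / mass if j in s else 0.0
                          for j in range(num_classes)]
    return W, weights


# Toy data: three one-hot "instances", each seen with two different candidate sets.
xs = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0], [0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
candidate_sets = [{0, 1}, {0, 2}, {0, 1}, {1, 2}, {0, 2}, {1, 2}]
true_labels = [0, 0, 1, 1, 2, 2]  # used only to inspect the result
W, weights = train_proden_sketch(xs, candidate_sets, num_classes=3)
```

In this toy setup each true label appears in both candidate sets of its instance while each negative label appears in only one, so the weights gradually concentrate on the shared label without any inner loop that runs the model to convergence.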
14
Datasets
The authors used widely used benchmark datasets:
MNIST, Fashion-MNIST, Kuzushiji-MNIST, and CIFAR-10,
and five small datasets from UCI:
Yeast, Texture, Dermatology, Synthetic Control, and 20Newsgroups.
To generate candidate label sets, they randomly flipped each negative label to positive with probability $q$.
Moreover, they used real-world partial-label datasets:
Lost, Birdsong, MSRCv2, Soccer Player, and Yahoo! News.
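The flipping procedure for the benchmark datasets can be sketched as below, assuming that each negative label is flipped independently. The function name, the seed, and the example labels are illustrative; any extra post-processing the authors may apply is not modeled.

```python
import random


def make_candidate_set(true_label, num_classes, q, rng):
    """True label plus each negative label independently with probability q."""
    candidates = {true_label}
    for k in range(num_classes):
        if k != true_label and rng.random() < q:
            candidates.add(k)
    return candidates


rng = random.Random(0)
true_labels = [0, 3, 7, 7, 9]
candidate_sets = [make_candidate_set(y, num_classes=10, q=0.3, rng=rng)
                  for y in true_labels]
```

The true label is always kept, so `q = 0` recovers ordinary labels and `q = 1` makes every class a candidate.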
15
Baselines
They compared the proposed method (PRODEN) with:
• PRODEN-itera: update the label weights every 100 epochs
• PRODEN-sudden: set $w_{ij} = 1$ if $j = \operatorname{argmax}_{k \in s_i} g_k(x_i)$ and 0 otherwise
• PRODEN-naïve: never update the weights; use uniform weights instead
• PN-oracle: train a model with ordinary labels
• PN-decomp: decompose each instance with multiple candidate labels
into multiple instances, each with a single label
• D2CNN: a PLL method based on DNNs
• GA: a complementary-label learning (CLL) method based on DNNs
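Three of the baselines differ from the proposed method only in how labels are weighted or expanded, which a few lines can illustrate. The function names are hypothetical and the code reflects our reading of the one-line descriptions above, not the authors' implementation.

```python
def sudden_weights(probs, candidates):
    """PRODEN-sudden style: all weight on the most probable candidate."""
    best = max(candidates, key=lambda k: probs[k])
    return [1.0 if j == best else 0.0 for j in range(len(probs))]


def naive_weights(probs, candidates):
    """PRODEN-naive style: fixed uniform weights; the model output is ignored."""
    return [1.0 / len(candidates) if j in candidates else 0.0
            for j in range(len(probs))]


def pn_decompose(xs, candidate_sets):
    """PN-decomp style: one (instance, label) pair per candidate label."""
    return [(x, k) for x, s in zip(xs, candidate_sets) for k in sorted(s)]


probs = [0.5, 0.375, 0.125]  # assumed model output for one instance
print(sudden_weights(probs, {1, 2}))     # [0.0, 1.0, 0.0]
print(naive_weights(probs, {1, 2}))      # [0.0, 0.5, 0.5]
print(pn_decompose(["x1"], [{2, 0}]))    # [('x1', 0), ('x1', 2)]
```

The hard assignment in `sudden_weights` is the weighted counterpart of the min operator, while `naive_weights` never uses the model's predictions.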
16
Results on Benchmark Datasets
When 𝑞 = 0.1, PRODEN is always the best method and comparable to PN-oracle.
The performance of PRODEN-itera deteriorates drastically with complex models
because of the overfitting issues.
17
Results on Benchmark Datasets
When 𝑞 = 0.7, PRODEN is still comparable to PN-oracle.
PRODEN consistently outperforms D2CNN and GA.
18
Analysis on the Ambiguity Degree
They also gradually increase $q$ from 0.5 to 0.9 to simulate $\gamma$ ($\gamma \to q$ as $n \to \infty$).
PRODEN tends to be less affected by increased ambiguity.
19
Results on Real-world Datasets
On real-world and small-scale datasets, they compare the proposed method with classical PLL methods
(SURE, CLPL, ECOC, PLSVM, PLkNN, and IPAL),
which can hardly be implemented with DNNs.
20
Results on Small-scale Datasets

 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 

Progressive identification of true labels for partial label learning

  • 1. Progressive Identification of True Labels for Partial-Label Learning. Presenter: 송헌. Fundamental Team: 김동희, 김지연, 김창연, 이근배, 이재윤. Lv, Jiaqi, et al. ICML. 2020.
  • 2. Problem setting: In partial-label learning (PLL), each training instance is associated with a set of candidate labels, exactly one of which is true. The goal of PLL is to reduce the overhead of finding the exact label among ambiguous candidates.
  • 3. Related works: Most existing works are coupled to specific optimization algorithms, so they are difficult to apply to DNNs. D2CNN* is the only work that used DNNs with stochastic optimizers, but it restricted the networks to specific architectures. Complementary-label learning** uses a class that an example does not belong to. Hence, it can be considered an extreme PLL case with $c - 1$ candidate labels. *Yao, et al. "Deep discriminative CNN with temporal ensembling for ambiguously-labeled image classification." AAAI. 2020. **Ishida, Takashi, et al. "Complementary-label learning for arbitrary losses and models." ICML. 2019.
  • 4. Contributions: The authors propose a classifier-consistent risk estimator for PLL and analyze it theoretically. They show that the classifier learned from partially labeled data converges to the optimal one learned from ordinarily labeled data. The authors also propose a model-, loss-, and optimizer-agnostic method for PLL.
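  • 5. Ordinary Multi-class Classification: Let $\mathcal{X} \subseteq \mathbb{R}^d$ be the instance space and $\mathcal{Y} = \{1, 2, \ldots, c\}$ be the label space. Let $p(x, y)$ be the underlying joint density of random variables $(X, Y) \in \mathcal{X} \times \mathcal{Y}$. The goal is to learn a classifier $\boldsymbol{g}: \mathcal{X} \to \mathbb{R}^c$ that minimizes the risk $\mathcal{R}(\boldsymbol{g}) = \mathbb{E}_{(X,Y) \sim p(x,y)}\left[\ell(\boldsymbol{g}(X), \boldsymbol{e}^{Y})\right]$, where $\boldsymbol{e}^{\mathcal{Y}} = \{\boldsymbol{e}^{i} : i \in \mathcal{Y}\}$ denotes the standard canonical vectors.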
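  • 6. Partial-Label Learning: Let the candidate label set $S$ be a subset of the label space that contains the true label, i.e. $S \in \mathcal{P}(\mathcal{Y})$, where $\mathcal{P}(\mathcal{Y})$ is the power set of $\mathcal{Y}$. We therefore need to train a classifier with partially labeled examples $(X, S)$. The PLL risk estimator is defined over $p(x, s)$: $\mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g}) = \mathbb{E}_{(X,S) \sim p(x,s)}\left[\ell_{\mathrm{PLL}}(\boldsymbol{g}(X), S)\right]$, where $\ell_{\mathrm{PLL}}: \mathbb{R}^c \times \mathcal{P}(\mathcal{Y}) \to \mathbb{R}$.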
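  • 7. Classifier-Consistent Risk Estimator: To make $\mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g})$ estimable, an intuitive approach is to use a surrogate loss. The authors assume that only the true label contributes to retrieving the classifier, so they define the PLL loss as the minimal loss over the candidate label set: $\ell_{\mathrm{PLL}}(\boldsymbol{g}(X), S) = \min_{i \in S} \ell(\boldsymbol{g}(X), \boldsymbol{e}^{i})$. This leads to a new risk estimator: $\mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g}) = \mathbb{E}_{(X,S) \sim p(x,s)}\left[\min_{i \in S} \ell(\boldsymbol{g}(X), \boldsymbol{e}^{i})\right]$.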
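
The minimal-loss definition on slide 7 can be illustrated with a small sketch. It assumes a softmax classifier with cross-entropy as the base loss and 0-indexed labels; the function names (`softmax`, `cross_entropy`, `pll_min_loss`) are hypothetical and are not taken from the paper's code.

```python
import math

def softmax(logits):
    """Convert raw scores into class probabilities g_1(x), ..., g_c(x)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Base loss l(g(x), e^label) for a single (0-indexed) label."""
    return -math.log(probs[label])

def pll_min_loss(logits, candidates):
    """PLL loss: the smallest base loss over the candidate label set S."""
    probs = softmax(logits)
    return min(cross_entropy(probs, i) for i in candidates)

# Illustrative values only: three classes, candidate set S = {0, 2}.
logits = [2.0, 0.5, -1.0]
loss = pll_min_loss(logits, {0, 2})
print(loss)
```

Because the minimum picks the candidate the model currently scores highest, the loss for `{0, 2}` equals the ordinary cross-entropy for label 0 in this example.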
  • 8. Lemmas: The ambiguity degree is defined as $\gamma = \sup_{(X,Y) \sim p(x,y),\ S \sim p(s \mid x,y),\ \bar{Y} \in \mathcal{Y},\ \bar{Y} \neq Y} \Pr(\bar{Y} \in S)$. That is, $\gamma$ is the maximum probability that a negative label $\bar{Y}$ co-occurs with the true label $Y$. The small ambiguity degree condition ($\gamma < 1$) implies that, apart from the true label, no other label is included in the candidate label set with probability 1. Moreover, if $\ell$ is the CE or MSE loss, the ordinary optimal classifier $\boldsymbol{g}^*$ satisfies $g_i^*(X) = p(Y = i \mid X)$.
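  • 9. Connection: Under the deterministic scenario, if the small ambiguity degree condition is satisfied and the CE or MSE loss is used, then the PLL optimal classifier $\boldsymbol{g}^*_{\mathrm{PLL}}$ of $\mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g})$ is equivalent to the ordinary optimal classifier $\boldsymbol{g}^*$ of $\mathcal{R}(\boldsymbol{g})$: $\boldsymbol{g}^*_{\mathrm{PLL}} = \boldsymbol{g}^*$.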
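  • 10. Estimation Error Bound: Let $\widehat{\mathcal{R}}_{\mathrm{PLL}}$ be the empirical counterpart of $\mathcal{R}_{\mathrm{PLL}}$, and let $\widehat{\boldsymbol{g}}_{\mathrm{PLL}} = \arg\min \widehat{\mathcal{R}}_{\mathrm{PLL}}(\boldsymbol{g})$ be the empirical risk minimizer. Let $\mathcal{G}_y$ be a class of real functions, and denote its Rademacher complexity over $p(x)$ with sample size $n$ by $\mathfrak{R}_n(\mathcal{G}_y)$. Then, for any $\delta > 0$, with probability at least $1 - \delta$, $\mathcal{R}_{\mathrm{PLL}}(\widehat{\boldsymbol{g}}_{\mathrm{PLL}}) - \mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g}^*_{\mathrm{PLL}}) \le 4\sqrt{2}\, c\, L_\ell \sum_{y=1}^{c} \mathfrak{R}_n(\mathcal{G}_y) + 2M \sqrt{\frac{\log(2/\delta)}{2n}}$, where $L_\ell$ is the Lipschitz constant of the loss and $M$ is an upper bound on the loss. Therefore, $\mathcal{R}_{\mathrm{PLL}}(\widehat{\boldsymbol{g}}_{\mathrm{PLL}}) \to \mathcal{R}_{\mathrm{PLL}}(\boldsymbol{g}^*_{\mathrm{PLL}})$ as the number of training examples $n \to \infty$.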
  • 11. Proposed Method: The min operator in $\ell_{\mathrm{PLL}}(\boldsymbol{g}(X), S)$ makes optimization difficult: if a wrong label $i$ is selected at the beginning, the optimization keeps focusing on that wrong label until the end. The authors first require that $\ell$ can be decomposed over the labels: $\ell(\boldsymbol{g}(X), \boldsymbol{e}^{Y}) = \sum_{i=1}^{c} \ell(g_i(X), e_i^{Y})$. They then relax the min operator with dynamic weights: $\widehat{\mathcal{R}}_{\mathrm{PLL}} = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} w_{i,j}\, \ell(g_j(x_i), e_j^{s_i})$, where $e_j^{s_i}$ is the $j$-th coordinate of $\boldsymbol{e}^{s_i}$ and $\boldsymbol{e}^{s_i} = \sum_{k \in s_i} \boldsymbol{e}^{k}$.
  • 12. Proposed Method: Ideally, the weight is 1 for the true label and 0 otherwise. Since the weights are latent, the minimizer of $\widehat{\mathcal{R}}_{\mathrm{PLL}}$ cannot be found directly. Inspired by the EM algorithm, the authors put more weight on more likely labels: $w_{i,j} = g_j(x_i) / \sum_{k \in s_i} g_k(x_i)$ if $j \in s_i$, and $w_{i,j} = 0$ otherwise. If the small ambiguity degree condition is satisfied, models tend to remember the true labels in the initial epochs, which guides the model towards a discriminative classifier that gives relatively low losses for the more likely true labels.
  • 13. Proposed Method: While the authors follow the EM algorithm, they merge the E-step and the M-step. The weights can be updated at any epoch, so local convergence within each epoch is not necessary. The method therefore avoids the overfitting issues of EM methods.
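
The weight update and weighted loss from slides 11 to 13 can be sketched as follows. This is a minimal illustration, assuming softmax outputs, cross-entropy as the decomposable loss, 0-indexed labels, and uniform initial weights over the candidates; the helper names are hypothetical and this is not the authors' implementation.

```python
import math

def softmax(logits):
    """Convert raw scores into class probabilities g_1(x), ..., g_c(x)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def uniform_weights(num_classes, candidates):
    """Assumed initialization: equal weight on every candidate label."""
    return [1.0 / len(candidates) if j in candidates else 0.0
            for j in range(num_classes)]

def update_weights(probs, candidates):
    """w_j = g_j(x) / sum_{k in s} g_k(x) for j in s, and 0 otherwise."""
    total = sum(probs[k] for k in candidates)
    return [probs[j] / total if j in candidates else 0.0
            for j in range(len(probs))]

def weighted_ce_loss(probs, weights):
    """sum_j w_j * (-log g_j(x)); labels with zero weight are skipped."""
    return sum(-w * math.log(p) for w, p in zip(weights, probs) if w > 0.0)

# Illustrative values only: three classes, candidate set s = {0, 1}.
logits = [2.0, 1.0, 0.0]
candidates = {0, 1}
probs = softmax(logits)
w0 = uniform_weights(len(logits), candidates)  # weights before any update
w1 = update_weights(probs, candidates)         # weights after one update
print(w0, w1, weighted_ce_loss(probs, w1))
```

In a training loop, `update_weights` would be called on the current model outputs after each gradient step rather than after an inner loop has converged, which is how the merged E-step and M-step described above could be realized.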
  • 14. Datasets: The authors used four widely used benchmark datasets (MNIST, Fashion-MNIST, Kuzushiji-MNIST, and CIFAR-10) and five small datasets from UCI (Yeast, Texture, Dermatology, Synthetic Control, and 20Newsgroups). To generate candidate label sets, they randomly flipped each negative label to a positive label with probability $q$. Moreover, they used real-world partial-label datasets: Lost, Birdsong, MSRCv2, Soccer Player, and Yahoo! News.
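
The flipping procedure on slide 14 can be sketched as below. This is an assumption-laden illustration: each negative label is added independently with probability `q`, labels are 0-indexed, and edge cases (for example, a candidate set that ends up containing every label) are not handled. `make_candidate_set` is a hypothetical name.

```python
import random

def make_candidate_set(true_label, num_classes, q, rng):
    """Build a candidate set: the true label plus each negative label w.p. q."""
    candidates = {true_label}
    for label in range(num_classes):
        if label != true_label and rng.random() < q:
            candidates.add(label)
    return candidates

rng = random.Random(0)  # fixed seed so the sketch is reproducible
num_classes, q = 10, 0.3
sets = [make_candidate_set(3, num_classes, q, rng) for _ in range(20000)]

# Empirical frequency with which negative label 7 co-occurs with true label 3.
freq = sum(7 in s for s in sets) / len(sets)
print(freq)
```

With independent flips, the co-occurrence frequency of any negative label approaches `q` as the number of samples grows, which matches the remark on slide 18 that the ambiguity degree tends to `q`.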
  • 15. Baselines: They compared the proposed method (PRODEN) with:
      • PRODEN-itera: updates the label weights every 100 epochs
      • PRODEN-sudden: sets $w_{i,k} = 1$ if $k = \arg\max_{j \in s_i} g_j(x_i)$ and 0 otherwise
      • PRODEN-naïve: never updates the weights and uses uniform weights
      • PN-oracle: trains a model with ordinary labels
      • PN-decomp: decomposes one instance with multiple candidate labels into many instances, each with a single label
      • D2CNN: a PLL method based on DNNs
      • GA: a complementary-label learning (CLL) method based on DNNs
  • 16. Results on Benchmark Datasets: When $q = 0.1$, PRODEN is always the best method and is comparable to PN-oracle. The performance of PRODEN-itera deteriorates drastically with complex models because of overfitting.
  • 17. Results on Benchmark Datasets: When $q = 0.7$, PRODEN is still comparable to PN-oracle, and it consistently outperforms D2CNN and GA.
  • 18. Analysis on the Ambiguity Degree: They also gradually increase $q$ from 0.5 to 0.9 to simulate $\gamma$ ($\gamma \to q$ as $n \to \infty$). PRODEN tends to be less affected by increased ambiguity.
  • 19. Results on Real-world Datasets: On the real-world and small-scale datasets, they compare the proposed method with classical PLL methods (SURE, CLPL, ECOC, PLSVM, PLkNN, and IPAL), which can hardly be implemented with DNNs.