Joint Optimization Framework for Learning with Noisy Labels
Authors: Daiki Tanaka, Daiki Ikami, Toshihiko Yamasaki, Kiyoharu Aizawa
Published: CVPR 2018
Problem
• Many large-scale datasets are collected from websites, but they tend to contain inaccurate labels, which are called noisy labels.
Image: (figure)
Noisy label: dog
Clean label: horse
Goal
• A joint optimization framework that learns the DNN parameters and estimates the true labels at the same time.
• Then, train a usual image classification network on these estimated labels.
Label
• Hard-label space: $H = \{ y : y \in \{0, 1\}^c,\ \mathbf{1}^\top y = 1 \}$
Ex: $y^\top = [0, 1, 0]$ with $c = 3$
• Soft-label space: $S = \{ y : y \in [0, 1]^c,\ \mathbf{1}^\top y = 1 \}$
Ex: $y^\top = [0.2, 0.7, 0.1]$ with $c = 3$
Parameters
c : number of classes
y : label (column vector)
1 : column vector of all ones
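The two label spaces can be checked numerically. The following is a minimal sketch in NumPy; the function names `is_soft_label` and `is_hard_label` and the tolerance `tol` are illustrative assumptions, not part of the slides.

```python
import numpy as np

def is_soft_label(y, tol=1e-8):
    """True if y lies in S: every entry in [0, 1] and the entries sum to 1."""
    y = np.asarray(y, dtype=float)
    in_range = np.all(y >= -tol) and np.all(y <= 1.0 + tol)
    return bool(in_range and abs(y.sum() - 1.0) <= tol)

def is_hard_label(y, tol=1e-8):
    """True if y lies in H: every entry is 0 or 1 and the entries sum to 1."""
    y = np.asarray(y, dtype=float)
    is_binary = np.all((np.abs(y) <= tol) | (np.abs(y - 1.0) <= tol))
    return bool(is_binary and is_soft_label(y, tol))

print(is_hard_label([0, 1, 0]))        # hard label from the slide
print(is_soft_label([0.2, 0.7, 0.1]))  # soft label from the slide
```

Every hard label is also a valid soft label, so H is a subset of S.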
The concept of joint optimization framework
Algorithm 1 Alternating Optimization
for t ← 1 to num_epochs do
    update θ(t+1) by SGD on L(θ(t), Y(t) | X)
    update Y(t+1) from the network prediction (hard-label or soft-label update)
end for
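The alternating loop of Algorithm 1 can be sketched on a toy problem. This is only an illustration under several assumptions that are not in the slides: a linear softmax model stands in for the CNN, one full-batch gradient step per epoch stands in for SGD, only the classification term is used (α = β = 0), the hard-label update is taken to be the one-hot argmax of the prediction, and the soft-label update is taken to be the prediction itself. All function names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def update_labels(probs, mode="hard"):
    """Re-estimate labels from predictions (assumed update rules)."""
    if mode == "soft":
        return probs.copy()                      # soft label = prediction
    hard = np.zeros_like(probs)
    hard[np.arange(len(probs)), probs.argmax(axis=1)] = 1.0
    return hard                                  # hard label = one-hot argmax

def alternating_optimization(X, Y_noisy, num_epochs=20, lr=0.5, mode="hard"):
    n, d = X.shape
    c = Y_noisy.shape[1]
    W = np.zeros((d, c))                         # toy model parameters (theta)
    Y = Y_noisy.astype(float).copy()             # current label estimates
    for _ in range(num_epochs):
        # Step 1: update theta with the labels held fixed.
        # For softmax outputs, the gradient of the KL / cross-entropy term
        # with respect to the logits is (prediction - label).
        probs = softmax(X @ W)
        W -= lr * X.T @ (probs - Y) / n
        # Step 2: update the labels with theta held fixed.
        Y = update_labels(softmax(X @ W), mode)
    return W, Y

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
Y0 = np.eye(3)[rng.integers(0, 3, size=30)]      # random one-hot start labels
W, Y = alternating_optimization(X, Y0, num_epochs=5)
print(W.shape, Y.shape)
```

The sketch keeps the two steps separate so that either one can be swapped out, for example by replacing the gradient step with a real training epoch of a network.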
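Loss – Joint Optimization Framework
►Loss function
► $L(\theta, Y \mid X) = L_c(\theta, Y \mid X) + \alpha L_p(\theta \mid X) + \beta L_e(\theta \mid X)$
►Optimization
► $\arg\min_{\theta, Y} L(\theta, Y \mid X)$
►Parameters
► Y : labels
► X : images
► θ : parameters of network
► α : hyperparameter (weight of $L_p$)
► β : hyperparameter (weight of $L_e$)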
Loss – Joint Optimization Framework
►First term
► $L_c(\theta, Y \mid X) = \frac{1}{n} \sum_{i=1}^{n} D_{KL}\left( y_i \,\|\, s(\theta, x_i) \right)$
►Parameters
► Y : labels
► X : images
► θ : parameters of network
► s : prediction of network
► n : train set size
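The first term can be computed directly from a matrix of labels and a matrix of predictions. A minimal NumPy sketch, assuming each row of `Y` and `S` is a probability vector; the function name `classification_loss` and the clipping constant `eps` are illustrative choices, not taken from the slides.

```python
import numpy as np

def classification_loss(Y, S, eps=1e-12):
    """Mean KL divergence D_KL(y_i || s_i) over the rows of Y and S."""
    Y = np.asarray(Y, dtype=float)
    S = np.asarray(S, dtype=float)
    # Entries with y_ij = 0 contribute nothing (0 * log 0 is taken as 0),
    # so their ratio is set to 1 and the logarithm becomes 0.
    ratio = np.where(Y > 0, Y / np.clip(S, eps, None), 1.0)
    return float(np.mean(np.sum(Y * np.log(ratio), axis=1)))

S = np.array([[0.2, 0.7, 0.1]])
print(classification_loss(S, S))                  # identical distributions give 0
print(classification_loss([[0.0, 1.0, 0.0]], S))  # one-hot label gives -log(0.7)
```

For a one-hot label the value reduces to the negative log-probability of the labelled class, which is the usual cross-entropy loss.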
Loss function – usual image classification network
►Loss function
► $L = -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} y_{ij}^{GT} \log s_j(\theta, x_i)$
►Optimization
► $\arg\min_{\theta} L(\theta \mid X, Y)$
►Parameters
► L : cross entropy between the probability distributions y and s
► n : train set size
► c : number of classes
► Y : labels (ground truth)
► s : prediction of network
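The relation between this cross entropy and the KL-divergence term $L_c$ can be written out in one step. The derivation below is a standard identity, stated here for a single sample with label $y$ and prediction $s$:

```latex
\begin{aligned}
D_{KL}(y \,\|\, s)
  &= \sum_{j=1}^{c} y_j \log \frac{y_j}{s_j} \\
  &= \underbrace{\sum_{j=1}^{c} y_j \log y_j}_{-H(y),\ \text{independent of } \theta}
     \; \underbrace{- \sum_{j=1}^{c} y_j \log s_j}_{\text{cross entropy}}
\end{aligned}
```

When $y$ is fixed, the first sum does not depend on θ, so minimizing the KL divergence over θ gives the same solution as minimizing the cross entropy. For a one-hot $y$ the first sum is exactly zero and the two losses coincide.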
Loss – Joint Optimization Framework
►Second term
► $L_p = \sum_{j=1}^{c} p_j \log \frac{p_j}{\bar{s}_j(\theta, X)}$
► $\bar{s}(\theta, X) = \frac{1}{n} \sum_{i=1}^{n} s(\theta, x_i) \approx \frac{1}{|\mathcal{B}|} \sum_{x \in \mathcal{B}} s(\theta, x)$
►Parameters
► p : prior probability distribution (distribution of classes among all training data)
► X : images
► s : prediction of network
► $\bar{s}$ : mean prediction over the training data
► θ : parameters of network
► c : number of classes
► n : train set size
► $\mathcal{B}$ : mini-batch; $|\mathcal{B}|$ is the batch size (not the hyperparameter β)
Ex:
In CIFAR-10, p is [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], because every class has the same number of images in CIFAR-10.
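The second term compares the prior with the mean prediction of a mini-batch. A minimal NumPy sketch, assuming the rows of `S_batch` are the predicted distributions for one mini-batch; the function name `prior_loss` and the constant `eps` are illustrative.

```python
import numpy as np

def prior_loss(S_batch, p, eps=1e-12):
    """KL divergence between the prior p and the mean prediction of a batch."""
    S_batch = np.asarray(S_batch, dtype=float)
    p = np.asarray(p, dtype=float)
    s_bar = np.clip(S_batch.mean(axis=0), eps, None)  # batch estimate of s-bar
    mask = p > 0                                      # 0 * log 0 is taken as 0
    return float(np.sum(p[mask] * np.log(p[mask] / s_bar[mask])))

p = np.full(4, 0.25)                                  # uniform prior, 4 classes
balanced = np.eye(4)                                  # confident, one per class
collapsed = np.tile([0.97, 0.01, 0.01, 0.01], (4, 1)) # everything is class 0
print(prior_loss(balanced, p))                        # mean matches the prior
print(prior_loss(collapsed, p))                       # positive penalty
```

The balanced batch shows that this term does not penalize confident individual predictions; it only penalizes a batch whose average prediction drifts away from p, for example when the network assigns every image to one class.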
Loss – Joint Optimization Framework
►Third term
► $L_e = -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} s_j(\theta, x_i) \log s_j(\theta, x_i)$
► Ex:
► Epoch t : s = [0.2, 0.8]
► Epoch t+1 : s = [0.1, 0.9]
►Parameters
► X : images
► s : prediction of network
► θ : parameters of network
► c : number of classes
► n : train set size
►Total loss
► $L(\theta, Y \mid X) = L_c(\theta, Y \mid X) + \alpha L_p(\theta \mid X) + \beta L_e(\theta \mid X)$
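The entropy term and the weighted sum can be sketched in a few lines of NumPy. The function names are illustrative, and the values passed for α and β in the last line are arbitrary placeholders, not the settings used by the authors.

```python
import numpy as np

def entropy_loss(S, eps=1e-12):
    """Mean entropy of the predicted distributions (rows of S)."""
    S = np.atleast_2d(np.asarray(S, dtype=float))
    return float(-np.mean(np.sum(S * np.log(np.clip(S, eps, None)), axis=1)))

def total_loss(l_c, l_p, l_e, alpha, beta):
    """Weighted sum L = L_c + alpha * L_p + beta * L_e."""
    return l_c + alpha * l_p + beta * l_e

h_t = entropy_loss([0.2, 0.8])   # epoch t from the slide, about 0.500
h_t1 = entropy_loss([0.1, 0.9])  # epoch t+1 from the slide, about 0.325
print(h_t, h_t1)
print(total_loss(1.0, 0.2, h_t1, alpha=1.0, beta=1.0))  # placeholder weights
```

The entropy falls from epoch t to epoch t+1 in the slide's example, which is the direction this term pushes: each prediction concentrates on a single class.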
Other strategy – large learning rate
►Experiment
► Test accuracy remains high at the end of training when the learning rate is high.
►Parameters
► X-axis : epoch
► Y-axis : test accuracy
► r : noise rate
► lr : learning rate
Experiment on SN-CIFAR10
best : the scores of the epoch where the validation
accuracy is optimal
last : the scores at the end of training
Test accuracy : Performance on test set
Recovery accuracy : Performance on the train set
$y_i = \begin{cases} y_i^{GT} & \text{with probability } 1 - r \\ \text{random one-hot vector} & \text{with probability } r \end{cases}$
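The symmetric-noise rule can be sketched as follows. This is a minimal NumPy illustration, assuming the clean labels are given as integer class indices and that the random one-hot vector is drawn uniformly over all classes; the function name `add_symmetric_noise` and the `seed` argument are hypothetical.

```python
import numpy as np

def add_symmetric_noise(labels, num_classes, r, seed=0):
    """Replace each label by a uniformly random class with probability r."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n = labels.shape[0]
    replace = rng.random(n) < r                      # which samples get noise
    random_labels = rng.integers(0, num_classes, size=n)
    noisy = np.where(replace, random_labels, labels)
    return np.eye(num_classes)[noisy]                # one-hot rows

clean = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Y_noisy = add_symmetric_noise(clean, num_classes=10, r=0.5)
print(Y_noisy.argmax(axis=1))
```

Under this assumption the random class can coincide with the true class, so the expected fraction of labels that actually change is r(1 − 1/c) rather than r.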
Experiment on Clothing1M dataset


Editor's Notes

  1. The paper I want to present is ‘Joint Optimization Framework for Learning with Noisy Labels’. The first author is Daiki Tanaka. The paper was published at CVPR 2018.
  2. Deep neural networks have reached high performance on image classification. However, many datasets are collected from websites, so they tend to contain noisy labels. These noisy labels decrease the performance of the network.
  3. Hence, the authors propose a joint optimization framework for image classification. This framework estimates the true labels for the classification network.
  4. Before we start, note that there are two kinds of labels for image classification. For a hard label, each value in y is either 0 or 1, and the values sum to 1. For a soft label, each value in y is between 0 and 1, and the values sum to 1.
  5. X is the image, Y is the label, CNN is the convolutional neural network for image classification, L is the loss function, and s is the probability prediction of the network, which serves as the soft label; a hard label is a one-hot vector. This framework differs from a usual image classification framework in two ways: the loss function and the labels. The authors do not treat the labels as fixed, because they are noisy. Instead, the labels are updated alternately with the network parameters in each epoch.
  6. Let’s look at the algorithm first; I will explain the loss function later. The algorithm is simple. In each epoch, they update the parameters of the network with the optimizer, and then update the labels using the prediction of the network.
  7. Lc is the KL divergence between the label and the prediction of the network. When y is fixed, minimizing the KL divergence is the same as minimizing the cross entropy. Therefore, this term plays the same role as the loss function of a usual image classification network.
  8. In a usual image classification network, we just use the cross entropy between the label and the prediction of the network, and try to find the parameters theta that minimize the loss function.
  9. The second term is the KL divergence between the prior probability distribution p and the mean probability s bar. S bar is the mean prediction over the training data; in the implementation, they approximate it with the mean over each mini-batch. However, this approximation cannot handle a large number of classes or extremely imbalanced classes. This term makes the predictions of the network follow the distribution p.
  10. The final term is the entropy of the prediction of the network. This term is required in the training loss when soft labels are used. If alpha and beta are both zero and the labels are updated with soft labels, both theta and the labels get stuck in a local optimum and learning does not proceed. To overcome this problem, this term concentrates the probability distribution of each soft label on a single class.
  11. In the experiments, test accuracy remains high at the end of training when the learning rate is high.
  12. Symmetric-noise CIFAR-10 (SN-CIFAR10) is based on the CIFAR-10 dataset: with probability r, the label of an image is replaced by a random one-hot vector. "Best" is the score at the epoch where the validation accuracy is optimal. "Last" is the score at the end of training. Their method reaches the state of the art on CIFAR-10. They also evaluate their method on AN-CIFAR10 and PL-CIFAR, and it performs well.
  13. They use the Clothing1M dataset to examine the performance of their method in a real setting. The images of this dataset are crawled from online shops, and the labels are generated from the text surrounding the images on the websites. The overall accuracy of the noisy labels is 61.54%. The method achieves comparable performance on the Clothing1M dataset.