Joint Optimization Framework for Learning
with Noisy Labels
Authors : Daiki Tanaka, Daiki Ikami,
Toshihiko Yamasaki, Kiyoharu Aizawa
Published at : CVPR 2018
Problem
► Many large-scale datasets are collected from websites, so they tend to contain
inaccurate labels, which are termed noisy labels.
Example (image not shown) :
Noisy label : dog
Clean label : horse
Goal
► A joint optimization framework that learns the DNN parameters and
estimates the true labels at the same time.
► Then, train a usual image classification network on
these estimated labels.
Label
► Hard-label space : $H = \{y : y \in \{0, 1\}^c,\ \mathbf{1}^\top y = 1\}$
Ex : $y^\top = [0, 1, 0]$ with $c = 3$
► Soft-label space : $S = \{y : y \in [0, 1]^c,\ \mathbf{1}^\top y = 1\}$
Ex : $y^\top = [0.2, 0.7, 0.1]$ with $c = 3$
Parameters
c : number of classes
y : label (column vector)
1 : column vector of all ones
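The two label spaces can be checked mechanically. The following is a minimal sketch in plain Python; the function names `is_soft_label` and `is_hard_label` and the tolerance `tol` are illustrative assumptions, not something taken from the slides.

```python
import math

def is_soft_label(y, tol=1e-9):
    """True if y lies in S: every entry is in [0, 1] and the entries sum to 1."""
    in_range = all(-tol <= v <= 1 + tol for v in y)
    return in_range and math.isclose(sum(y), 1.0, abs_tol=tol)

def is_hard_label(y, tol=1e-9):
    """True if y lies in H: a soft label whose entries are all exactly 0 or 1."""
    return is_soft_label(y, tol) and all(v in (0, 1) for v in y)

hard = [0, 1, 0]          # the hard-label example with c = 3
soft = [0.2, 0.7, 0.1]    # the soft-label example with c = 3
```

Every hard label is also a soft label, so H is a subset of S.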
The concept of joint optimization
framework
Algorithm 1 Alternating Optimization
for t ← 1 to num_epochs do
    update θ(t+1) by SGD on L(θ(t),Y(t)|X)
    update Y(t+1) from the network prediction (hard-label
    or soft-label update)
end for
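The alternating loop can be sketched on a toy problem. This is only an illustration under stated assumptions: a linear softmax model stands in for the DNN, only the classification term is optimized (α = β = 0), labels are updated after every epoch, and the hard-label update is taken to be the one-hot argmax of the prediction while the soft-label update copies the prediction. All function names are hypothetical.

```python
import math

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def predict(theta, x):
    """s(theta, x): class probabilities of a linear softmax model (stand-in for the DNN)."""
    return softmax([sum(w * xk for w, xk in zip(row, x)) for row in theta])

def sgd_epoch(theta, X, Y, lr):
    """One pass of per-sample gradient steps on the classification term only.

    For a softmax output, the gradient of D_KL(y || s) w.r.t. the logits is s - y.
    Samples are visited in a fixed order to keep the sketch deterministic.
    """
    for x, y in zip(X, Y):
        s = predict(theta, x)
        for j, row in enumerate(theta):
            g = s[j] - y[j]
            for k in range(len(row)):
                row[k] -= lr * g * x[k]

def update_labels(theta, X, hard=True):
    """Replace every label by the current prediction (one-hot argmax or raw probabilities)."""
    new_Y = []
    for x in X:
        s = predict(theta, x)
        if hard:
            top = max(range(len(s)), key=lambda j: s[j])
            new_Y.append([1.0 if j == top else 0.0 for j in range(len(s))])
        else:
            new_Y.append(s)
    return new_Y

def alternating_optimization(X, Y, num_classes, num_epochs=5, lr=0.5, hard=True):
    theta = [[0.0] * len(X[0]) for _ in range(num_classes)]
    for _ in range(num_epochs):
        sgd_epoch(theta, X, Y, lr)          # update theta with Y fixed
        Y = update_labels(theta, X, hard)   # update Y with theta fixed
    return theta, Y

# Toy data: one feature plus a constant bias input; the second label is flipped.
X = [[-2.0, 1.0], [-1.5, 1.0], [-1.0, 1.0], [1.0, 1.0], [1.5, 1.0], [2.0, 1.0]]
noisy_Y = [[1, 0], [0, 1], [1, 0], [0, 1], [0, 1], [0, 1]]
theta, estimated_Y = alternating_optimization(X, noisy_Y, num_classes=2)
```

On this well-separated toy set the flipped label is overwritten by the model's own prediction after the first epoch. Real noisy datasets are not this forgiving, which is where the two regularization terms come in.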
Loss – Joint Optimization Framework
►Loss function
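► $L(\theta, Y \mid X) = L_c(\theta, Y \mid X) + \alpha L_p(\theta \mid X) + \beta L_e(\theta \mid X)$
►Optimization
► $\min_{\theta, Y} L(\theta, Y \mid X)$
►Parameters
► Y : label
► X : image
► θ : parameters of network
► α : hyperparameter (weight of $L_p$)
► β : hyperparameter (weight of $L_e$)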
Loss – Joint Optimization Framework
►First term
► $L_c(\theta, Y \mid X) = \frac{1}{n} \sum_{i=1}^{n} D_{KL}\big(y_i \,\|\, s(\theta, x_i)\big)$
►Parameters
► Y : label
► X : image
► θ : parameters of network
► s : prediction of network
► n : train set size
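As a small numeric sketch of the first term, the mean KL divergence can be computed directly from lists of labels and predictions. The helper names and the `eps` clamp (used to avoid taking the log of zero) are assumptions made for this illustration.

```python
import math

def kl_divergence(y, s, eps=1e-12):
    """D_KL(y || s) = sum_j y_j * log(y_j / s_j), using the convention 0 * log 0 = 0."""
    return sum(yj * math.log(yj / max(sj, eps)) for yj, sj in zip(y, s) if yj > 0)

def classification_loss(Y, S):
    """L_c: mean KL divergence between each label y_i and the prediction s(theta, x_i)."""
    return sum(kl_divergence(y, s) for y, s in zip(Y, S)) / len(Y)

# A hard label against a soft prediction: the loss reduces to -log(0.7).
loss_hard = classification_loss([[0, 1, 0]], [[0.2, 0.7, 0.1]])
# A label identical to the prediction gives zero loss.
loss_same = classification_loss([[0.2, 0.7, 0.1]], [[0.2, 0.7, 0.1]])
```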
Loss function – usual image
classification network
►Loss function
► $L = -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} y_{ij}^{GT} \log s_j(\theta, x_i)$
►Optimization
► arg min L(θ|X,Y)
►Parameters
► L : cross entropy between the label distribution y and the prediction s
► n : train set size
► c : number of classes
► Y : label (ground truth)
► s : prediction of network
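The link between this cross-entropy loss and the KL-based first term of the joint framework can be written out as a short derivation. It uses only the standard definition of the KL divergence; the labels $y_i$ are held fixed.

```latex
D_{KL}\big(y_i \,\|\, s(\theta, x_i)\big)
  = \sum_{j=1}^{c} y_{ij} \log \frac{y_{ij}}{s_j(\theta, x_i)}
  = \underbrace{-\sum_{j=1}^{c} y_{ij} \log s_j(\theta, x_i)}_{\text{cross entropy}}
    \;-\; \underbrace{\Big(-\sum_{j=1}^{c} y_{ij} \log y_{ij}\Big)}_{\text{entropy of } y_i,\ \text{independent of } \theta}
```

Since the second part does not depend on θ, minimizing the KL divergence over θ with fixed labels is the same as minimizing the cross entropy. For a one-hot label the entropy of $y_i$ is zero and the two losses coincide exactly.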
Loss – Joint Optimization Framework
► Second term
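► $L_p = \sum_{j=1}^{c} p_j \log \frac{p_j}{\bar{s}_j(\theta, X)}$
► $\bar{s}(\theta, X) = \frac{1}{n} \sum_{i=1}^{n} s(\theta, x_i) \approx \frac{1}{|\mathcal{B}|} \sum_{x \in \mathcal{B}} s(\theta, x)$
►Parameters
► p : prior probability distribution (distribution of
classes among all training data)
► X : image
► s : prediction of network
► $\bar{s}$ : mean prediction over the training data, approximated on a mini-batch
► θ : parameter of network
► c : number of classes
► n : train set size
► $\mathcal{B}$ : mini-batch ($|\mathcal{B}|$ : batch size)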
Ex:
In CIFAR-10, p = [0.1, 0.1, 0.1, 0.1,
0.1, 0.1, 0.1, 0.1, 0.1, 0.1], because every
class has the same number of images in
CIFAR-10.
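The second term can be sketched as follows, with the mean prediction computed over whatever batch of predictions is passed in. The names `mean_prediction` and `prior_loss` and the `eps` clamp are assumptions made for this illustration.

```python
import math

def mean_prediction(S):
    """Average the predicted distributions over a (mini-)batch of predictions S."""
    n = len(S)
    return [sum(s[j] for s in S) / n for j in range(len(S[0]))]

def prior_loss(p, S, eps=1e-12):
    """L_p: KL divergence between the prior p and the mean prediction over the batch."""
    s_bar = mean_prediction(S)
    return sum(pj * math.log(pj / max(sj, eps)) for pj, sj in zip(p, s_bar) if pj > 0)

p = [0.5, 0.5]                           # uniform prior over two classes
balanced = [[0.9, 0.1], [0.1, 0.9]]      # predictions average to the prior
collapsed = [[0.9, 0.1], [0.9, 0.1]]     # every sample is assigned to class 0
```

Here `prior_loss(p, balanced)` is zero because the batch average equals the prior, while `prior_loss(p, collapsed)` is positive. The term therefore penalizes the network for assigning nearly all samples to a single class.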
Loss – Joint Optimization Framework
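►Third term
► $L_e = -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} s_j(\theta, x_i) \log s_j(\theta, x_i)$
► Ex: minimizing this term concentrates each prediction on a single class
► Epoch t : s = [0.2, 0.8]
► Epoch t+1 : s = [0.1, 0.9]
►Parameters
► X : image
► s : prediction of network
► θ : parameter of network
► c : number of classes
► n : train set size
$L(\theta, Y \mid X) = L_c(\theta, Y \mid X) + \alpha L_p(\theta \mid X) + \beta L_e(\theta \mid X)$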
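The third term is the mean entropy of the predictions. A minimal sketch, with hypothetical helper names, evaluates it on the two example predictions from this slide.

```python
import math

def entropy(s):
    """Shannon entropy -sum_j s_j * log(s_j), using the convention 0 * log 0 = 0."""
    return -sum(sj * math.log(sj) for sj in s if sj > 0)

def entropy_loss(S):
    """L_e: mean entropy of the predictions over the training samples."""
    return sum(entropy(s) for s in S) / len(S)

h_t = entropy([0.2, 0.8])        # prediction at epoch t, about 0.500
h_t_plus_1 = entropy([0.1, 0.9]) # prediction at epoch t+1, about 0.325
```

The entropy drops as the prediction moves from [0.2, 0.8] to [0.1, 0.9], which is the sharpening effect this term rewards.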
Other strategy – large learning rate
►Experiment
► Test accuracy remains high
at the end of training when
the learning rate is high.
►Parameters
► X-axis : epoch
► Y-axis : test accuracy
► r : noise rate
► lr : learning rate
Experiment on SN-CIFAR10
best : the scores of the epoch where the validation
accuracy is optimal
last : the scores at the end of training
Test accuracy : Performance on test set
Recovery accuracy : Performance on the train set
$y_i = \begin{cases} y_i^{GT} & \text{with probability } 1 - r \\ \text{random one-hot vector} & \text{with probability } r \end{cases}$
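The noise model above can be sketched as a small label-corruption routine. One assumption is made loudly here: the random one-hot vector is drawn uniformly over all c classes, so it may coincide with the true class. The function names are hypothetical.

```python
import random

def one_hot(k, c):
    """One-hot vector of length c with a 1 at index k."""
    return [1 if j == k else 0 for j in range(c)]

def add_symmetric_noise(class_ids, num_classes, r, rng=None):
    """Keep each ground-truth label with probability 1 - r, otherwise draw a random class.

    Assumption: the replacement class is uniform over all classes, true class included.
    """
    rng = rng or random.Random()
    noisy = []
    for k in class_ids:
        if rng.random() < r:
            k = rng.randrange(num_classes)
        noisy.append(one_hot(k, num_classes))
    return noisy

ground_truth = [0, 1, 2, 3] * 5
noisy_labels = add_symmetric_noise(ground_truth, num_classes=4, r=0.5, rng=random.Random(0))
```

Under this assumption the fraction of labels that actually change is smaller than r, since a random draw sometimes lands on the true class.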
Experiment on Clothing1M dataset

Editor's Notes

  1. The paper I want to present is ‘Joint Optimization Framework for Learning with Noisy Labels’ by Daiki Tanaka et al. It was published at CVPR 2018.
  2. Deep neural networks have reached strong performance on image classification. However, many datasets are collected from websites, so they tend to contain noisy labels. These noisy labels decrease the performance of the network.
  3. Hence, the authors propose a joint optimization framework for image classification. The framework estimates the true labels for the classification network.
  4. Before we start: there are two kinds of labels for image classification. For a hard label, each value in y is either 0 or 1, and the values sum to 1. For a soft label, each value in y is between 0 and 1, and the values sum to 1.
  5. X is the image, Y is the label, CNN is the convolutional neural network for image classification, L is the loss function, and S is the network's probability prediction, which serves as a soft label; hard labels use the one-hot format. This framework differs from the usual image classification framework in two ways: the loss function and the labels. Because the labels are noisy, the authors do not treat them as fixed. Instead, the labels are updated in every epoch, alternating with the network parameters.
  6. Let's look at the algorithm first; I will explain the loss function later. The algorithm is simple. In each epoch, the network parameters are updated by the optimizer, and then the labels are updated from the network's predictions.
  7. Lc is the KL divergence between the label and the network's prediction. When y is fixed, minimizing the KL divergence is the same as minimizing the cross entropy. Therefore, this term plays the same role as the loss function of a usual image classification network.
  8. In a usual image classification network, we just use the cross entropy between the label and the network's prediction, and try to find the parameters θ that minimize this loss.
  9. The second term is the KL divergence between the prior probability distribution p and the mean prediction s bar. S bar is the mean prediction over the training data; in the implementation, it is approximated on each mini-batch. However, this approximation cannot handle a large number of classes or extremely imbalanced classes. This term makes the network's predictions follow the distribution p.
  10. The final term is the entropy of the network's predictions. This term is required in the training loss when soft labels are used. If α and β are both zero and the labels are updated with soft labels, θ and the labels get stuck in a local optimum and learning does not proceed. To overcome this problem, this term concentrates the probability distribution of each soft label on a single class.
  11. In the experiments, test accuracy remains high at the end of training when the learning rate is high.
  12. Symmetric-noise CIFAR-10 (SN-CIFAR10) is based on the CIFAR-10 dataset: the label of each image is replaced by a random one with probability r. ‘Best’ is the score at the epoch where the validation accuracy is optimal. ‘Last’ is the score at the end of training. Their method reaches state-of-the-art results on CIFAR-10. They also evaluate their method on AN-CIFAR10 and PL-CIFAR, where it performs well.
  13. They use the Clothing1M dataset to examine the performance of their method in a real setting. The images of this dataset are crawled from online shops, and the labels are generated from the text surrounding the images on the website. The reported accuracy of the noisy labels is 61.54%. Their method achieves comparable performance on the Clothing1M dataset.
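The stalling described in note 10 can be made explicit with a short derivation, written here under the assumption that the soft-label update simply copies the prediction, $y_i = s(\theta, x_i)$, and that α = β = 0.

```latex
y_i = s(\theta, x_i) \ \text{for all } i
\;\Longrightarrow\;
L_c(\theta, Y \mid X)
  = \frac{1}{n} \sum_{i=1}^{n} D_{KL}\big(s(\theta, x_i) \,\|\, s(\theta, x_i)\big)
  = 0
```

Because the KL divergence is never negative, zero is already its minimum for these labels, so the gradient with respect to θ vanishes. The next label update then reproduces the same labels. The entropy term breaks this fixed point by pushing each prediction toward a single class.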