Learning from Noisy Label Distributions
Yuya Yoshikawa
STAIR Lab,
Chiba Institute of Technology, Japan
Standard supervised learning setting
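• Given labeled data $\{(\boldsymbol{x}_n, y_n)\}_{n=1}^{N}$
• Feature vector $\boldsymbol{x}_n \in \mathbb{R}^{D}$
• Label $y_n \in \{1, 2, \dots, M\}$
• Goal: to learn a classifier $f(\boldsymbol{x}; \boldsymbol{W})$, i.e., to estimate $\boldsymbol{W}$
• We consider a linear classifier, i.e., $f(\boldsymbol{x}; \boldsymbol{W}) = \boldsymbol{x}^\top \boldsymbol{W}$,
where the weight matrix $\boldsymbol{W} \in \mathbb{R}^{D \times M}$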
• Estimating 𝑾 needs a lot of labeled data
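As a minimal sketch of the linear classifier on this slide, the snippet below computes the M class scores from a feature vector and a weight matrix and returns the label with the highest score. The argmax decision rule, the function names and the toy numbers are illustrative assumptions; the slide only defines the score function.

```python
def linear_scores(x, W):
    """Return the M class scores x^T W for a D-dimensional x and a D x M matrix W."""
    num_classes = len(W[0])
    return [sum(x[d] * W[d][m] for d in range(len(x))) for m in range(num_classes)]


def predict_label(x, W):
    """Return the 1-based label in {1, ..., M} with the highest score (assumed rule)."""
    scores = linear_scores(x, W)
    return max(range(len(scores)), key=scores.__getitem__) + 1


# Toy example with D = 2 features and M = 3 classes (made-up values).
W = [[1.0, 0.0, -1.0],
     [0.0, 2.0, 1.0]]
x = [0.5, 1.0]
scores = linear_scores(x, W)  # one score per class
label = predict_label(x, W)   # class 2 has the highest score here
```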
If we have no labeled data …
• Give up learning? → No.
• Annotate labels to unlabeled data by hand
• However, annotation is often difficult and expensive
A case where annotation is difficult
• Consider annotating SNS users with their age group (e.g., 20s, 30s, 40s)
• This is very easy if the age is explicitly written in the user's profile
• If not, annotators need to infer the user's age from:
• Profile photos
• Texts (tweets etc.)
• Followers and followees
(Figure caption: 20s? 30s? Difficult…)
Problem setting in this study
• Goal: to learn a classifier 𝑓(𝒙, 𝑾)
• Assumptions:
• There is no labeled data
• Each instance $\boldsymbol{x}_n$ belongs to at least one group
• Each group has a noisy label distribution which can be observed
• Our solution
• Infer the true label distributions of the groups from the noisy ones
• Infer the true label of each instance from the true label distributions
• Learn a classifier 𝑓(𝒙, 𝑾) using the inferred true labels
Illustration of our setting
• Feature vectors $\{\boldsymbol{x}_u \in \mathbb{R}^{D}\}_{u=1}^{U}$ for $U$ instances
• Each instance $u$ has a single label $y_u \in \{1, \dots, M\}$
(the shape of each instance in the figure indicates its label)
• However, the label cannot be observed
• Each instance belongs to at least one group
• For each group, there is a true label distribution (unobserved)
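To make the notion of a group's true label distribution concrete, here is a small sketch that computes it as the empirical distribution of the labels of the group's members. In the actual setting the labels are unobserved, so this only illustrates the definition; the instance ids, group names and labels are hypothetical.

```python
from collections import Counter


def group_label_distribution(labels, members, num_classes):
    """Empirical label distribution over the instances that belong to one group.

    labels: dict mapping instance id -> label in {1, ..., num_classes}
    members: list of instance ids that belong to the group
    """
    if not members:
        raise ValueError("a group needs at least one member")
    counts = Counter(labels[u] for u in members)
    return [counts.get(m, 0) / len(members) for m in range(1, num_classes + 1)]


# Hypothetical toy data: 5 instances, M = 2 labels, 2 overlapping groups.
labels = {"u1": 1, "u2": 1, "u3": 2, "u4": 1, "u5": 2}
groups = {"g1": ["u1", "u2", "u3", "u4", "u5"],
          "g2": ["u1", "u2", "u3", "u5"]}
dist = {g: group_label_distribution(labels, members, 2)
        for g, members in groups.items()}
```

Because an instance may belong to several groups, the same label contributes to several group distributions (here u1, u2, u3 and u5 count in both groups).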
• The true label distributions are distorted by unknown noise
• As a result, we can observe only the noisy label distributions
A typical example: Twitter
(Figure: in the Twitter world, the followers of @BBCWorld have a gender distribution of 60% male and 40% female (true label dist.). @BBCWorld hyperlinks to the BBC News website. In the website world, the gender distribution of the website visitors is 50% male and 50% female (noisy label dist.), i.e., the true distribution distorted by noise.)
• Goal: to learn a classifier that predicts the gender of Twitter users
• Some users follow official accounts such as @BBCWorld (BBC News)
• Each user is an instance
• @BBCWorld is a group
• Users who follow @BBCWorld are the members of the group
• The gender distribution of @BBCWorld's followers cannot be observed
• @BBCWorld has a hyperlink to the BBC News website
• The gender distribution of the website visitors (noisy label dist.) can be obtained from audience measurement services such as Quantcast
• Why is noise generated?
• The Twitter and website worlds have different populations
• Noise models the discrepancy between the populations of the two worlds
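The distortion in this example can be sketched numerically. The snippet assumes that the noise acts as a row-stochastic confusion matrix applied to the true distribution; the matrix entries are hypothetical values chosen so that the 60%/40% distribution of the followers maps to the 50%/50% distribution of the website visitors. They are not estimates from any data.

```python
def apply_noise(true_dist, confusion):
    """Noisy distribution q[j] = sum_i p[i] * C[i][j], where each row of C sums to 1."""
    num_classes = len(true_dist)
    return [sum(true_dist[i] * confusion[i][j] for i in range(num_classes))
            for j in range(num_classes)]


true_dist = [0.6, 0.4]        # male, female among @BBCWorld followers (from the slide)
confusion = [[0.75, 0.25],    # hypothetical: P(observed gender | true male)
             [0.125, 0.875]]  # hypothetical: P(observed gender | true female)
noisy_dist = apply_noise(true_dist, confusion)  # approximately [0.5, 0.5]
```

Many different confusion matrices produce the same noisy distribution from a single group, which is one reason the noise has to be inferred jointly over several groups.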
Related work
• Our study is inspired by [Culotta et al., AAAI 2015]
• Our setting is almost the same as theirs
• Their solution is too simple
• It cannot capture the difference between true and noisy label distributions
• Training: learn a linear regression model $f(\boldsymbol{x}, \boldsymbol{W})$ that predicts label ratios from a feature vector $\boldsymbol{x}$
• Prediction: for a new feature vector $\boldsymbol{x}_{\mathrm{new}}$, return the label that has the highest label ratio predicted by $f(\boldsymbol{x}_{\mathrm{new}}, \boldsymbol{W})$
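The regression-based baseline described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the cited work: it assumes one feature vector per group paired with that group's observed label ratios, fits the weights by plain batch gradient descent on the squared error, and uses made-up toy data. All function names are hypothetical.

```python
def predict_ratios(x, W):
    """Predicted label ratios x^T W for a D-dimensional x and a D x M matrix W."""
    return [sum(x[d] * W[d][m] for d in range(len(x))) for m in range(len(W[0]))]


def fit_linear_regression(X, R, lr=0.1, epochs=2000):
    """Least-squares fit of W by batch gradient descent so that x^T W matches the ratios."""
    D, M = len(X[0]), len(R[0])
    W = [[0.0] * M for _ in range(D)]
    for _ in range(epochs):
        grad = [[0.0] * M for _ in range(D)]
        for x, r in zip(X, R):
            pred = predict_ratios(x, W)
            for d in range(D):
                for m in range(M):
                    grad[d][m] += (pred[m] - r[m]) * x[d]
        for d in range(D):
            for m in range(M):
                W[d][m] -= lr * grad[d][m] / len(X)
    return W


def predict_label(x, W):
    """Return the 1-based label with the highest predicted ratio."""
    ratios = predict_ratios(x, W)
    return max(range(len(ratios)), key=ratios.__getitem__) + 1


# Hypothetical training pairs: one feature vector per group and its observed label ratios.
X = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
R = [[0.8, 0.2], [0.3, 0.7], [0.55, 0.45]]
W = fit_linear_regression(X, R)
label_a = predict_label([0.9, 0.1], W)
label_b = predict_label([0.1, 0.9], W)
```

The targets here are the observed (noisy) ratios themselves, so any distortion in them is learned as if it were the truth.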
• Our contributions
• We formalized the problem studied by Culotta et al. as a machine learning problem
• We proposed a probabilistic generative model specialized for the problem
Proposed approach
• We developed a probabilistic generative model that represents the generative process of the noisy label distributions
Graphical model
(Figure: graphical model with the following nodes)
• Weight matrix for the classifier
• True label of each instance
• Confusion matrix for noise
• Noisy label distributions of groups (observed)
• Group-dependent label for each instance and group
• Feature vector for each instance (observed)
Generative process
(Figure: the generative process of the model)
• $\boldsymbol{\beta} \in \mathbb{R}^{M \times M}$ is determined by the hyper-parameters 𝛼CD and 𝛼C&
• When 𝛼CD > 𝛼C&, strong noise is assumed
• When 𝛼C& > 𝛼CD, weak noise is assumed
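Since the equations of the generative process are not available in this transcript, the following is only a rough simulation sketch built from the variables named in the graphical model. It assumes a softmax link from the scores to the true label, a single fixed confusion matrix shared by all groups, and 0-based labels; none of these choices is confirmed by the slides, and all names and numbers are hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn real-valued scores into probabilities (assumed link function)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_categorical(probs, rng):
    """Draw an index in 0..len(probs)-1 with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def simulate(X, W, confusion, groups, rng):
    """Return (true labels, noisy label distribution of each group)."""
    num_classes = len(W[0])
    true_labels = []
    for x in X:
        scores = [sum(x[d] * W[d][m] for d in range(len(x)))
                  for m in range(num_classes)]
        true_labels.append(sample_categorical(softmax(scores), rng))
    noisy = {}
    for name, members in groups.items():
        counts = [0] * num_classes
        for u in members:
            # Group-dependent label: the true label passed through the confusion matrix.
            t = sample_categorical(confusion[true_labels[u]], rng)
            counts[t] += 1
        noisy[name] = [c / len(members) for c in counts]
    return true_labels, noisy


rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
W = [[2.0, -2.0], [-1.0, 1.0]]
confusion = [[0.75, 0.25], [0.125, 0.875]]
groups = {"g1": list(range(0, 120)), "g2": list(range(80, 200))}
true_labels, noisy = simulate(X, W, confusion, groups, rng)
```

Only the feature vectors, the group memberships and the returned noisy distributions would be visible to a learner; the true labels and the confusion matrix stay hidden.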
Inference: variational Bayes method
Objective function: the log marginal posterior with respect to the weight matrix 𝐖 and the confusion matrix 𝐂
Goal: find 𝐖 and 𝐂 that maximize the objective function
• A mean-field approximation is applied to the objective for efficient computation
• We then estimate 𝐖 and 𝐂 using a quasi-Newton method
Experimental setting
• We experimented on synthetic datasets
• The datasets are generated based on the proposed model
• The purpose is to confirm that the proposed model is superior to the existing methods when the label distributions are distorted by noise
• We created three datasets by varying the hyper-parameter 𝛼C& ∈ {1,10,100}
• The hyper-parameter controls the strength of noise distortion
• When 𝛼C&=1, noise is small, i.e., the difference between true and noisy label distributions is small
• When 𝛼C&=100, noise is large, i.e., the difference between true and noisy label distributions is large
Result
• Regardless of noise strength, the proposed model is consistently superior to the methods proposed by [Culotta et al., AAAI 2015]
Table: Accuracy of true label estimation (# classes 𝑀 = 4), comparing the proposed model with the methods proposed by [Culotta et al., AAAI 2015] under weak and strong noise (table values not reproduced)
Conclusion and future work
• We addressed the problem of learning a classifier from noisy label distributions
• There is no labeled data
• Instead, each instance belongs to at least one group, and each group has a noisy label distribution
• To solve this problem, we proposed a probabilistic generative model
• Future work
• Experiments on real-world datasets
Learning from Noisy Label Distributions (ICANN2017)

  • 1. Learning from Noisy Label Distributions Yuya Yoshikawa STAIR Lab, Chiba Institute of Technology, Japan
  • 2. Standard supervised learning setting • Given labeled data 𝒙", 𝑦" "%& ' • Feature vector 𝒙" ∈ ℝ* • Label 𝑦" ∈ {1,2, … , 𝑀} • Goal: to learn a classifier 𝑓 𝒙; 𝑾 , i.e., to estimate 𝑾 • We consider a linear classifier, i.e., 𝑓 𝒙; 𝑾 = 𝒙5 𝑾 where, weight matrix 𝑾 ∈ ℝ*×7 • Estimating 𝑾 needs a lot of labeled data 2
  • 3. If we have no labeled data … • Give up learning? → No. • Annotate labels to unlabeled data by hand • However, annotation is often difficult and expensive 3
  • 4. A case that annotation is difficult • Consider annotating age (e.g., 20s, 30s, 40s) to SNS users • It’s very easy if the age is explicitly written in users’ profile • If not, annotators need to infer users’ age from: • Profile photos • Texts (tweets etc.) • Followers and followees 4 20s? 30s? difficult…
  • 5. Problem setting in this study • Goal: to learn a classifier 𝑓(𝒙, 𝑾) • Assumptions: • There is no labeled data • Each instance 𝒙" belongs to more than one groups • Each group has a noisy label distribution which can be observed • Our solution • Infer the true label distributions of the groups from the noisy ones • Infer the true label of each instance from the true label distributions • Learn a classifier 𝑓(𝒙, 𝑾) using the true labels 5
  • 7. Illustration of our setting
      • Feature vectors $\{\boldsymbol{x}_u \in \mathbb{R}^{D}\}_{u=1}^{U}$ for $U$ instances
      • Each instance $u$ has a single label $y_u \in \{1, \dots, M\}$ (the shape of each instance indicates its label)
      • However, the label cannot be observed
  • 8. Illustration of our setting
      • Each instance belongs to more than one group
      • For each group, there is a true label distribution (unobserved)
  • 9. Illustration of our setting
      • The true label distributions are distorted by unknown noise
      • As a result, we observe the noisy label distributions
  • 10. A typical example: Twitter
      • [Figure: @BBCWorld in the Twitter world is hyperlinked to the BBC News website in the website world]
      • Twitter world: gender distribution (true label dist.): male 60%, female 40%
      • Website world: gender distribution of the website visitors (noisy label dist.): male 50%, female 50%
      • The true label distribution is distorted by noise
  • 11. A typical example: Twitter
      • [Figure: @BBCWorld in the Twitter world, with its gender distribution (true label dist.): male 60%, female 40%]
      • Goal: to learn a classifier that predicts the gender of Twitter users
      • Some users follow official accounts such as @BBCWorld (BBC News)
      • Each user is an instance
      • @BBCWorld is a group
      • Users who follow @BBCWorld are the members of the group
      • The gender distribution of @BBCWorld cannot be observed
  • 12. A typical example: Twitter
      • [Figure: the true gender distribution in the Twitter world (male 60%, female 40%) is distorted by noise into the gender distribution of the website visitors in the website world (male 50%, female 50%)]
      • @BBCWorld has a hyperlink to the BBC News website
      • The gender distribution of the website visitors (noisy label dist.) can be obtained from audience measurement services such as Quantcast
      • Why is noise generated?
      • The Twitter and website worlds have different populations
      • Noise is used to reconcile the populations of the two worlds
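To make the distortion in this example concrete, here is a small sketch that assumes the noise acts as a row-stochastic confusion matrix (the deck names a confusion matrix for the noise on its graphical-model slide, but the transcript does not give its form). The 60%/40% and 50%/50% figures come from the slide; the matrix entries are hypothetical values chosen only so that one maps to the other.

```python
def distort(true_dist, confusion):
    """Assumed noise model: noisy[k] = sum over m of true_dist[m] * confusion[m][k]."""
    num_classes = len(true_dist)
    return [
        sum(true_dist[m] * confusion[m][k] for m in range(num_classes))
        for k in range(num_classes)
    ]


# True gender distribution of the @BBCWorld followers (male, female), from the slide.
true_dist = [0.6, 0.4]

# Hypothetical confusion matrix: row m gives how true label m is spread over
# the labels observed in the website world. Each row sums to 1.
confusion = [[0.75, 0.25],
             [0.125, 0.875]]

noisy_dist = distort(true_dist, confusion)
print(noisy_dist)  # approximately [0.5, 0.5], up to floating-point rounding
```

Under this assumption, recovering the true distribution amounts to estimating the confusion matrix, which is unknown in the setting of the slides.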
  • 13. Problem setting in this study
      • Goal: to learn a classifier $f(\boldsymbol{x}, \boldsymbol{W})$
      • Assumptions:
      • There is no labeled data
      • Each instance $\boldsymbol{x}_i$ belongs to more than one group
      • Each group has a noisy label distribution, which can be observed
      • Our solution:
      • Infer the true label distributions of the groups from the noisy ones
      • Infer the true label of each instance from the true label distributions
      • Learn a classifier $f(\boldsymbol{x}, \boldsymbol{W})$ using the inferred true labels
  • 14. Related work
      • Our study is inspired by [Culotta et al., AAAI 2015]
      • Our setting is almost the same as theirs
      • Their solution is too simple
      • The solution cannot capture the difference between true and noisy label distributions
      • Training: learn a linear regression model $f(\boldsymbol{x}, \boldsymbol{W})$ that predicts label ratios from a feature vector $\boldsymbol{x}$
      • Prediction: for a new feature vector $\boldsymbol{x}_{\mathrm{new}}$, return the label with the highest ratio predicted by $f(\boldsymbol{x}_{\mathrm{new}}, \boldsymbol{W})$
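The baseline described on this slide can be sketched roughly as follows. This is not the implementation of the cited work: it assumes each training row is a feature vector summarizing one group, paired with that group's observed label ratios, and it fits the linear regression with plain gradient descent. The function names and toy data are hypothetical.

```python
def fit_ratio_regression(X, R, lr=0.5, epochs=500):
    """Least-squares fit of W (D x M) so that x^T W approximates the label ratios."""
    n, D, M = len(X), len(X[0]), len(R[0])
    W = [[0.0] * M for _ in range(D)]
    for _ in range(epochs):
        grad = [[0.0] * M for _ in range(D)]
        for x, r in zip(X, R):
            pred = [sum(x[d] * W[d][m] for d in range(D)) for m in range(M)]
            for d in range(D):
                for m in range(M):
                    grad[d][m] += (pred[m] - r[m]) * x[d] / n
        for d in range(D):
            for m in range(M):
                W[d][m] -= lr * grad[d][m]
    return W


def predict_label(x, W):
    """Return the label in {1, ..., M} with the highest predicted ratio."""
    ratios = [sum(x[d] * W[d][m] for d in range(len(x))) for m in range(len(W[0]))]
    return max(range(len(ratios)), key=lambda m: ratios[m]) + 1


# Made-up data: three group-level feature vectors and their observed label ratios.
X = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
R = [[0.8, 0.2], [0.3, 0.7], [0.55, 0.45]]

W = fit_ratio_regression(X, R)
print(predict_label([1.0, 0.0], W))  # label with the highest predicted ratio
print(predict_label([0.0, 1.0], W))
```

Because the regression targets are the observed (noisy) ratios, nothing in this baseline models how those ratios differ from the true ones, which is the limitation the slide points out.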
  • 15. Related work
      • Our contributions:
      • Formalized the problem of Culotta et al. as a machine learning problem
      • Proposed a probabilistic generative model specialized for the problem
      • Our study is inspired by [Culotta et al., AAAI 2015]
      • Our setting is almost the same as theirs
      • Their solution is too simple
      • The solution cannot capture the difference between true and noisy label distributions
  • 16. Proposed approach
      • Developed a probabilistic generative model that represents the generative process of the noisy label distributions
  • 17. Graphical model
      • Weight matrix for the classifier
      • True label of each instance
      • Confusion matrix for the noise
      • Noisy label distributions of the groups (observed)
      • Group-dependent label for each instance and group
      • Feature vector of each instance (observed)
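  • 19. Generative process
      • $\boldsymbol{\beta} \in \mathbb{R}^{M \times M}$ is determined by [equation not captured in this transcript]
      • When $\alpha_{c2} > \alpha_{c1}$: assume strong noise
      • When $\alpha_{c1} > \alpha_{c2}$: assume weak noise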
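The transcript lists the variables of the model but not its equations, so the following is only a forward simulation under assumed distributions, not the author's specification. It assumes that true labels are drawn from a softmax over the linear scores, that each row of the confusion matrix is drawn from a Dirichlet distribution parameterized by the corresponding row of β, that each group member's group-dependent label is drawn from the confusion-matrix row of its true label, and that a group's noisy label distribution is the histogram of those labels. All function names and toy values are hypothetical, and labels are 0-indexed in the code.

```python
import math
import random


def softmax(scores):
    """Turn real-valued scores into probabilities (shifted for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_categorical(probs, rng):
    """Draw an index k with probability probs[k]."""
    r, acc = rng.random(), 0.0
    for k, p in enumerate(probs):
        acc += p
        if r < acc:
            return k
    return len(probs) - 1  # guard against rounding error


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [g / total for g in draws]


def simulate(X, W, groups, beta, seed=0):
    """Simulate true labels, a confusion matrix, and noisy group distributions."""
    rng = random.Random(seed)
    num_classes = len(W[0])
    # Assumed prior: row m of the confusion matrix ~ Dirichlet(beta[m]).
    confusion = [sample_dirichlet(beta[m], rng) for m in range(num_classes)]
    # Assumed label model: y_u ~ Categorical(softmax(x_u^T W)).
    labels = []
    for x in X:
        s = [sum(x[d] * W[d][m] for d in range(len(x))) for m in range(num_classes)]
        labels.append(sample_categorical(softmax(s), rng))
    # Each member contributes one group-dependent (noisy) label to its group.
    noisy_dists = []
    for members in groups:
        counts = [0] * num_classes
        for u in members:
            counts[sample_categorical(confusion[labels[u]], rng)] += 1
        noisy_dists.append([c / len(members) for c in counts])
    return labels, confusion, noisy_dists


# Toy data: 4 instances, D = 2 features, M = 2 classes, 2 groups (all made up).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
W = [[2.0, -2.0], [-2.0, 2.0]]
groups = [[0, 1, 2], [1, 3]]
beta = [[5.0, 1.0], [1.0, 5.0]]  # larger diagonal: labels are mostly preserved

labels, confusion, noisy_dists = simulate(X, W, groups, beta)
print(labels)
print(noisy_dists)
```

In this sketch, making the off-diagonal entries of `beta` larger than the diagonal ones yields confusion matrices that flip labels more often, which corresponds to a stronger distortion of the group distributions.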
  • 23. Inference: variational Bayes method
      • Objective function: log of the marginal posterior w.r.t. the weight matrix $\mathbf{W}$ and the confusion matrix $\mathbf{C}$
      • Goal: find $\mathbf{W}$ and $\mathbf{C}$ that maximize the objective function
      • A mean-field approximation is applied to the objective for efficient computation
      • Then, we estimate $\mathbf{W}$ and $\mathbf{C}$ by using a quasi-Newton method
  • 24. Experimental setting
      • We experimented on a synthetic dataset
      • The dataset is generated based on the proposed model
      • The purpose is to confirm that the proposed model is superior to the existing methods when the label distributions are distorted by noise
      • We created three datasets by varying the hyper-parameter $\alpha_{c1} \in \{1, 10, 100\}$
      • The hyper-parameter controls the strength of the noise distortion
      • When $\alpha_{c1} = 1$, the noise is small, i.e., the difference between the true and noisy label distributions is small
      • When $\alpha_{c1} = 100$, the noise is large, i.e., the difference between the true and noisy label distributions is large
  • 25. Result
      • Regardless of noise strength, the proposed model is consistently more accurate than the methods proposed by [Culotta et al., AAAI 2015]
      • [Table: Accuracy of true label estimation (number of classes $M = 4$), comparing the proposed model with the methods proposed by [Culotta et al., AAAI 2015] under strong and weak noise; the values are not included in this transcript]
  • 26. Conclusion and future work
      • We addressed the problem of learning a classifier from noisy label distributions
      • There is no labeled data
      • Instead, each instance belongs to more than one group, and each group has a noisy label distribution
      • To solve this problem, we proposed a probabilistic generative model
      • Future work:
      • Experiments on real-world datasets