SlideShare a Scribd company logo
1 of 12
Download to read offline
Multi-Label Classification with
Meta Labels
Jesse Read(1), Antti Puurula(2), Albert Bifet(3)
(1) Aalto University and HIIT, Finland
(2) University of Waikato, New Zealand
(3) Huawei Noah’s Ark Lab, Hong Kong
17 December 2014
Overview
Multi-label classification:
⊆ {beach, sunset, foliage, field, mountain, urban}
• Most multi-label classification methods can be expressed
in a general framework of meta-label classification
• Our work combines labels into meta-labels, so as to learn
dependence efficiently and effectively.
Multi-label Learning
With input variables X, produce predictions for multiple
output variables Y . The basic binary relevance approach,
YCYBYA
X5X4X3X2X1
• does not capture label dependence (among Y -variables)
• does not scale to large number of labels
The label powerset approach models label combinations as
class values in a multi-class problem.
Meta Labels
We introduce a layer of meta-labels,
YCYBYA
YB,CYA,B
X5X4X3X2X1
• captures label dependence
• meta-labels can be fewer than the original number of
labels
• and are deterministically decodable into the labels
Producing Meta Labels
dataset
instance labels
1 B
2 B,C
3 C
4 B
5 A,C
6 A,C
7 A,C
8 A,B,C
9 C
binary relevance
YA YB YC
0 1 0
0 1 1
0 0 1
0 1 0
1 0 1
1 0 1
1 0 1
1 1 1
0 0 1
label powerset
YA,B,C
B
BC
C
B
AC
AC
AC
ABC
C
meta labels
YA,C YB,C
∅ B
∅ BC
∅ C
∅ B
AC C
AC C
AC C
∅ BC
∅ C
• binary relevance: 9 exs, 3 × 2 binary classes
• label powerset: 9 exs, 1 × 5 multi-class
Producing Meta Labels
dataset
instance labels
1 B
2 B,C
3 C
4 B
5 A,C
6 A,C
7 A,C
8 A,B,C
9 C
binary relevance
YA YB YC
0 1 0
0 1 1
0 0 1
0 1 0
1 0 1
1 0 1
1 0 1
1 1 1
0 0 1
label powerset
YA,B,C
B
BC
C
B
AC
AC
AC
ABC
C
meta labels
YA,C YB,C
∅ B
∅ BC
∅ C
∅ B
AC C
AC C
AC C
∅ BC
∅ C
• binary relevance: 9 exs, 3 × 2 binary classes
• label powerset: 9 exs, 1 × 5 multi-class
• pruned meta labels: 9 exs, 2 meta labels, of 2 and 3 values
YAC ∈ {∅, AC}, YBC ∈ {B, C, BC}
(one possible formulation)
Example 2
There is no need to model labels together if there is no strong
dependence between them,
YCYBYA
YCYA,B
X1X2X3X4X5
YA,B ∈ {∅, B, AB}, YC ∈ {∅, C}
(e.g., no strong relation between C and the other labels)
General process for
classification with meta-labels
1 Make a partition (either overlapping or disjoint) of the
label set
2 Relabel the meta-labels, deciding on how many values
each label can take (i.e., possibly pruning some)
3 Train classifiers to predict meta-labels from the input
instances
4 Make predictions into the meta-label space
5 Recombine predictions into the label space
Voting Using Meta Labels
Table : Meta-label Vote, e.g., for YAB ∈ {∅, B, AB}
v A B p(YAB = v|˜x)
∅ 0 0 0.0
B 0 1 0.9
AB 1 1 0.1
PAB 0.1 1.0
Table : Labelset Voting: From Meta-labels to Labels
A B C
PAB 0.1 1.0
PBC 0.7 0.3
k Pk,j 0.1 1.7 0.3
ˆy (> 0.5) 0 1 0
Results
0 5 10 15 20 25 30 35 400.05
0.10
0.15
0.20
0.25
0.30
0.35
0.40
0.45
0.50
Accuracy
RAkEL
RAkELp
0 5 10 15 20 25 30 35 400
500
1000
1500
2000
Runningtime
RAkEL
RAkELp
Figure : On Enron (1700 emails, 53 labels). RAkEL: random
ensembles of ‘label-powerset method’ on subsets of size k
(horizontal axis) vs RAkELp: with pruned meta-labels
Summary
• General framework of meta-labels for multi-label
classification
• Unifies various approaches from the literature
• New models RAkELp and EpRd have large improvement
in running time
• Part of solution for LSHTC4 (1st place) and WISE (2nd
Place) Kaggle challenges
• Code available at
http://meka.sourceforge.net
Multi-Label Classification with
Meta Labels
Jesse Read(1), Antti Puurula(2), Albert Bifet(3)
(1) Aalto University and HIIT, Finland
(2) University of Waikato, New Zealand
(3) Huawei Noah’s Ark Lab, Hong Kong
17 December 2014

More Related Content

What's hot

PCA (Principal component analysis)
PCA (Principal component analysis)PCA (Principal component analysis)
PCA (Principal component analysis)Learnbay Datascience
 
You Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object DetectionYou Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object DetectionDADAJONJURAKUZIEV
 
Michal Erel's SIFT presentation
Michal Erel's SIFT presentationMichal Erel's SIFT presentation
Michal Erel's SIFT presentationwolf
 
What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?Kazuki Yoshida
 
Gradient Boosted trees
Gradient Boosted treesGradient Boosted trees
Gradient Boosted treesNihar Ranjan
 
Linear regression in machine learning
Linear regression in machine learningLinear regression in machine learning
Linear regression in machine learningShajun Nisha
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)Fellowship at Vodafone FutureLab
 
Iris data analysis example in R
Iris data analysis example in RIris data analysis example in R
Iris data analysis example in RDuyen Do
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationVikas Jain
 
Linear regression
Linear regressionLinear regression
Linear regressionMartinHogg9
 
Evaluation of multilabel multi class classification
Evaluation of multilabel multi class classificationEvaluation of multilabel multi class classification
Evaluation of multilabel multi class classificationSridhar Nomula
 
Semi-supervised Learning
Semi-supervised LearningSemi-supervised Learning
Semi-supervised Learningbutest
 
Data Science - Part IX - Support Vector Machine
Data Science - Part IX -  Support Vector MachineData Science - Part IX -  Support Vector Machine
Data Science - Part IX - Support Vector MachineDerek Kane
 
Regularization and variable selection via elastic net
Regularization and variable selection via elastic netRegularization and variable selection via elastic net
Regularization and variable selection via elastic netKyusonLim
 
Principal component analysis and lda
Principal component analysis and ldaPrincipal component analysis and lda
Principal component analysis and ldaSuresh Pokharel
 
decision tree regression
decision tree regressiondecision tree regression
decision tree regressionAkhilesh Joshi
 

What's hot (20)

PCA (Principal component analysis)
PCA (Principal component analysis)PCA (Principal component analysis)
PCA (Principal component analysis)
 
You Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object DetectionYou Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object Detection
 
Michal Erel's SIFT presentation
Michal Erel's SIFT presentationMichal Erel's SIFT presentation
Michal Erel's SIFT presentation
 
Logistic regression
  Logistic regression  Logistic regression
Logistic regression
 
What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?
 
Gradient Boosted trees
Gradient Boosted treesGradient Boosted trees
Gradient Boosted trees
 
Linear regression in machine learning
Linear regression in machine learningLinear regression in machine learning
Linear regression in machine learning
 
Clustering
ClusteringClustering
Clustering
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
Iris data analysis example in R
Iris data analysis example in RIris data analysis example in R
Iris data analysis example in R
 
Presentation on K-Means Clustering
Presentation on K-Means ClusteringPresentation on K-Means Clustering
Presentation on K-Means Clustering
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and Classification
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Evaluation of multilabel multi class classification
Evaluation of multilabel multi class classificationEvaluation of multilabel multi class classification
Evaluation of multilabel multi class classification
 
Semi-supervised Learning
Semi-supervised LearningSemi-supervised Learning
Semi-supervised Learning
 
Data Science - Part IX - Support Vector Machine
Data Science - Part IX -  Support Vector MachineData Science - Part IX -  Support Vector Machine
Data Science - Part IX - Support Vector Machine
 
Regularization and variable selection via elastic net
Regularization and variable selection via elastic netRegularization and variable selection via elastic net
Regularization and variable selection via elastic net
 
Principal component analysis and lda
Principal component analysis and ldaPrincipal component analysis and lda
Principal component analysis and lda
 
Clustering
ClusteringClustering
Clustering
 
decision tree regression
decision tree regressiondecision tree regression
decision tree regression
 

Viewers also liked

MrKNN_Soft Relevance for Multi-label Classification
MrKNN_Soft Relevance for Multi-label ClassificationMrKNN_Soft Relevance for Multi-label Classification
MrKNN_Soft Relevance for Multi-label ClassificationYI-JHEN LIN
 
Tag Extraction Final Presentation - CS185CSpring2014
Tag Extraction Final Presentation - CS185CSpring2014Tag Extraction Final Presentation - CS185CSpring2014
Tag Extraction Final Presentation - CS185CSpring2014Naoki Nakatani
 
Multi-label, Multi-class Classification Using Polylingual Embeddings
Multi-label, Multi-class Classification Using Polylingual EmbeddingsMulti-label, Multi-class Classification Using Polylingual Embeddings
Multi-label, Multi-class Classification Using Polylingual EmbeddingsGeorge Balikas
 
Multi-Class Classification on Cartographic Data(Forest Cover)
Multi-Class Classification on Cartographic Data(Forest Cover)Multi-Class Classification on Cartographic Data(Forest Cover)
Multi-Class Classification on Cartographic Data(Forest Cover)Abhishek Agrawal
 
Voting Based Learning Classifier System for Multi-Label Classification
Voting Based Learning Classifier System for Multi-Label ClassificationVoting Based Learning Classifier System for Multi-Label Classification
Voting Based Learning Classifier System for Multi-Label ClassificationDaniele Loiacono
 
Svm implementation for Health Data
Svm implementation for Health DataSvm implementation for Health Data
Svm implementation for Health DataAbhishek Agrawal
 
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...Toshiki Sakai
 
Naïve multi label classification of you tube comments using
Naïve multi label classification of you tube comments usingNaïve multi label classification of you tube comments using
Naïve multi label classification of you tube comments usingNidhi Baranwal
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersAlbert Bifet
 
Data Classification Presentation
Data Classification PresentationData Classification Presentation
Data Classification PresentationDerroylo
 
Introduction to Text Mining
Introduction to Text MiningIntroduction to Text Mining
Introduction to Text MiningMinha Hwang
 
Introduction to Sentiment Analysis
Introduction to Sentiment AnalysisIntroduction to Sentiment Analysis
Introduction to Sentiment AnalysisJaganadh Gopinadhan
 

Viewers also liked (15)

MrKNN_Soft Relevance for Multi-label Classification
MrKNN_Soft Relevance for Multi-label ClassificationMrKNN_Soft Relevance for Multi-label Classification
MrKNN_Soft Relevance for Multi-label Classification
 
Tag Extraction Final Presentation - CS185CSpring2014
Tag Extraction Final Presentation - CS185CSpring2014Tag Extraction Final Presentation - CS185CSpring2014
Tag Extraction Final Presentation - CS185CSpring2014
 
Multi-label, Multi-class Classification Using Polylingual Embeddings
Multi-label, Multi-class Classification Using Polylingual EmbeddingsMulti-label, Multi-class Classification Using Polylingual Embeddings
Multi-label, Multi-class Classification Using Polylingual Embeddings
 
Multi-Class Classification on Cartographic Data(Forest Cover)
Multi-Class Classification on Cartographic Data(Forest Cover)Multi-Class Classification on Cartographic Data(Forest Cover)
Multi-Class Classification on Cartographic Data(Forest Cover)
 
Voting Based Learning Classifier System for Multi-Label Classification
Voting Based Learning Classifier System for Multi-Label ClassificationVoting Based Learning Classifier System for Multi-Label Classification
Voting Based Learning Classifier System for Multi-Label Classification
 
Svm implementation for Health Data
Svm implementation for Health DataSvm implementation for Health Data
Svm implementation for Health Data
 
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
 
Naïve multi label classification of you tube comments using
Naïve multi label classification of you tube comments usingNaïve multi label classification of you tube comments using
Naïve multi label classification of you tube comments using
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream Classifiers
 
Text mining
Text miningText mining
Text mining
 
Spam Filtering
Spam FilteringSpam Filtering
Spam Filtering
 
Data Classification Presentation
Data Classification PresentationData Classification Presentation
Data Classification Presentation
 
Introduction to Text Mining
Introduction to Text MiningIntroduction to Text Mining
Introduction to Text Mining
 
Textmining Introduction
Textmining IntroductionTextmining Introduction
Textmining Introduction
 
Introduction to Sentiment Analysis
Introduction to Sentiment AnalysisIntroduction to Sentiment Analysis
Introduction to Sentiment Analysis
 

More from Albert Bifet

Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream miningAlbert Bifet
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 Albert Bifet
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAAlbert Bifet
 
Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkAlbert Bifet
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data ScienceAlbert Bifet
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataAlbert Bifet
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data ScienceAlbert Bifet
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data ManagementAlbert Bifet
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream MiningAlbert Bifet
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsAlbert Bifet
 
Pitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themPitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themAlbert Bifet
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.Albert Bifet
 
Efficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsEfficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsAlbert Bifet
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real TimeAlbert Bifet
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real TimeAlbert Bifet
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsAlbert Bifet
 
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and SolutionsPAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and SolutionsAlbert Bifet
 
Sentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataSentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataAlbert Bifet
 
Leveraging Bagging for Evolving Data Streams
Leveraging Bagging for Evolving Data StreamsLeveraging Bagging for Evolving Data Streams
Leveraging Bagging for Evolving Data StreamsAlbert Bifet
 
Fast Perceptron Decision Tree Learning from Evolving Data Streams
Fast Perceptron Decision Tree Learning from Evolving Data StreamsFast Perceptron Decision Tree Learning from Evolving Data Streams
Fast Perceptron Decision Tree Learning from Evolving Data StreamsAlbert Bifet
 

More from Albert Bifet (20)

Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 
Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache Flink
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data Science
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream Mining
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
 
Pitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themPitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid them
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.
 
Efficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsEfficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive Windows
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data Streams
 
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and SolutionsPAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
 
Sentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataSentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming Data
 
Leveraging Bagging for Evolving Data Streams
Leveraging Bagging for Evolving Data StreamsLeveraging Bagging for Evolving Data Streams
Leveraging Bagging for Evolving Data Streams
 
Fast Perceptron Decision Tree Learning from Evolving Data Streams
Fast Perceptron Decision Tree Learning from Evolving Data StreamsFast Perceptron Decision Tree Learning from Evolving Data Streams
Fast Perceptron Decision Tree Learning from Evolving Data Streams
 

Recently uploaded

chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 

Recently uploaded (20)

chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 

Multi-label Classification with Meta-labels

  • 1. Multi-Label Classification with Meta Labels Jesse Read(1), Antti Puurula(2), Albert Bifet(3) (1) Aalto University and HIIT, Finland (2) University of Waikato, New Zealand (3) Huawei Noah’s Ark Lab, Hong Kong 17 December 2014
  • 2. Overview Multi-label classification: ⊆ {beach, sunset, foliage, field, mountain, urban} • Most multi-label classification methods can be expressed in a general framework of meta-label classification • Our work combines labels into meta-labels, so as to learn dependence efficiently and effectively.
  • 3. Multi-label Learning With input variables X, produce predictions for multiple output variables Y . The basic binary relevance approach, YCYBYA X5X4X3X2X1 • does not capture label dependence (among Y -variables) • does not scale to large number of labels The label powerset approach models label combinations as class values in a multi-class problem.
  • 4. Meta Labels We introduce a layer of meta-labels, YCYBYA YB,CYA,B X5X4X3X2X1 • captures label dependence • meta-labels can be fewer than the original number of labels • and are deterministically decodable into the labels
  • 5. Producing Meta Labels dataset instance labels 1 B 2 B,C 3 C 4 B 5 A,C 6 A,C 7 A,C 8 A,B,C 9 C binary relevance YA YB YC 0 1 0 0 1 1 0 0 1 0 1 0 1 0 1 1 0 1 1 0 1 1 1 1 0 0 1 label powerset YA,B,C B BC C B AC AC AC ABC C meta labels YA,C YB,C ∅ B ∅ BC ∅ C ∅ B AC C AC C AC C ∅ BC ∅ C • binary relevance: 9 exs, 3 × 2 binary classes • label powerset: 9 exs, 1 × 5 multi-class
  • 6. Producing Meta Labels dataset instance labels 1 B 2 B,C 3 C 4 B 5 A,C 6 A,C 7 A,C 8 A,B,C 9 C binary relevance YA YB YC 0 1 0 0 1 1 0 0 1 0 1 0 1 0 1 1 0 1 1 0 1 1 1 1 0 0 1 label powerset YA,B,C B BC C B AC AC AC ABC C meta labels YA,C YB,C ∅ B ∅ BC ∅ C ∅ B AC C AC C AC C ∅ BC ∅ C • binary relevance: 9 exs, 3 × 2 binary classes • label powerset: 9 exs, 1 × 5 multi-class • pruned meta labels: 9 exs, 2 meta labels, of 2 and 3 values YAC ∈ {∅, AC}, YBC ∈ {B, C, BC} (one possible formulation)
  • 7. Example 2 There is no need to model labels together if there is no strong dependence between them, YCYBYA YCYA,B X1X2X3X4X5 YA,B ∈ {∅, B, AB}, YC ∈ {∅, C} (e.g., no strong relation between C and the other labels)
  • 8. General process for classification with meta-labels 1 Make a partition (either overlapping or disjoint) of the label set 2 Relabel the meta-labels, deciding on how many values each label can take (i.e., possibly pruning some) 3 Train classifiers to predict meta-labels from the input instances 4 Make predictions into the meta-label space 5 Recombine predictions into the label space
  • 9. Voting Using Meta Labels Table : Meta-label Vote, e.g., for YAB ∈ {∅, B, AB} v A B p(YAB = v|˜x) ∅ 0 0 0.0 B 0 1 0.9 AB 1 1 0.1 PAB 0.1 1.0 Table : Labelset Voting: From Meta-labels to Labels A B C PAB 0.1 1.0 PBC 0.7 0.3 k Pk,j 0.1 1.7 0.3 ˆy (> 0.5) 0 1 0
  • 10. Results 0 5 10 15 20 25 30 35 400.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 Accuracy RAkEL RAkELp 0 5 10 15 20 25 30 35 400 500 1000 1500 2000 Runningtime RAkEL RAkELp Figure : On Enron (1700 emails, 53 labels). RAkEL: random ensembles of ‘label-powerset method’ on subsets of size k (horizontal axis) vs RAkELp: with pruned meta-labels
  • 11. Summary • General framework of meta-labels for multi-label classification • Unifies various approaches from the literature • New models RAkELp and EpRd have large improvement in running time • Part of solution for LSHTC4 (1st place) and WISE (2nd Place) Kaggle challenges • Code available at http://meka.sourceforge.net
  • 12. Multi-Label Classification with Meta Labels Jesse Read(1), Antti Puurula(2), Albert Bifet(3) (1) Aalto University and HIIT, Finland (2) University of Waikato, New Zealand (3) Huawei Noah’s Ark Lab, Hong Kong 17 December 2014