NAACL 2018
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces
Isabelle Augenstein*, Sebastian Ruder*, Anders Søgaard
*equal contributions
augenstein@di.ku.dk
@IAugenstein
http://isabelleaugenstein.github.io/
Problem
- Different NLU tasks (e.g. stance detection, aspect-based sentiment analysis, natural language inference)
- Limited training data for most individual tasks
- However:
  - they can be modelled with the same base neural model
  - they are semantically related
  - they have similar labels
- How to exploit synergies between those tasks?
Datasets and Tasks
- Topic-based sentiment analysis (2-way, 5-way)
  - 2-way: negative, positive
  - 5-way: highly negative, negative, neutral, positive, highly positive
- Target-dependent sentiment analysis: negative, neutral, positive
- Aspect-based sentiment analysis (Restaurants, Laptops): negative, neutral, positive
- Stance detection: against, none, favor
- Fake news detection: disagree, unrelated, discuss, agree
- Natural language inference: contradiction, neutral, entailment
Dataset Examples
Aspect-based sentiment analysis (Restaurants)
Text: For the price, you cannot eat this well in Manhattan
Aspect: restaurant prices
Label: positive
Natural language inference
Premise: Fun for only children
Hypothesis: Fun for adults and children
Label: contradiction
Multi-Task Learning
- Separate inputs for each task
- Shared hidden layers
- Separate output layers + classification functions
- Negative log-likelihood objectives
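The architecture on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the class name `HardSharingMTL`, the single tanh layer standing in for the shared encoder, the toy dimensions and the random untrained weights are all hypothetical and are not taken from the talk.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class HardSharingMTL:
    """Shared hidden layer plus one separate output layer per task (hypothetical sketch)."""

    def __init__(self, input_dim, hidden_dim, labels_per_task, seed=0):
        rng = random.Random(seed)
        # Parameters shared by all tasks.
        self.shared = init_matrix(hidden_dim, input_dim, rng)
        # One task-specific output layer per task, sized to that task's label set.
        self.heads = {task: init_matrix(n, hidden_dim, rng)
                      for task, n in labels_per_task.items()}

    def predict_proba(self, task, features):
        hidden = [math.tanh(v) for v in linear(self.shared, features)]
        return softmax(linear(self.heads[task], hidden))

    def nll(self, task, features, gold_index):
        """Negative log-likelihood of the gold label for one instance."""
        return -math.log(self.predict_proba(task, features)[gold_index])

# Two tasks with disparate label spaces (3 and 4 labels).
model = HardSharingMTL(input_dim=4, hidden_dim=3,
                       labels_per_task={"stance": 3, "fnc": 4})
x = [0.5, -1.0, 0.25, 2.0]
stance_probs = model.predict_proba("stance", x)
fnc_probs = model.predict_proba("fnc", x)
loss = model.nll("stance", x, gold_index=2)
```

Each task gets its own output layer because the label sets differ in size and meaning; only the hidden layer is shared.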
Goal: Exploiting Synergies between Tasks
- Modelling tasks in a joint label space
- Label Transfer Network that learns to transfer labels between tasks
- Use semi-supervised learning, trained end-to-end with multi-task learning model
- Extensive evaluation on a set of pairwise sequence classification tasks
Related Work
- Learning task similarities
- Clustering of labels
- Inducing shared prior
- Learning grouping
- Only works for tasks with same label spaces
- Label transformations
- Distributional information
- Correlation analysis
- Typically prior to model training
Related Work
- Multi-task learning with neural networks
- Hard parameter sharing
- Different sharing structures
- Does not take similarities between label spaces into account
- Semi-supervised learning
- Self-training
- Tri-training
- Co-forest
- Assumes same label spaces
Model 1: Label Embedding Layer
20/06/2016 13
Label Embedding Layer
Label embedding space
Prediction with label compatibility function:
c(l, h) = l · h
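As a rough illustration of the compatibility function c(l, h) = l · h, the sketch below scores a hidden representation against labels from two tasks that live in one joint embedding space. The 2-dimensional embedding values, the vector `h` and the helper names are made up for the example; in the model the embeddings are learned.

```python
import math

# Hypothetical label embeddings in a joint space, keyed by (task, label).
label_embeddings = {
    ("stance", "against"): [-1.0, 0.2],
    ("stance", "none"): [0.0, -0.5],
    ("stance", "favor"): [1.0, 0.2],
    ("nli", "contradiction"): [-0.9, 0.1],
    ("nli", "neutral"): [0.1, -0.4],
    ("nli", "entailment"): [0.8, 0.3],
}

def compatibility(l, h):
    """c(l, h) = l . h, the dot product of label embedding and hidden state."""
    return sum(a * b for a, b in zip(l, h))

def predict(task, h):
    """Softmax over the compatibility scores of one task's labels."""
    labels = [lab for (t, lab) in label_embeddings if t == task]
    scores = [compatibility(label_embeddings[(task, lab)], h) for lab in labels]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return {lab: e / total for lab, e in zip(labels, exps)}

def most_compatible_label(h):
    """Because all labels share one space, any label can be scored against h."""
    return max(label_embeddings, key=lambda key: compatibility(label_embeddings[key], h))

h = [0.9, 0.1]  # hypothetical hidden representation of one instance
stance_probs = predict("stance", h)
nli_probs = predict("nli", h)
best_overall = most_compatible_label(h)
```

With these toy values the instance is closest to `favor` among the stance labels and to `entailment` among the NLI labels, which is the kind of cross-task label relationship the joint space is meant to capture.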
Label Embeddings
- Model relationships between labels in the joint embedding space
- Might lead to downstream improvements
- Crucial: the compatibility between an instance and any label can now be measured
- This can be used to learn to transfer labels across datasets
Model 2: Label Transfer Network
Label Transfer Network
Goal: learn to produce pseudo labels for target task
$\mathrm{LTN}_T = \mathrm{MLP}([o_1, \ldots, o_{T-1}])$
$o_i = \sum_{j=1}^{L_i} p_j^{T_i} \, l_j$
- Output label embedding $o_i$ of task $T_i$: sum of the task’s label embeddings $l_j$ weighted with their probability $p_j^{T_i}$
- Trained on labelled target task data
- Negative log-likelihood objective $\mathcal{L}_{\mathrm{LTN}}$ to produce a pseudo-label for the target task
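The two formulas above can be traced with a small worked example. This is a sketch under assumptions: the auxiliary tasks, their probabilities and label embeddings are invented numbers, and the one-hidden-layer MLP uses random untrained weights, so the pseudo-label it produces is arbitrary.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def output_label_embedding(probs, embeddings):
    """o_i = sum_j p_j * l_j: label embeddings weighted by predicted probability."""
    dim = len(embeddings[0])
    return [sum(p * e[d] for p, e in zip(probs, embeddings)) for d in range(dim)]

def mlp(x, w1, w2):
    """One tanh hidden layer followed by a softmax over target-task labels."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
    return softmax([sum(w * v for w, v in zip(row, hidden)) for row in w2])

# Hypothetical label embeddings and predicted distributions for two auxiliary tasks.
aux_embeddings = {
    "nli": [[-0.9, 0.1], [0.1, -0.4], [0.8, 0.3]],
    "topic2": [[-1.0, 0.0], [1.0, 0.0]],
}
aux_probs = {"nli": [0.1, 0.2, 0.7], "topic2": [0.25, 0.75]}

# Concatenate the output label embeddings [o_1, ..., o_{T-1}].
ltn_input = []
for task in ("nli", "topic2"):
    ltn_input += output_label_embedding(aux_probs[task], aux_embeddings[task])

# Random, untrained MLP weights (assumption): 4 inputs -> 5 hidden -> 3 target labels.
rng = random.Random(0)
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(5)]
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(5)] for _ in range(3)]

pseudo_label_dist = mlp(ltn_input, w1, w2)
pseudo_label = max(range(3), key=lambda k: pseudo_label_dist[k])
```

In training, the MLP weights would be fitted with the negative log-likelihood objective on labelled target-task data; the sketch only shows how the input is assembled and passed through.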
Model 3: Semi-Supervised MTL
Semi-Supervised MTL
Goal: relabel aux task data as main task data using LTN
- LTN can be used to produce pseudo-labels for aux or unlabelled instances
- Train the target task model on additional pseudo-labelled data, added iteratively
- Additional loss: minimise the mean squared error between the model predictions $p^{T_i}$ and pseudo-labels $z^{T_i}$ produced by LTN
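A minimal sketch of the additional loss term, assuming both the model prediction and the pseudo-label are probability vectors over the target task's labels. The example vectors are invented, and averaging over the label dimension is an assumption rather than something the slide specifies.

```python
def mean_squared_error(predictions, pseudo_labels):
    """MSE between model predictions p and LTN pseudo-labels z for one instance."""
    squared = [(p - z) ** 2 for p, z in zip(predictions, pseudo_labels)]
    return sum(squared) / len(squared)

# Hypothetical distributions over three target-task labels.
p = [0.7, 0.2, 0.1]  # target task model prediction
z = [0.5, 0.3, 0.2]  # pseudo-label produced by the LTN

pseudo_loss = mean_squared_error(p, z)
```

Here the squared differences are 0.04, 0.01 and 0.01, so the loss is about 0.02; it shrinks as the target model's predictions move towards the LTN's pseudo-labels.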
Best-Performing Aux Tasks
Main task   Aux tasks
Topic-2     FNC-1, MultiNLI, Target
Topic-5     FNC-1, MultiNLI, ABSA-L, Target
Target      FNC-1, MultiNLI, Topic-5
Stance      FNC-1, MultiNLI, Target
ABSA-L      Topic-5
ABSA-R      Topic-5, ABSA-L, Target
FNC-1       Stance, MultiNLI, Topic-5, ABSA-R, Target
MultiNLI    Topic-5
Trends:
• Target used by all Twitter main tasks
• Tasks with a higher number of labels (e.g. Topic-5) are used more often
• Tasks with more training data (FNC-1, MultiNLI) are used more often
Relabelling
Overall Results
- MTL models outperform single-task learning (STL) models
- Label embeddings improve performance
- New state of the art on topic-based sentiment analysis
- Learning to relabel data jointly with MTL tends not to improve performance further
- Future work: use relabelling model to label unlabelled data instead
Thank you!
Datasets and Tasks
Topic-based sentiment analysis:
Tweet: No power at home, sat in the dark listening to AC/DC in the hope it’ll make the electricity come back again
Topic: AC/DC
Label: positive
Target-dependent sentiment analysis:
Text: how do you like settlers of catan for the wii?
Target: wii
Label: neutral
Aspect-based sentiment analysis:
Text: For the price, you cannot eat this well in Manhattan
Aspects: restaurant prices, food quality
Label: positive
Stance detection:
Tweet: Be prepared - if we continue the policies of the liberal left, we will be #Greece
Target: Donald Trump
Label: favor
Fake news detection:
Document: Dino Ferrari hooked the whopper wels catfish, (...), which could be the biggest in the world.
Headline: Fisherman lands 19 STONE catfish which could be the biggest in the world to be hooked
Label: agree
Natural language inference:
Premise: Fun for only children
Hypothesis: Fun for adults and children
Label: contradiction
Label Transfer Network (with or without semi-supervision)
Label Embedding Layer
Label embedding space
Prediction with label compatibility function c(·, ·) that measures similarity between label embedding l and hidden representation h:
c(l, h) = l · h
Learning with Limited Labelled Data: Why?
General Challenges
- Manually annotating training data is expensive
- Only a few large NLP datasets
- New tasks and domains
- Domain drift
Multilingual and Diversity Aspects
- Underrepresented languages
- Dialects
Learning with Limited Labelled Data: How?
- Domain Adaptation
- Weakly Supervised Learning
- Distant Supervision
- Transfer Learning
- Multi-Task Learning
- Unsupervised Learning

Semantic-Aware Code Model: Elevating the Future of Software Development
Baishakhi Ray
 
Accelerating Migrations = Recommendations
Accelerating Migrations = RecommendationsAccelerating Migrations = Recommendations
Accelerating Migrations = Recommendations
isBullShit
 
It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...
Zilliz
 
UX Webinar Series: Aligning Authentication Experiences with Business Goals
UX Webinar Series: Aligning Authentication Experiences with Business GoalsUX Webinar Series: Aligning Authentication Experiences with Business Goals
UX Webinar Series: Aligning Authentication Experiences with Business Goals
FIDO Alliance
 
Uncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in LibrariesUncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in Libraries
Brian Pichman
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
FIDO Alliance
 

Recently uploaded (20)

Perth MuleSoft Meetup July 2024
Perth MuleSoft Meetup July 2024Perth MuleSoft Meetup July 2024
Perth MuleSoft Meetup July 2024
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
 
Improving Learning Content Efficiency with Reusable Learning Content
Improving Learning Content Efficiency with Reusable Learning ContentImproving Learning Content Efficiency with Reusable Learning Content
Improving Learning Content Efficiency with Reusable Learning Content
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
 
Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024
 
Google I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged SlidesGoogle I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged Slides
 
NVIDIA at Breakthrough Discuss for Space Exploration
NVIDIA at Breakthrough Discuss for Space ExplorationNVIDIA at Breakthrough Discuss for Space Exploration
NVIDIA at Breakthrough Discuss for Space Exploration
 
kk vathada _digital transformation frameworks_2024.pdf
kk vathada _digital transformation frameworks_2024.pdfkk vathada _digital transformation frameworks_2024.pdf
kk vathada _digital transformation frameworks_2024.pdf
 
Intel Unveils Core Ultra 200V Lunar chip .pdf
Intel Unveils Core Ultra 200V Lunar chip .pdfIntel Unveils Core Ultra 200V Lunar chip .pdf
Intel Unveils Core Ultra 200V Lunar chip .pdf
 
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
 
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
 
Accelerating Migrations = Recommendations
Accelerating Migrations = RecommendationsAccelerating Migrations = Recommendations
Accelerating Migrations = Recommendations
 
It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...
 
UX Webinar Series: Aligning Authentication Experiences with Business Goals
UX Webinar Series: Aligning Authentication Experiences with Business GoalsUX Webinar Series: Aligning Authentication Experiences with Business Goals
UX Webinar Series: Aligning Authentication Experiences with Business Goals
 
Uncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in LibrariesUncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in Libraries
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
 

Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces

  • 1. Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces. Isabelle Augenstein*, Sebastian Ruder*, Anders Søgaard (*equal contributions). NAACL 2018. Contact: augenstein@di.ku.dk, @IAugenstein, http://isabelleaugenstein.github.io/
  • 2. Problem: There are many different NLU tasks (e.g. stance detection, aspect-based sentiment analysis, natural language inference), and most individual tasks have limited training data. However, they can be modelled with the same base neural model, they are semantically related, and they have similar labels. How can the synergies between those tasks be exploited?
  • 3. Datasets and Tasks: Topic-based sentiment analysis (2-way: negative, positive; 5-way: highly negative, negative, neutral, positive, highly positive). Target-dependent sentiment analysis (negative, neutral, positive). Aspect-based sentiment analysis, Restaurants and Laptops (negative, neutral, positive). Stance detection (against, none, favor). Fake news detection (disagree, unrelated, discuss, agree). Natural language inference (contradiction, neutral, entailment).
  • 4. Dataset Examples: Aspect-based sentiment analysis (Restaurants): Text: "For the price, you cannot eat this well in Manhattan"; Aspect: restaurant prices; Label: positive. Natural language inference: Premise: "Fun for only children"; Hypothesis: "Fun for adults and children"; Label: contradiction.
  • 8. Multi-Task Learning: Shared hidden layers; separate inputs for each task; separate output layers + classification functions.
  • 9. Multi-Task Learning: Shared hidden layers; separate inputs for each task; separate output layers + classification functions; negative log-likelihood objectives.
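
The architecture on slides 8 and 9 can be illustrated with a minimal sketch, assuming a single shared tanh layer, a plain feature vector as input (the actual model encodes sequence pairs), and randomly initialised weights. The class name `MultiTaskModel`, the dimensions, and the task names are hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MultiTaskModel:
    """Hard parameter sharing: one shared hidden layer, one output layer per task."""

    def __init__(self, input_dim, hidden_dim, task_labels, seed=0):
        rng = random.Random(seed)
        self.task_labels = task_labels
        # Shared by all tasks.
        self.shared = init_matrix(hidden_dim, input_dim, rng)
        # Task-specific output layers, sized to each task's own label set.
        self.heads = {
            task: init_matrix(len(labels), hidden_dim, rng)
            for task, labels in task_labels.items()
        }

    def forward(self, task, x):
        h = [math.tanh(v) for v in matvec(self.shared, x)]
        return softmax(matvec(self.heads[task], h))

    def nll(self, task, x, gold_label):
        """Negative log-likelihood of the gold label under the task's head."""
        probs = self.forward(task, x)
        return -math.log(probs[self.task_labels[task].index(gold_label)])


# Label sets taken from the "Datasets and Tasks" slide.
TASK_LABELS = {
    "stance": ["against", "none", "favor"],
    "fnc": ["disagree", "unrelated", "discuss", "agree"],
    "nli": ["contradiction", "neutral", "entailment"],
}

model = MultiTaskModel(input_dim=4, hidden_dim=8, task_labels=TASK_LABELS)
x = [0.5, -1.0, 0.25, 2.0]  # made-up input features
stance_probs = model.forward("stance", x)
fnc_probs = model.forward("fnc", x)
nli_loss = model.nll("nli", x, "entailment")
```

Each task gets a distribution over its own label set (three labels for stance, four for fake news detection), while the hidden representation comes from the same shared weights.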
  • 10. Goal: Exploiting Synergies between Tasks - Modelling tasks in a joint label space - Label Transfer Network that learns to transfer labels between tasks - Use semi-supervised learning, trained end-to-end with multi-task learning model - Extensive evaluation on a set of pairwise sequence classification tasks
  • 11. Related Work - Learning task similarities (clustering of labels, inducing shared prior, learning grouping): only works for tasks with same label spaces - Label transformations (distributional information, correlation analysis): typically prior to model training
  • 12. Related Work - Multi-task learning with neural networks (hard parameter sharing, different sharing structures): does not take similarities between label spaces into account - Semi-supervised learning (self-training, tri-training, co-forest): assumes same label spaces
  • 14. Multi-Task Learning: Shared hidden layers; separate inputs for each task; separate output layers + classification functions; negative log-likelihood objectives.
  • 16. Label Embedding Layer: Label embedding space. Prediction with label compatibility function: c(l, h) = l · h
  • 18. Label Embeddings - Model relationships between labels in the joint embedding space - Might lead to downstream improvements - Crucial: the compatibility between an instance and any label can now be measured - This can be used to learn to transfer labels across datasets
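
A small sketch of prediction with the compatibility function c(l, h) = l · h may help here. It assumes that labels of all tasks live in one shared embedding table and that scores are normalised with a softmax over the labels of the task being predicted. The two-dimensional embedding values and the hidden vector are made up for illustration.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compatibility(l, h):
    """c(l, h) = l . h, the dot product of label embedding and hidden state."""
    return sum(li * hi for li, hi in zip(l, h))


# Joint label embedding space: labels of different tasks share one table.
# The vectors are hypothetical; in a trained model they would be learned.
LABEL_EMBEDDINGS = {
    ("stance", "against"): [-1.0, 0.2],
    ("stance", "none"): [0.0, 0.0],
    ("stance", "favor"): [1.0, 0.1],
    ("nli", "contradiction"): [-0.9, 0.3],
    ("nli", "neutral"): [0.1, -0.1],
    ("nli", "entailment"): [0.8, 0.2],
}


def predict(task, h):
    """Score h against every label of `task` and normalise the scores."""
    keys = [k for k in LABEL_EMBEDDINGS if k[0] == task]
    scores = [compatibility(LABEL_EMBEDDINGS[k], h) for k in keys]
    return {name: p for (_, name), p in zip(keys, softmax(scores))}


h = [2.0, 0.5]  # hidden representation of one instance (made up)
stance_probs = predict("stance", h)
nli_probs = predict("nli", h)
best_stance = max(stance_probs, key=stance_probs.get)
best_nli = max(nli_probs, key=nli_probs.get)
```

Because every label is a vector in the same space, the same hidden representation `h` can be scored against the labels of any task, which is the property the slide calls crucial for transferring labels across datasets.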
  • 19. Model 2: Label Transfer Network
  • 20. Label Transfer Network. Goal: learn to produce pseudo-labels for the target task: $\mathrm{LTN}_T = \mathrm{MLP}([o_1, \ldots, o_{T-1}])$, with $o_i = \sum_{j=1}^{L_i} p_j^{T_i} l_j$ - Output label embedding $o_i$ of task $T_i$: sum of the task's label embeddings $l_j$ weighted with their probability $p_j^{T_i}$ - Trained on labelled target task data - Negative log-likelihood objective $\mathcal{L}_{\mathrm{LTN}}$ to produce a pseudo-label for the target task
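
The two formulas on slide 20 can be sketched as follows. Assumptions, none of which are stated on the slide: the MLP has one tanh hidden layer, its output is a vector in the label embedding space, and that vector is turned into a pseudo-label distribution by scoring it against the target task's label embeddings with a dot product and a softmax. All weights, dimensions, and embedding values are hypothetical.

```python
import math
import random


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def output_label_embedding(probs, label_embs):
    """o_i = sum_j p_j * l_j: probability-weighted sum of a task's label embeddings."""
    dim = len(label_embs[0])
    return [sum(p * l[d] for p, l in zip(probs, label_embs)) for d in range(dim)]


def label_transfer_network(aux_outputs, W1, W2, target_label_embs):
    """MLP over the concatenated auxiliary output embeddings [o_1, ..., o_{T-1}]."""
    x = [v for o in aux_outputs for v in o]  # concatenation
    hidden = [math.tanh(v) for v in matvec(W1, x)]
    out = matvec(W2, hidden)  # assumed to live in the label embedding space
    # Assumption: score against target labels with the dot product.
    scores = [sum(a * b for a, b in zip(l, out)) for l in target_label_embs]
    return softmax(scores)


# Two auxiliary tasks with made-up predictions and 2-d label embeddings.
stance_embs = [[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
nli_embs = [[-0.9, 0.3], [0.1, -0.1], [0.8, 0.2]]
o_stance = output_label_embedding([0.1, 0.2, 0.7], stance_embs)
o_nli = output_label_embedding([0.2, 0.3, 0.5], nli_embs)

rng = random.Random(0)
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(3)]  # 4 -> 3
W2 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]  # 3 -> 2
target_embs = [[-1.0, 0.1], [0.0, 0.0], [1.0, 0.1]]  # e.g. negative/neutral/positive

pseudo_label_probs = label_transfer_network([o_stance, o_nli], W1, W2, target_embs)
gold_index = 2  # index of the gold target label for this labelled instance
ltn_loss = -math.log(pseudo_label_probs[gold_index])  # negative log-likelihood
```

In training, `ltn_loss` would be minimised on labelled target task data so that the network learns to map auxiliary predictions to target labels.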
  • 22. Semi-Supervised MTL. Goal: relabel aux task data as main task data using the LTN - LTN can be used to produce pseudo-labels for aux or unlabelled instances - Train the target task model on additional pseudo-labelled data, added iteratively - Additional loss: minimise the mean squared error between the model predictions $p^{T_i}$ and pseudo-labels $z^{T_i}$ produced by the LTN
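
The additional loss on slide 22 is a mean squared error between two distributions over the target task's labels. The sketch below uses made-up values for the predictions and pseudo-labels, and it adds the two loss terms without weighting, which is an assumption since the slide does not say how the terms are combined.

```python
import math


def mean_squared_error(p, z):
    """Mean squared error between model predictions p and LTN pseudo-labels z."""
    return sum((pi - zi) ** 2 for pi, zi in zip(p, z)) / len(p)


def combined_loss(p_labelled, gold_index, p_pseudo, z_pseudo):
    """Supervised NLL on a labelled instance plus MSE on a pseudo-labelled one.

    Assumption: the two terms are summed with equal weight.
    """
    nll = -math.log(p_labelled[gold_index])
    return nll + mean_squared_error(p_pseudo, z_pseudo)


# Made-up distributions over three target labels.
p = [0.2, 0.3, 0.5]  # target task model prediction on an aux/unlabelled instance
z = [0.1, 0.1, 0.8]  # pseudo-label distribution produced by the LTN
mse = mean_squared_error(p, z)

total = combined_loss(p_labelled=[0.1, 0.2, 0.7], gold_index=2, p_pseudo=p, z_pseudo=z)
```

The MSE term is zero when the model already agrees with the pseudo-label, so it only pulls the target task model towards the LTN's output where the two differ.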
  • 23. Best-Performing Aux Tasks (main task: aux tasks). Topic-2: FNC-1, MultiNLI, Target; Topic-5: FNC-1, MultiNLI, ABSA-L, Target; Target: FNC-1, MultiNLI, Topic-5; Stance: FNC-1, MultiNLI, Target; ABSA-L: Topic-5; ABSA-R: Topic-5, ABSA-L, Target; FNC-1: Stance, MultiNLI, Topic-5, ABSA-R, Target; MultiNLI: Topic-5. Trends: • Target used by all Twitter main tasks • Tasks with a higher number of labels (e.g. Topic-5) are used more often • Tasks with more training data (FNC-1, MultiNLI) are used more often
  • 30. Overall Results - MTL models outperform STL models - Label embeddings improve performance - New state of the art (SoA) on topic-based sentiment analysis - Learning to relabel data jointly with MTL tends not to improve performance further - Future work: use the relabelling model to label unlabelled data instead
  • 32. Datasets and Tasks. Topic-based sentiment analysis: Tweet: "No power at home, sat in the dark listening to AC/DC in the hope it’ll make the electricity come back again"; Topic: AC/DC; Label: positive. Target-dependent sentiment analysis: Text: "how do you like settlers of catan for the wii?"; Target: wii; Label: neutral. Aspect-based sentiment analysis: Text: "For the price, you cannot eat this well in Manhattan"; Aspects: restaurant prices, food quality; Label: positive. Stance detection: Tweet: "Be prepared - if we continue the policies of the liberal left, we will be #Greece"; Target: Donald Trump; Label: favor. Fake news detection: Document: "Dino Ferrari hooked the whopper wels catfish, (...), which could be the biggest in the world."; Headline: "Fisherman lands 19 STONE catfish which could be the biggest in the world to be hooked"; Label: agree. Natural language inference: Premise: "Fun for only children"; Hypothesis: "Fun for adults and children"; Label: contradiction.
  • 33. Label Transfer Network (with or without semi-supervision)
  • 34. Label Embedding Layer: Label embedding space. Prediction with label compatibility function c(·, ·) that measures similarity between label embedding l and hidden representation h: c(l, h) = l · h
  • 35. Learning with Limited Labelled Data: Why? General Challenges - Manually annotating training data is expensive - Only few large NLP datasets - New tasks and domains - Domain drift. Multilingual and Diversity Aspects - Underrepresented languages - Dialects
  • 36. Learning with Limited Labelled Data: How? - Domain Adaptation - Weakly Supervised Learning - Distant Supervision - Transfer Learning - Multi-Task Learning - Unsupervised Learning