AN EXPLORATION OF NON-LABEL-PRESERVING DATA AUGMENTATIONS
Jonathan Zarecki
To appear in IJCAI20 as “Textual Membership Queries”
About me
◦ Jonathan Zarecki
◦ MSc in ML & Active Learning with Prof. Shaul Markovitch (Technion)
◦ Currently pursuing a PhD in CS with Prof. Gal Chechik (BIU & Nvidia).
Overview
◦ Potential problems of traditional data augmentations in text
◦ Quick overview of active learning
◦ Definition of new textual modification operators
◦ Applying heuristic search with modification operators for active learning
◦ Empirical evaluation of this method on several datasets
Data Augmentations (quickly)
◦ Traditional augmentations ensure that the label remains constant.
◦ This is a strong limitation on the operations we can perform!
Data Augmentations (quickly)
Now with text
Original: Batman is really awesome
◦ Random deletion: is really awesome
◦ Random insertion: Batman is really not awesome
◦ Random switch: Awesome is really Batman
◦ Synonym replacement: Batman is really great
EDA – Wei & Zou (EMNLP 19)
In textual augmentations it’s not always trivial to keep the sentence valid or readable.
Non-Restrictive Textual Augmentations
◦ What will happen if we let loose and apply any augmentation we want?
◦ Start: My favorite movie so far
◦ Add “computer”: My computer favorite movie so far
◦ Remove “far”: My computer favorite movie so
Non-Restrictive Textual Augmentations
◦ LSTMs? Sometimes they’re pretty good, though.
[Figures: memes whose captions “So you’re telling me” and “One does not simply” were continued with an LSTM. Source: This meme does not exist - Imgflip]
Non-Restrictive Textual Augmentations
◦ But let’s leave unreadable sentences aside.
◦ Another important property of using more expressive augmentations is that the label might change!
◦ Batman is really awesome → Batman is really bad
Non Label-Preserving (LP) Augmentations
We want augmentations which will:
1. Change the sentence’s meaning significantly
2. Keep the sentence fully readable
(Somewhat) unlike image augmentations
More expressive textual augmentations risk making the resulting sentence gibberish or completely changing its label.
Since we do not know the new example’s label, we arrive at the field of active learning.
Active-Learning – Quick Overview
◦ Train a model on the labeled examples.
◦ Run inference on the unlabeled pool.
◦ If the model is uncertain about an example, send it to a human data labeler (“Hi! I’m a data labeler”).
◦ Add the newly labeled example to the labeled set and train again.
◦ The key to active learning is how to measure the uncertainty.
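One selection round of the pool-based loop can be sketched as follows. This is a minimal illustration, not code from the talk: `toy_predict_proba` is a hypothetical keyword-based stand-in for a trained binary classifier, and the threshold value is an arbitrary assumption.

```python
def toy_predict_proba(sentence):
    """Hypothetical stand-in for a trained model: returns P(positive)."""
    words = sentence.lower().split()
    return 0.5 + 0.4 * ("awesome" in words) - 0.4 * ("awful" in words)


def uncertainty(p_positive):
    """Binary uncertainty: 1.0 at p = 0.5, falling to 0.0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p_positive - 1.0)


def select_for_labeling(pool, predict_proba, threshold=0.5):
    """Return the pool items the model is uncertain about (to be sent to the labeler)."""
    return [s for s in pool if uncertainty(predict_proba(s)) >= threshold]


pool = [
    "Batman is really awesome",
    "Batman is really something",
    "Batman is awful",
]
to_label = select_for_labeling(pool, toy_predict_proba)
print(to_label)
```

Only the sentence without a strong cue word is selected here, because the toy model is confident about the other two.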
What We’ll be Doing
◦ The same loop, but the unlabeled pool is replaced by a generated pool.
◦ Augment the labeled examples to build the generated pool, run inference on it and, if uncertain, send the example for labeling.
◦ Without any unlabeled data!
Why are textual modifications hard?
◦ When sentences are not built carefully they can easily become unreadable:
◦ Sentences have to comply with syntactic rules.
◦ But also with semantic rules.
◦ “Took I the dog to”: does not comply with syntactic rules
◦ “I ate a book for breakfast”: does not “make sense”
Modification Operators Definition
◦ First we find all “replaceable words” in the sentence
◦ Nouns, verbs & adjectives
◦ For each replaceable word we look at the knowledge-base and find words to replace it.
◦ All options returned are the modification operators for a given sentence.
Example: I hate all the cats
◦ hate → despise, adore, dislike, detest
◦ cats → dogs, wolves, lions, pigs
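The definition above can be sketched in a few lines. This is an illustration under stated assumptions: `KNOWLEDGE_BASE` is a hand-written toy dictionary holding the slide's example words, and a word counts as "replaceable" simply when it appears in that dictionary (the talk selects nouns, verbs and adjectives, which would need a part-of-speech tagger). The function names are hypothetical.

```python
# Toy knowledge base (assumption): maps a replaceable word to its candidate replacements.
KNOWLEDGE_BASE = {
    "hate": ["despise", "adore", "dislike", "detest"],
    "cats": ["dogs", "wolves", "lions", "pigs"],
}


def modification_operators(sentence, knowledge_base=KNOWLEDGE_BASE):
    """Return every (token position, replacement word) pair available for the sentence."""
    tokens = sentence.split()
    return [
        (position, replacement)
        for position, token in enumerate(tokens)
        for replacement in knowledge_base.get(token.lower(), [])
    ]


def apply_operator(sentence, operator):
    """Apply one operator: swap the token at the given position for the replacement."""
    position, replacement = operator
    tokens = sentence.split()
    tokens[position] = replacement
    return " ".join(tokens)


operators = modification_operators("I hate all the cats")
print(len(operators))
print(apply_operator("I hate all the cats", (1, "adore")))
```

Representing an operator as a (position, replacement) pair keeps it independent of the sentence text, so the same pair can be inspected or logged before it is applied.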
So how does it look?
◦ I hate all the cats (original)
◦ I hate all the dogs
◦ I despise all the cats
◦ I adore all the cats
◦ I adore all the dogs
[Figure: the sentences above arranged between the labels “Hate Speech” and “Non-Hate Speech”]
Semantic Knowledge-bases
◦ For our modification operators to work, we need to find meaningful replacements.
◦ Replacements should be functionally similar: they should behave the same as the replaced word.
◦ We need a knowledge-base where we can find such words.
◦ Options for this include word2vec, WordNet and more.
◦ We chose “Dependency Word2vec” (Levy & Goldberg, 2014) as our knowledge-base
Dependency w2v – ACL 2014
Qualitative analysis of the knowledge-bases
Dependency Word2vec (Levy & Goldberg 2014)
Introduces a subtle change in the word2vec context, shown on the sentence “Australian scientist discovers star with telescope”:
◦ Word2vec: the contexts of a word are the neighboring words in a window around it
◦ Dependency word2vec: the contexts of a word are its syntactic relations (here nsubj, dobj, prep_with)
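The difference between the two context definitions can be made concrete on the slide's sentence. In this sketch the dependency arcs are written by hand (an assumption; a real pipeline would obtain them from a parser), and the `word/relation` and `-1` inverse-relation notation follows the convention of the dependency-embeddings paper. Function names are illustrative.

```python
TOKENS = ["Australian", "scientist", "discovers", "star", "with", "telescope"]

# Hand-written (head, relation, dependent) arcs, with the preposition
# collapsed into the relation name prep_with.
ARCS = [
    ("scientist", "amod", "Australian"),
    ("discovers", "nsubj", "scientist"),
    ("discovers", "dobj", "star"),
    ("discovers", "prep_with", "telescope"),
]


def window_contexts(tokens, target, k=2):
    """Linear word2vec contexts: the k words on each side of the target."""
    i = tokens.index(target)
    lo, hi = max(0, i - k), min(len(tokens), i + k + 1)
    return [tokens[j] for j in range(lo, hi) if j != i]


def dependency_contexts(arcs, target):
    """Dependency contexts: words linked to the target, tagged with the relation."""
    contexts = []
    for head, relation, dependent in arcs:
        if head == target:
            contexts.append(f"{dependent}/{relation}")
        elif dependent == target:
            contexts.append(f"{head}/{relation}-1")  # inverse relation
    return contexts


print(window_contexts(TOKENS, "discovers"))
print(dependency_contexts(ARCS, "discovers"))
```

The window picks up "Australian" and "with" but misses "telescope", while the dependency contexts capture the instrument and skip the adjective that modifies a different word.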
Qualitative analysis of the knowledge-bases
Dependency Word2vec (Levy & Goldberg 2014)
Functional similarity is exhibited very well in dep w2v. Nearest neighbors of “hogwarts”:
◦ w2v: dumbledore, hallows, half-blood, malfoy, snape (related to Harry Potter)
◦ Dep w2v: sunnydale, collinwood, calarts, greendale, millfield (schools)
Full example of modification operators
Example: Batman is really awesome
◦ Batman → superman, superboy, supergirl, catwoman, aquaman
◦ awesome → terrific, marvelous, wonderful, lousy, awful
Further analysis of 4 different knowledge-bases can be seen in the full paper.
Stochastic Synthesis Algorithm
◦ A simple way to use the operators is just applying them randomly.
◦ Until enough instances have been generated do:
1. Randomly choose an instance from the available examples
2. Apply a random operator to it
3. Return the result as a new membership query (MQ)
[Figure: each example (𝜙_1, 𝜙_2) has candidate modifications 𝜙_i^1, 𝜙_i^2, 𝜙_i^3; one randomly chosen candidate per example is added to the new examples]
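The three steps above can be sketched as a short loop. This is an illustration under assumptions: the knowledge base is a toy dictionary with the slide's example words, an "operator" is a single word replacement, and the names `neighbors` and `stochastic_synthesis` are hypothetical.

```python
import random

# Toy knowledge base (assumption); the talk uses dependency word2vec neighbors.
KNOWLEDGE_BASE = {
    "hate": ["despise", "adore", "dislike", "detest"],
    "cats": ["dogs", "wolves", "lions", "pigs"],
}


def neighbors(sentence, knowledge_base=KNOWLEDGE_BASE):
    """All sentences reachable by applying exactly one replacement operator."""
    tokens = sentence.split()
    results = []
    for i, token in enumerate(tokens):
        for replacement in knowledge_base.get(token.lower(), []):
            results.append(" ".join(tokens[:i] + [replacement] + tokens[i + 1:]))
    return results


def stochastic_synthesis(examples, n_queries, rng):
    """Generate n_queries new sentences by applying random operators to random examples."""
    usable = [e for e in examples if neighbors(e)]
    if not usable:
        raise ValueError("no example has an applicable operator")
    queries = []
    while len(queries) < n_queries:
        source = rng.choice(usable)                    # step 1: random instance
        queries.append(rng.choice(neighbors(source)))  # steps 2-3: random operator, emit
    return queries


queries = stochastic_synthesis(["I hate all the cats"], 5, random.Random(0))
print(queries)
```

Passing the random generator in explicitly makes a run reproducible, which helps when comparing against the search-based variant.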
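Using search algorithms to generate examples
◦ Repeatedly applying these operators gives us many options
◦ Using search algorithms we can actively look for the most informative examples
◦ But how do we direct the search?
[Figure: a search tree. 𝜙_1 expands to 𝜙_1^1, 𝜙_1^2, 𝜙_1^3; the chosen successor becomes 𝜙_2, which expands to 𝜙_2^1, 𝜙_2^2, 𝜙_2^3; the chosen successor becomes 𝜙_3, which expands to 𝜙_3^1, 𝜙_3^2, 𝜙_3^3]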
Active learning - throwback
Search Heuristic Function
◦ To direct the search we need a function that gives a higher score to more informative instances.
◦ We used existing active-learning selection functions for this:
◦ Uncertainty sampling (Lewis & Gale, 1994)
◦ Expected model change (Lindenbaum, Markovitch, & Rusakov, 2004)
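For reference, uncertainty sampling is commonly formalized with one of the two standard scores below. Which variant the talk uses is not stated on the slide, so treat the choice as an assumption. Here $P_\theta(y \mid x)$ denotes the current model's predicted probability of label $y$ for sentence $x$; the symbol names are introduced only for this illustration.

```latex
% Least-confidence score and entropy score for a candidate sentence x
h_{\mathrm{LC}}(x) = 1 - \max_{y} P_\theta(y \mid x)
\qquad
h_{\mathrm{H}}(x) = -\sum_{y} P_\theta(y \mid x)\,\log_2 P_\theta(y \mid x)

% Worked binary example:
%   P = (0.5, 0.5):  h_LC = 0.5,  h_H = 1 bit          (maximally uncertain)
%   P = (0.9, 0.1):  h_LC = 0.1,  h_H \approx 0.469 bits
```

Both scores peak when the model splits its probability evenly across labels, so a search that maximizes either one is pushed toward sentences near the decision boundary.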
Heuristic-Search Generation
◦ Similar to the stochastic approach, but apply a search algorithm to pick the best example
◦ Until enough instances have been generated do:
1. Randomly choose an instance from the available examples
2. Run a heuristic search on that instance
3. Return the result as a new example
[Figure: starting from an example 𝜙_1, the search expands its candidate modifications, moves to 𝜙_2 and then 𝜙_3, and the most informative sentence found is added to the new examples. Uncertainty sampling directs the search.]
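A hill-climbing version of the search step might look like the sketch below. It is an illustration under assumptions rather than the authors' implementation: the knowledge base and the word-weight "model" are toy stand-ins built from the Batman example, and uncertainty is scored with a simple binary measure.

```python
# Toy knowledge base and toy sentiment model (both assumptions for illustration).
KNOWLEDGE_BASE = {
    "batman": ["superman", "superboy", "supergirl", "catwoman", "aquaman"],
    "awesome": ["terrific", "marvelous", "wonderful", "lousy", "awful"],
}
WEIGHTS = {"awesome": 0.45, "terrific": 0.4, "marvelous": 0.4,
           "wonderful": 0.35, "lousy": -0.2, "awful": -0.4}


def toy_predict_proba(sentence):
    """Hypothetical model: P(positive) from per-word weights."""
    return 0.5 + sum(WEIGHTS.get(w, 0.0) for w in sentence.lower().split())


def uncertainty(p_positive):
    """Binary uncertainty, highest at p = 0.5."""
    return 1.0 - abs(2.0 * p_positive - 1.0)


def neighbors(sentence):
    """All sentences reachable with one replacement operator."""
    tokens = sentence.split()
    return [" ".join(tokens[:i] + [new] + tokens[i + 1:])
            for i, tok in enumerate(tokens)
            for new in KNOWLEDGE_BASE.get(tok.lower(), [])]


def hill_climb(sentence, predict_proba, max_steps=5):
    """Greedily move to the most uncertain neighbor until no neighbor improves."""
    current, current_score = sentence, uncertainty(predict_proba(sentence))
    for _ in range(max_steps):
        candidates = neighbors(current)
        if not candidates:
            break
        best = max(candidates, key=lambda s: uncertainty(predict_proba(s)))
        best_score = uncertainty(predict_proba(best))
        if best_score <= current_score:
            break  # local optimum reached
        current, current_score = best, best_score
    return current


result = hill_climb("Batman is really awesome", toy_predict_proba)
print(result)
```

With this toy model the search leaves the confidently positive original and stops at the replacement the model is least sure about. A beam-search variant would keep several candidates per step instead of one.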
Sentence Quality – Human Evaluation
◦ We already talked about how hard it can be to generate a readable sentence. Do these operators comply with that?
◦ We randomly chose 1000 sentences from each category, and asked “Is this sentence fully readable to you?”
[Pie charts of Yes / No answers per category:]
◦ Original Sentences (96%)
◦ HS Sentences (95%)
◦ Wikipedia LSTM Sentences (21%)
Experiment 1 – Batch Active-Learning
◦ Starting with 10 labeled examples, train a model.
◦ Standard setting: run inference on an unlabeled pool and, if uncertain, send the example for labeling.
◦ Our setting: augment the labeled examples into a generated pool, run inference on it and, if uncertain, send the example for labeling.
◦ Without any unlabeled data!
Datasets
◦ Sentiment Analysis:
◦ CMR: Cornell sentiment polarity dataset
◦ SST: Stanford sentiment treebank, a sentence sentiment analysis dataset
◦ KS: A Kaggle short sentence sentiment analysis dataset
◦ Subjectivity/Objectivity Detection
◦ SUBJ: Cornell sentence subjective / objective dataset
◦ Offensive-language and Hate-speech Detection:
◦ HS: Hate speech and offensive language classification dataset
Compared Methods
◦ Our methods:
◦ Uncertainty sampling Hill-climbing MQ synthesis (US-HC-MQ)
◦ Uncertainty sampling Beam-search MQ synthesis (US-BS-MQ)
◦ Stochastic Synthesis (S-MQ)
◦ Competitor Methods:
◦ WordNet-based Synonym-replacement (WNA) (Lecun et al. 2016)
◦ Original examples (IDEAL)
◦ LSTM Generator (RNN) – pretrained on English Wikipedia
Uses unlabeled
data
Results – Experiment 1
• We can see that our methods have consistently improved the initial accuracy
• The search-based methods are superior on almost all datasets
Experiment 2 - Measuring Label Switch
◦ Starting with 10 labeled examples, train a model and augment the examples into a generated pool.
◦ Run inference on the generated pool and choose the 50 most informative examples.
◦ How many switched their original label?
◦ We compare 3 synthesis algorithms:
1. Uncertainty hill-climbing (search-based generation)
2. Stochastic hill-climbing (multiple random operators)
3. Stochastic Synthesis
[Figure: % of switched instances]
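The measured quantity can be computed as below. This is a minimal sketch: the function name and the example labels are made up for illustration, and it assumes each generated sentence is paired with the label of the sentence it was derived from.

```python
def label_switch_rate(original_labels, new_labels):
    """Percentage of generated examples whose label differs from their source's label."""
    if len(original_labels) != len(new_labels):
        raise ValueError("label lists must have the same length")
    if not original_labels:
        raise ValueError("need at least one example")
    switched = sum(o != n for o, n in zip(original_labels, new_labels))
    return 100.0 * switched / len(original_labels)


# Made-up labels for four generated sentences and their sources.
rate = label_switch_rate(["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "pos"])
print(rate)
```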
What did we see?
◦ Potential problems of non-restrictive augmentations in text
◦ Definition of new modification operators (= non-label-preserving augmentations) in the textual domain
◦ Using heuristic search with modification operators to generate new examples for active learning
◦ Empirical evaluation of this method on several datasets
Thanks-for-coming sentence: I want to thank you for coming!
◦ I want to thank you for arriving!
◦ I would-like to thank you for coming!
◦ I want to condemn you for coming!
◦ I want to thank you for going!
Thank You!
Questions?

AN EXPLORATION OF NON-LABEL-PRESERVING DATA AUGMENTATIONS FOR ACTIVE LEARNING

  • 1. AN EXPLORATION OF NON LABEL-PRESERVING DATA AUGMENTATIONS Jonathan Zarecki To appear in IJCAI20 as “Textual Membership Queries”
  • 2. About me ◦ Jonathan Zarecki ◦ MSc in ML & Active Learning with Prof. Shaul Markovitch (Technion) ◦ Currently pursuing a Phd in CS with Prof. Gal Chechik (BIU & Nvidia).
  • 3. Overview ◦ Potential problems of traditional data augmentations in text ◦ Quick overview of active-learning ◦ Definition of new textual modification operators ◦ Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 4. Data Augmentations (quickly) Traditional augmentations ensure that the label remains constant This is a strong limitations of the operations we can perform !
  • 5. Data Augmentations (quickly) Now with text Batman is really awesome is really awesome Batman is really not awesome Awesome is really Batman Batman is really great Random deletion Random Insertion Random Switch Synonym Replacemen t EDA – Wei & Zou (EMNLP 19)
  • 6. Data Augmentations (quickly) Now with text In textual augmentations it’s not always trivial to keep the sentence valid or readable. Batman is really awesome is really awesome Batman is really not awesome Awesome is really Batman Batman is really great Random deletion Random Insertion Random Switch Synonym Replacemen t EDA – Wei & Zou (EMNLP 19)
  • 7. Non-Restrictive Textual Augmentations ◦ What will happen if we let loose ? Apply any augmentation we want ? My favorite movie so far My computer favorite movie so far Add computer My computer favorite movie so Remove far
  • 8. Non-Restrictive Textual Augmentations ◦ LSTMs ? So you’re telling me Continue with LSTM
  • 9. Non-Restrictive Textual Augmentations ◦ LSTMs ? Sometimes they’re pretty good tho One does not simply Continue with LSTM This meme does not exist - Imgflip
  • 10. ◦ But Let’s leave unreadable sentences aside. ◦ Another important property of using more expressive augmentations is that the label might change ! Batman is really awesome Batman is really bad Non-Restrictive Textual Augmentations
  • 11. Non Label-Preserving (LP) Augmentations We want augmentations which will: 1. Change the sentence’s meaning significantly 2. Keep the sentence fully readable (Somewhat) Unlike image augmentations Using more expressive textual augmentation have the risk to make the resulting sentence gibberish or completely change it’s label Not knowing an example’s label we arrive at the field of active-learning
  • 12. Overview ◦Potential problems of traditional data augmentations in text ◦ Quick overview of active-learning ◦ Definition of new textual modification operators ◦ Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 13. Overview ◦ Potential problems of traditional data augmentations in text ◦Quick overview of active-learning ◦ Definition of new textual modification operators ◦ Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 17. Active-Learning – Quick Overview (diagram: Unlabeled pool → Inference → if uncertain, Labeling → Labeled → Training; labeler caption: “Hi! I’m a data labeler”)
  • 19. Active-Learning – Quick Overview. The key to active learning is how to measure the uncertainty.
  • 20. What We’ll be Doing (diagram: Unlabeled pool → Inference → if uncertain, Labeling → Labeled → Training)
  • 21. What We’ll be Doing (diagram: Labeled → Augment → Generated pool → Inference → if uncertain, Labeling → Labeled → Training). Without any unlabeled data!
  • 22. Overview ◦ Potential problems of traditional data augmentations in text ◦Quick overview of active-learning ◦ Definition of new textual modification operators ◦ Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 23. Overview ◦ Potential problems of traditional data augmentations in text ◦ Quick overview of active-learning ◦Definition of new textual modification operators ◦ Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 24. Why are textual modifications hard? ◦ When sentences are not built carefully they can easily become unreadable: ◦ Sentences have to comply with syntactic rules. ◦ But also with semantic rules. “Took I the dog to”: does not comply with syntactic rules. “I ate a book for breakfast”: does not “make sense”.
  • 25. Modification Operators Definition ◦ First we find all “replaceable words” in the sentence ◦ Nouns, verbs & adjectives ◦ For each replaceable word we look at the knowledge-base and find words to replace it. ◦ All options returned are the modification operators for a given sentence. Example: “I hate all the cats”. hate → despise, adore, dislike, detest. cats → dogs, wolves, lions, pigs.
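The operator definition on slide 25 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TOY_KB` is a hypothetical hand-written dictionary standing in for the knowledge base, and a word counts as "replaceable" simply when the dictionary has an entry for it (the talk selects nouns, verbs and adjectives instead). The function names are invented for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: word -> candidate replacements.
# In the talk this role is played by a semantic knowledge base.
TOY_KB: Dict[str, List[str]] = {
    "hate": ["despise", "adore", "dislike", "detest"],
    "cats": ["dogs", "wolves", "lions", "pigs"],
}

# A modification operator: (token position, replacement word).
Operator = Tuple[int, str]


def modification_operators(tokens: List[str], kb: Dict[str, List[str]]) -> List[Operator]:
    """Return every (position, replacement) pair the knowledge base allows."""
    ops: List[Operator] = []
    for i, tok in enumerate(tokens):
        # Assumption: a word is replaceable iff the KB has an entry for it.
        for replacement in kb.get(tok.lower(), []):
            ops.append((i, replacement))
    return ops


def apply_operator(tokens: List[str], op: Operator) -> List[str]:
    """Apply one operator and return a new token list (input is not mutated)."""
    position, replacement = op
    return tokens[:position] + [replacement] + tokens[position + 1:]


sentence = "I hate all the cats".split()
ops = modification_operators(sentence, TOY_KB)
variants = [" ".join(apply_operator(sentence, op)) for op in ops]
```

Each operator changes exactly one word, so every variant keeps the length and word order of the original sentence.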
  • 26. So how does it look? Starting from “I hate all the cats”, the operators yield: “I hate all the dogs”, “I despise all the cats”, “I adore all the cats”, “I adore all the dogs”. (Diagram labels: Hate Speech, Non-Hate Speech.)
  • 27. Semantic Knowledge-bases ◦ In order for our modification operators to work we need to find meaningful replacements. ◦ Replacements should be functionally similar, i.e. behave the same as the replaced word. ◦ We need a knowledge-base where we can find such words. ◦ Options for this include: word2vec, WordNet and more. ◦ We chose “Dependency Word2vec” (Levy & Goldberg, 2014) as our knowledge base. Dependency w2v – ACL 2014
  • 28. Qualitative analysis of the knowledge-bases. Dependency Word2vec (Levy & Goldberg 2014) introduces a subtle change in the word2vec context. Example sentence: “Australian scientist discovers star with telescope”. Word2vec: context is taken from the surrounding words. Dependency word2vec: context is taken from the dependency arcs (nsubj, dobj, prep_with).
  • 29. Qualitative analysis of the knowledge-bases. Dependency Word2vec (Levy & Goldberg 2014). Functional similarity is exhibited very well in dep w2v. Query word: hogwarts. w2v (related to Harry Potter): dumbledore, hallows, half-blood, malfoy, snape. Dep w2v (schools): sunnydale, collinwood, calarts, greendale, millfield.
  • 30. Full example of modification operators: “Batman is really awesome”. Batman → superman, superboy, supergirl, catwoman, aquaman. awesome → terrific, marvelous, wonderful, lousy, awful. Further analysis of 4 different knowledge-bases can be seen in the full paper.
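To make the difference between the two context definitions concrete, here is a small sketch for the example sentence. It rests on assumptions: the window size `k=2` is chosen only for illustration, and the arcs are hard-coded from the three labels visible on the slide rather than produced by a parser. The `word/label` and `word/label-1` context naming follows the convention of the dependency word2vec paper.

```python
from typing import List, Tuple

tokens = ["australian", "scientist", "discovers", "star", "with", "telescope"]


def window_contexts(tokens: List[str], target: str, k: int = 2) -> List[str]:
    """Linear-window contexts: the k words on each side of the target."""
    i = tokens.index(target)
    return tokens[max(0, i - k):i] + tokens[i + 1:i + 1 + k]


# Dependency arcs as (head, label, modifier), hard-coded from the slide.
ARCS: List[Tuple[str, str, str]] = [
    ("discovers", "nsubj", "scientist"),
    ("discovers", "dobj", "star"),
    ("discovers", "prep_with", "telescope"),
]


def dependency_contexts(arcs: List[Tuple[str, str, str]], target: str) -> List[str]:
    """Dependency contexts: modifiers of the target, plus its head (inverse arc)."""
    contexts: List[str] = []
    for head, label, modifier in arcs:
        if head == target:
            contexts.append(f"{modifier}/{label}")
        if modifier == target:
            contexts.append(f"{head}/{label}-1")
    return contexts


linear = window_contexts(tokens, "discovers")
dependency = dependency_contexts(ARCS, "discovers")
```

With the window, “discovers” picks up “australian” and “with” but misses “telescope”; with the arcs it sees “telescope” through `prep_with` and ignores “australian”.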
  • 31. Overview ◦ Potential problems of traditional data augmentations in text ◦ Quick overview of active-learning ◦Definition of new textual modification operators ◦ Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 32. Overview ◦ Potential problems of traditional data augmentations in text ◦ Quick overview of active-learning ◦ Definition of new textual modification operators ◦Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 33. Stochastic Synthesis Algorithm ◦ A simple way to use the operators is just applying them randomly. ◦ Until enough instances have been generated do: 1. Randomly choose an instance from the available examples 2. Apply a random operator to it 3. Return it as a new membership query (MQ). (Diagram: examples φ_1 and φ_2 are expanded to φ_1^1, φ_1^2, φ_1^3 and φ_2^1, φ_2^2, φ_2^3; φ_1^3 and φ_2^3 are returned as new examples.)
  • 34. Using search algorithms to generate examples ◦ Repeatedly applying these operators gives us many options ◦ Using search algorithms we can actively look for the most informative examples ◦ But how do we direct the search? (Diagram: search tree φ_1 → φ_1^1, φ_1^2, φ_1^3 = φ_2; φ_2 → φ_2^1 = φ_3, φ_2^2, φ_2^3; φ_3 → φ_3^1, φ_3^2, φ_3^3.)
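The three steps of the stochastic synthesis loop can be sketched as follows. This is a minimal illustration, not the authors' implementation: `TOY_KB` is a hypothetical hand-written knowledge base, and `random_modification` and `stochastic_synthesis` are names invented for the sketch.

```python
import random
from typing import Dict, List, Optional

# Hypothetical toy knowledge base standing in for the real one.
TOY_KB: Dict[str, List[str]] = {
    "batman": ["superman", "superboy", "supergirl", "catwoman", "aquaman"],
    "awesome": ["terrific", "marvelous", "wonderful", "lousy", "awful"],
}


def random_modification(tokens: List[str], kb: Dict[str, List[str]],
                        rng: random.Random) -> Optional[List[str]]:
    """Apply one randomly chosen operator, or return None if none applies."""
    positions = [i for i, tok in enumerate(tokens) if tok.lower() in kb]
    if not positions:
        return None
    i = rng.choice(positions)
    replacement = rng.choice(kb[tokens[i].lower()])
    return tokens[:i] + [replacement] + tokens[i + 1:]


def stochastic_synthesis(examples: List[str], kb: Dict[str, List[str]],
                         n_queries: int, seed: int = 0) -> List[str]:
    """Generate n_queries sentences by random single-word replacements."""
    rng = random.Random(seed)
    # Guard against looping forever when no example can be modified.
    usable = [e for e in examples if any(t.lower() in kb for t in e.split())]
    if not usable:
        raise ValueError("no example contains a replaceable word")
    queries: List[str] = []
    while len(queries) < n_queries:
        base = rng.choice(usable).split()         # step 1: pick an example
        modified = random_modification(base, kb, rng)  # step 2: random operator
        if modified is not None:
            queries.append(" ".join(modified))    # step 3: return as a new MQ
    return queries


queries = stochastic_synthesis(["Batman is really awesome"], TOY_KB, n_queries=5)
```

The generated sentences still need labels from the annotator, since a replacement such as awesome → lousy may flip the label.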
  • 35. Active learning - throwback
  • 36. Active learning - throwback
  • 37. Search Heuristic Function ◦ To direct the search we need a function that gives a higher score to more informative instances. ◦ We used existing active-learning functions for this purpose: ◦ Uncertainty sampling (Lewis & Gale, 1994) ◦ Expected model change (Lindenbaum, Markovitch, & Rusakov, 2004)
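As an illustration of what an uncertainty-based score can look like for a binary classifier, the sketch below uses binary entropy of the predicted probability. This is an assumption made for the example: the slides name uncertainty sampling but do not say which uncertainty measure was used, and `uncertainty_score` is a hypothetical name.

```python
import math


def uncertainty_score(p_positive: float) -> float:
    """Binary entropy in bits: 0 for a certain prediction, 1 at p = 0.5."""
    if p_positive <= 0.0 or p_positive >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    p_negative = 1.0 - p_positive
    return -(p_positive * math.log2(p_positive)
             + p_negative * math.log2(p_negative))


# A sentence the model is unsure about scores higher than a confident one.
unsure = uncertainty_score(0.5)
confident = uncertainty_score(0.9)
```

Any function that peaks where the model is least decided would play the same role as a search heuristic.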
  • 38. Heuristic-Search Generation ◦ Similar to the stochastic approach, but apply a search algorithm to pick the best example ◦ Until enough instances have been generated do: 1. Randomly choose an instance from the available examples 2. Run a heuristic search on that instance 3. Return the result as a new example. (Diagram: search tree φ_1 → φ_1^1, φ_1^2, φ_1^3 = φ_2; φ_2 → φ_2^1 = φ_3, φ_2^2, φ_2^3; φ_3 → φ_3^1, φ_3^2, φ_3^3; φ_2^3 is returned as a new example.) Uncertainty sampling directs the search.
  • 39. Heuristic-Search Generation ◦ Similar to the stochastic approach, but apply a search algorithm to pick the best example ◦ Until enough instances have been generated do: 1. Randomly choose an instance from the available examples 2. Run a heuristic search on that instance 3. Return the result as a new example. (Diagram: a second search tree φ′_1 → φ′_1^1, φ′_1^2, φ′_1^3 = φ′_2; φ′_2 → φ′_2^1 = φ′_3, φ′_2^2, φ′_2^3; φ′_3 → φ′_3^1, φ′_3^2, φ′_3^3; the new examples are now φ′_3^3 and φ_2^3.) Uncertainty sampling directs the search.
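A hill-climbing version of the heuristic search can be sketched as below. Everything concrete here is an assumption made for illustration: `TOY_KB` is a hand-written knowledge base, `toy_prob` is a hypothetical stand-in for the current classifier, and the uncertainty measure is simply the distance of the predicted probability from 0.5.

```python
from typing import Callable, Dict, List

# Hypothetical toy knowledge base.
TOY_KB: Dict[str, List[str]] = {
    "hate": ["despise", "adore", "dislike"],
    "cats": ["dogs", "wolves"],
}

# Hypothetical word weights standing in for a trained hate-speech model.
TOY_WEIGHTS: Dict[str, float] = {
    "hate": 0.45, "despise": 0.4, "dislike": 0.2, "adore": -0.4, "wolves": -0.05,
}


def toy_prob(tokens: List[str]) -> float:
    """Stand-in for the model's P(hate speech | sentence), clipped to [0, 1]."""
    score = 0.5 + sum(TOY_WEIGHTS.get(t.lower(), 0.0) for t in tokens)
    return min(1.0, max(0.0, score))


def uncertainty(tokens: List[str]) -> float:
    """1 when the model predicts 0.5, 0 when it predicts 0 or 1."""
    return 1.0 - 2.0 * abs(toy_prob(tokens) - 0.5)


def neighbours(tokens: List[str], kb: Dict[str, List[str]]) -> List[List[str]]:
    """All sentences reachable by applying one modification operator."""
    result = []
    for i, tok in enumerate(tokens):
        for replacement in kb.get(tok.lower(), []):
            result.append(tokens[:i] + [replacement] + tokens[i + 1:])
    return result


def hill_climb(tokens: List[str], kb: Dict[str, List[str]],
               heuristic: Callable[[List[str]], float],
               max_steps: int = 5) -> List[str]:
    """Greedily move to the best neighbour until the heuristic stops improving."""
    current = tokens
    for _ in range(max_steps):
        candidates = neighbours(current, kb)
        if not candidates:
            break
        best = max(candidates, key=heuristic)
        if heuristic(best) <= heuristic(current):
            break  # local maximum reached
        current = best
    return current


start = "I hate all the cats".split()
result = hill_climb(start, TOY_KB, uncertainty)
```

Starting from a sentence the toy model is confident about, the search drifts toward a variant near the decision boundary, which is the kind of example worth sending to the labeler. A beam search would keep several candidates per step instead of only the best one.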
  • 40. Overview ◦ Potential problems of traditional data augmentations in text ◦ Quick overview of active-learning ◦ Definition of new textual modification operators ◦Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 41. Overview ◦ Potential problems of traditional data augmentations in text ◦ Quick overview of active-learning ◦ Definition of new textual modification operators ◦ Applying heuristic-search with modification operators for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 42. Sentence Quality – Human Evaluation ◦ We already talked about how generating a readable sentence can be hard. Do these operators comply with that? ◦ We randomly chose 1000 sentences from each category, and asked “Is this sentence fully readable to you?” Answered “yes”: Original Sentences 96%, HS Sentences 95%, Wikipedia LSTM Sentences 21%.
  • 43. Experiment 1 – Batch Active-Learning (diagram: Labeled → Training; starting with 10 examples)
  • 44. Experiment 1 – Batch Active-Learning (diagram: Unlabeled pool → Inference → if uncertain, Labeling → Labeled → Training; starting with 10 examples)
  • 45. Experiment 1 – Batch Active-Learning (diagram: Labeled → Augment → Generated pool → Inference → if uncertain, Labeling → Labeled → Training; starting with 10 examples). Without any unlabeled data!
  • 46. Datasets ◦ Sentiment Analysis: ◦ CMR: Cornell sentiment polarity dataset ◦ SST: Stanford sentiment treebank, a sentence sentiment analysis dataset ◦ KS: A Kaggle short sentence sentiment analysis dataset ◦ Subjectivity/Objectivity Detection ◦ SUBJ: Cornell sentence subjective / objective dataset ◦ Offensive-language and Hate-speech Detection: ◦ HS: Hate speech and offensive language classification dataset
  • 47. Compared Methods ◦ Our methods: ◦ Uncertainty sampling Hill-climbing MQ synthesis (US-HC-MQ) ◦ Uncertainty sampling Beam-search MQ synthesis (US-BS-MQ) ◦ Stochastic Synthesis (S-MQ) ◦ Competitor Methods: ◦ WordNet-based Synonym-replacement (WNA) (Lecun et al. 2016) ◦ Original examples (IDEAL) ◦ LSTM Generator (RNN) – pretrained on English Wikipedia Uses unlabeled data
  • 48. Results – Experiment 1 • We can see that our methods consistently improve on the initial accuracy • The search-based methods are superior on almost all datasets
  • 49. Experiment 2 - Measuring Label Switch (diagram: Labeled → Training; starting with 10 examples)
  • 50. Experiment 2 - Measuring Label Switch (diagram: Labeled → Augment → Generated pool → Inference → choose the 50 most informative; starting with 10 examples). How many switched their original label?
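The quantity measured in this experiment, the share of generated sentences whose label differs from that of the sentence they were derived from, can be computed as in the sketch below. The function name and the list-based inputs are assumptions for illustration; the labels in the usage lines are made up and are not results from the talk.

```python
from typing import List, Sequence


def label_switch_rate(source_labels: Sequence[int],
                      generated_labels: Sequence[int]) -> float:
    """Percentage of generated instances whose label differs from the source."""
    if len(source_labels) != len(generated_labels):
        raise ValueError("both label sequences must have the same length")
    if not source_labels:
        return 0.0
    switched = sum(1 for src, gen in zip(source_labels, generated_labels)
                   if src != gen)
    return 100.0 * switched / len(source_labels)


# Made-up labels for four generated sentences: two of them switched.
source: List[int] = [1, 1, 0, 0]
generated: List[int] = [1, 0, 0, 1]
rate = label_switch_rate(source, generated)
```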
  • 51. Experiment 2 - Measuring Label Switch ◦ We compare 3 synthesis algorithms: 1. Uncertainty hill-climbing (search-based generation) 2. Stochastic hill-climbing (multiple random operators) 3. Stochastic Synthesis
  • 52. % of switched instances
  • 53. What did we see ? ◦ Potential problems of non-restrictive augmentations in text ◦ Definition of new modification operators (= non-label preserving augmentations) in the textual domain. ◦ Using heuristic-search with modification operators for generating new examples for active-learning. ◦ Empirical evaluation of this method on several datasets
  • 54. I want to thank you for coming ! I want to thank you for arriving ! I would-like to thank you for coming ! I want to condemn you for coming ! I want to thank you for going ! Thanks-for-coming sentence I want to thank you for coming ! Thank You ! Questions ?

Editor's Notes

  1. Hello everyone, my name is Jonathan and today I’ll take you on a tour of non-label-preserving augmentations: what label-preserving means and why we might not want it, the pros and cons of such augmentations, and my own work (to appear in IJCAI20) on defining such operators in the textual domain.
  2. Now let’s get started
  3. Traditional (speaker’s Hebrew gloss: “masorti”)
  4. But we’ll never see augmentations that change this butterfly image to…
  5. Let’s get back to text
  6. Punchline: augmentation in NLP is not trivial like in CV. So the only real “augmentation” from EDA is the synonym replacement.
  7. Even easier to make unreadable sentences
  8. Guaranteed for those who click on that link
  9. If we start with The label is totally different ! Next slide is the summary
  10. Main motivation slide. Iron out message
  11. Traditional (speaker’s Hebrew gloss: “masorti”)
  12. Traditional (speaker’s Hebrew gloss: “masorti”)
  13. Let’s do a quick overview of AL, as we’ll revisit the subject during the rest of the talk
  14. That’s it for the introduction
  15. The experiment was repeated 20 times for statistical significance.
  16. The experiment was repeated 20 times for statistical significance.
  17. Traditional (speaker’s Hebrew gloss: “masorti”)
  18. Traditional (speaker’s Hebrew gloss: “masorti”)
  19. Semantic rules = ‘make sense’. So how did we build the operators in the instance space? For these reasons we have to make sure that our operators keep the resulting sentences legal English sentences.
  20. What’s a knowledge-base ? We’ll get to that later
  21. In order to find
  22. Dep w2v was designed to exhibit these two properties and this example shows this well. Where w2v returned words with related meanings, dep w2v returned other scientists. This property is repeated for many cases; more details can be found in the original dep w2v paper (very recommended)
  23. Dep w2v was designed to exhibit these two properties and this example shows this well. Where w2v returned words with related meanings, dep w2v returned other scientists. This property is repeated for many cases; more details can be found in the original dep w2v paper (very recommended)
  24. In word2vec, for example, we will get topically related words such as “dc comics” for batman. Next slide is search algorithms
  25. Traditional (speaker’s Hebrew gloss: “masorti”)
  26. Traditional (speaker’s Hebrew gloss: “masorti”)
  27. Now let’s look at how it’s done
  28. I left notes to relevant papers in the
  29. Next slide is empirical evaluation
  30. Traditional (speaker’s Hebrew gloss: “masorti”)
  31. Traditional (speaker’s Hebrew gloss: “masorti”)
  32. The RNN wasn’t able to generate proper sentences with only 10 training instances, e.g. ”I am movie .” and ”x this film .”
  33. Original AL setup
  34. The experiment was repeated 20 times for statistical significance.
  35. We test our framework on 5 datasets: 3 sentiment analysis datasets, one subjectivity/objectivity dataset and one hate-speech and offensive-language detection dataset. Objective example: “the movie begins in the past where a young boy named sam . . . (attempts to save celebi from a hunter . )”. Subjective example: “I really liked the movie it was a-lot of fun”.
  36. We compared 2 search-based methods, one using hill-climbing as the search algorithm and another using beam-search. Both algorithms used uncertainty sampling as their heuristic. As there are no other works that perform textual MQs, we chose 3 competitors that do similar augmentation/generation to compare with. First we used an ‘upper-limit’ method we call “IDEAL”: this method uses other original examples and picks the most informative examples from the pool
  37. Show with pointer what I’m talking about. As we can see, the search-based methods (blue, red) are superior in almost all datasets. Another interesting point we can see in the graphs is the squeezing of information I talked about in the example. We can see that initially there is a lot of information to extract from the existing examples and we have high accuracy gains. But after adding a few examples, most information has already been extracted and the plot stops rising (converges). This is a good sign showing that our initial intuition still holds in this case.
  38. The experiment was repeated 20 times for statistical significance.
  39. We can see a clear hierarchy here between uncertainty HC, random HC and S-MQ. This reinforces our hypothesis that using the more sophisticated approaches results in better instances. At the very least it results in more label changes.
  40. The idea is very general, and might be useful in other domains where even unlabeled data is scarce.
  41. I wanted to thank everyone who came today for the support along this not-so-simple road, and I especially wanted to thank my professor, Shaul Markovitch, who gave me amazing freedom in this work, supported me all the way, and even let me travel to South America for 3 months during my studies, something decidedly uncommon here in the faculty. So thank you all, and thank you, Shaul.