2023 EMNLP Day
San Kim
2023.01.10
Dictionary-Assisted Supervised Contrastive Learning
Dictionary-assisted supervised contrastive learning (DASCL)
• A way to leverage specialized dictionaries when finetuning pretrained language models
Dictionary Examples
• Opinion Lexicon
• Lexicoder Sentiment Dictionary
Findings
• Finetuning with the DASCL objective combined with cross-entropy improves classification
performance across a variety of applications
• Baselines: cross-entropy alone and supervised contrastive learning (SCL; Gunel et al., 2021)
• The greatest improvements appear in few-shot learning settings
• Improvements also hold when using the entire training set
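The combined objective can be sketched as cross-entropy plus a weighted supervised-contrastive term over the projected representations. A minimal pure-Python sketch; the temperature and weighting values below are illustrative, not the paper's settings:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross_entropy(probs, label):
    # Negative log-likelihood of the gold label.
    return -math.log(probs[label])

def supervised_contrastive(z, labels, i, tau=0.3):
    # Anchor i is pulled toward projections with the same label and
    # pushed away from the rest. Assumes every anchor has at least
    # one same-label example in the batch.
    pos = [p for p in range(len(z)) if p != i and labels[p] == labels[i]]
    denom = sum(math.exp(dot(z[i], z[a]) / tau)
                for a in range(len(z)) if a != i)
    return -sum(math.log(math.exp(dot(z[i], z[p]) / tau) / denom)
                for p in pos) / len(pos)

def dascl_objective(probs, z, labels, lam=1.0):
    # Cross-entropy on the classifier outputs plus lam times the
    # contrastive term on the projected representations z.
    n = len(labels)
    ce = sum(cross_entropy(probs[i], labels[i]) for i in range(n)) / n
    scl = sum(supervised_contrastive(z, labels, i) for i in range(n)) / n
    return ce + lam * scl
```

Setting `lam=0` recovers plain cross-entropy, which is the baseline the paper compares against.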
Keyword Simplification
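The figure for this slide is not reproduced here; the idea is to replace words that match a dictionary entry with that dictionary's keyword, collapsing lexical variation before encoding. A minimal sketch in which the lexicon contents and keyword choices are made up for illustration:

```python
# Hypothetical sentiment lexicon: dictionary words mapped to one keyword each.
SENTIMENT_LEXICON = {
    "delighted": "good", "superb": "good",
    "horrendous": "bad", "miserable": "bad",
}

def keyword_simplify(text, lexicon=SENTIMENT_LEXICON):
    # Replace every word found in the lexicon with its keyword;
    # leave all other tokens untouched.
    return " ".join(lexicon.get(w.lower(), w) for w in text.split())
```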
Contrastive Learning
• The two RoBERTa networks share the same weights.
• The dimension of the projection layer is arbitrary.
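The two points above can be made concrete: using one encoder object for both the original and the keyword-simplified text is weight sharing by construction, and the projection that follows can map to any output dimension. A toy stand-in (the matrix and inputs below are made up, not RoBERTa):

```python
class TinyEncoder:
    """Stand-in for RoBERTa: one set of weights used for every input."""

    def __init__(self, proj):
        # Projection matrix; its number of rows (the output
        # dimension of the projection layer) is arbitrary.
        self.proj = proj

    def __call__(self, features):
        # Project into the space where the contrastive loss acts.
        return [sum(w * x for w, x in zip(row, features))
                for row in self.proj]

# One encoder object applied to both views == shared weights.
enc = TinyEncoder(proj=[[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])  # 3-d in, 2-d out
z_text = enc([0.2, 0.5, 0.1])
z_simplified = enc([0.2, 0.4, 0.2])
```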
AbuseEval, built on OLID (Offensive Language Identification Dataset) from OffensEval (the
SemEval-2019 shared task)
• Explicit (abuse)
• #ThursdayThoughts- FUCK liberals. Forever.
• @USER @USER SHE IS A FUCKING MESS!! I HATE HER SO MUCH
• @USER Gotohell ! This is NOT Queen for a Day. I believe you less and less with every bit of
bullsh*t you pull. You’re nothing but a lying Demonrat! #MAGA #Trump2020
• Implicit (abuse)
• 4 out of 10 British people are basically full-on racists. 4 out of 10 voters vote for the
Conservatives. Coincidence!???!???
• @USER@USER Oh you are in England? Your views on gun control stopped mattering in 1776.
• @USER@USER Wonder how many children he molested
• @USER Isn’t the coalition for gun control headed up by the lady who was turned down for a
job because she was a bully?
• Not (abusive)
• @USER I miss you bitch!!
• @USER Nigga we’re going next week
Interpreting Language Models with Contrastive Explanations (BP session)
Common Approach
• Why did the LM predict [something]?
• Why did the LM predict “barking”?
• Input: Can you stop the dog from
• Output: barking
Contrastive explanations are more intuitive
• Why did the LM predict [target] instead of [foil]?
• Why did the LM predict “barking” instead of “crying”?
• Input: Can you stop the dog from
• Output: barking
Contrastive explanations for language models
• Gradient Norm
• Contrastive Gradient Norm
• Gradient X Input
• Contrastive Gradient X Input
• Input Erasure
• Contrastive Input Erasure
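Each contrastive variant replaces the target logit with the difference between the target and foil logits. For a toy linear scorer, where the gradient of a logit with respect to the input is just a weight row, Gradient × Input and its contrastive counterpart look like this (the model and inputs are illustrative, not the paper's LMs):

```python
def gradient_x_input(x, W, target):
    # Saliency of each input dimension: gradient of the target
    # logit (here, row W[target]) times the input, elementwise.
    return [w * xi for w, xi in zip(W[target], x)]

def contrastive_gradient_x_input(x, W, target, foil):
    # For logits = W @ x, the gradient of (logit_target - logit_foil)
    # w.r.t. x is W[target] - W[foil]; multiply by the input as above.
    grad = [wt - wf for wt, wf in zip(W[target], W[foil])]
    return [g * xi for g, xi in zip(grad, x)]
```

A dimension can be salient for the target alone yet receive negative contrastive saliency if it supports the foil even more strongly.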
• BLiMP
BLiMP
• ANAPHOR AGREEMENT: the requirement that reflexive pronouns like himself (a.k.a. anaphora)
agree with their antecedents in person, number, gender, and animacy.
• ARGUMENT STRUCTURE: the ability of different verbs to appear with different types of arguments.
For instance, different verbs can appear with a direct object, participate in the causative alternation,
or take an inanimate argument.
• BINDING: the structural relationship between a pronoun and its antecedent. All paradigms
illustrate aspects of Chomsky’s (1981) Principle A. Because coindexation cannot be annotated in
BLiMP, Principles B and C are not illustrated.
• CONTROL/RAISING: syntactic and semantic differences between various types of predicates that
embed an infinitival VP. This includes control, raising, and tough-movement predicates.
• DETERMINER-NOUN AGREEMENT: number agreement between demonstrative determiners (e.g.,
this/these) and the associated noun.
• ELLIPSIS: the possibility of omitting expressions from a sentence. Because this is difficult to
illustrate with sentences of equal length, our paradigms cover only special cases of noun phrase
ellipsis that meet this constraint.
• FILLER-GAP: dependencies arising from phrasal movement in, for example, wh-questions.
• IRREGULAR FORMS: irregular morphology on English past participles (e.g., broken). We are unable
to evaluate models on nonexistent forms like *breaked because such forms are out of the
vocabulary for some LMs.
• ISLAND EFFECTS: restrictions on syntactic environments where the gap in a filler-gap dependency
may occur.
• NPI LICENSING: restrictions on the distribution of negative polarity items like any and ever limited
to, for example, the scope of negation and only.
• QUANTIFIERS: restrictions on the distribution of quantifiers. We cover two such restrictions:
superlative quantifiers (e.g., at least) cannot embed under negation, and definite quantifiers and
determiners cannot be subjects in existential-there constructions.
• SUBJECT-VERB AGREEMENT: subjects and present tense verbs must agree in number.
RULE
• Anaphor Agreement: The gender and the number of a pronoun must agree with its antecedent.
We implement the ‘coref’ rule using spaCy and NeuralCoref to extract all input tokens that are
coreferent with the target token.
• Argument Structure: Certain arguments can only appear with certain types of verbs. For example,
action verbs must often be used with animate objects. We implement the ‘main_verb’ rule using
spaCy to extract the main verb of the input sentence.
• Determiner-Noun Agreement: Demonstrative determiners and the associated noun must agree. We
implement the ‘det_noun’ rule by generating the dependency tree using spaCy and extracting the
determiner of the target noun.
• NPI Licensing: Certain negative polarity items (NPI) are only allowed to appear in certain contexts,
e.g. “never” appears on its own in sentences, while the word “ever” generally must be preceded by
“no”. In all of our examples with NPI licensing, the word “even” is an NPI that can appear in the
acceptable example but not in the unacceptable example, so we create the npi rule that extracts
this NPI.
• Subject-Verb Agreement: The number of the subject and its verb in the present tense must agree.
We implement the ‘subj_verb’ rule by generating the dependency tree using spaCy and extracting
the subject of the target verb.
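The real rules run spaCy's dependency parser; to keep a sketch self-contained, the parse below is hard-coded as (token, head index, dependency label) triples, and the extraction logic mirrors the ‘det_noun’ and ‘subj_verb’ rules described above:

```python
# Toy dependency parse of "These dogs bark": each token is
# (text, head_index, dep_label). In the rules this comes from spaCy.
SENT = [
    ("These", 1, "det"),    # 0: determiner of "dogs"
    ("dogs", 2, "nsubj"),   # 1: subject of "bark"
    ("bark", 2, "ROOT"),    # 2: main verb
]

def det_noun(tokens, noun_idx):
    # Determiner of the target noun: a child with dep label "det".
    return [i for i, (_, head, dep) in enumerate(tokens)
            if head == noun_idx and dep == "det"]

def subj_verb(tokens, verb_idx):
    # Subject of the target verb: a child with a subject label.
    return [i for i, (_, head, dep) in enumerate(tokens)
            if head == verb_idx and dep in ("nsubj", "nsubjpass")]
```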
Alignment Metrics
• Probes Needed
• We measure the number of tokens we need to probe, based on the explanation, to find a
token that is in the known evidence.
• Mean Reciprocal Rank (MRR)
• We calculate the average, over examples, of the reciprocal rank of the first token that is part
of the known evidence when tokens are sorted by descending saliency.
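Both metrics can be computed directly from a saliency ranking. A sketch, assuming saliency scores per token and a set of known-evidence token indices:

```python
def probes_needed(saliency, evidence):
    # Tokens to probe, in descending saliency, until one is in the
    # known-evidence set (1-based count); None if no evidence found.
    order = sorted(range(len(saliency)), key=lambda i: -saliency[i])
    for rank, idx in enumerate(order, start=1):
        if idx in evidence:
            return rank
    return None

def mrr(saliencies, evidences):
    # Mean reciprocal rank of the first evidence token per example.
    return sum(1.0 / probes_needed(s, e)
               for s, e in zip(saliencies, evidences)) / len(saliencies)
```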

More Related Content

Similar to 2023 EMNLP day_san.pptx

Syntax.ppt
Syntax.pptSyntax.ppt
Syntax.ppt
KhenAguinillo
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Abdullah al Mamun
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language Model
Hemantha Kulathilake
 
Syntactic Structures
Syntactic StructuresSyntactic Structures
Syntactic Structures
Francisco Cerna
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
Roelof Pieters
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
Alia Hamwi
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
Traian Rebedea
 
Amharic WSD using WordNet
Amharic WSD using WordNetAmharic WSD using WordNet
Amharic WSD using WordNet
Seid Hassen
 
NLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishNLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for English
Hemantha Kulathilake
 
Syntax
SyntaxSyntax
Types of corpus linguistics Parallel ,aligned...
 Types of corpus linguistics Parallel ,aligned... Types of corpus linguistics Parallel ,aligned...
Types of corpus linguistics Parallel ,aligned...
RajpootBhatti5
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Saurabh Kaushik
 
Unit-1 PPL PPTvvhvmmmmmmmmmmmmmmmmmmmmmm
Unit-1 PPL PPTvvhvmmmmmmmmmmmmmmmmmmmmmmUnit-1 PPL PPTvvhvmmmmmmmmmmmmmmmmmmmmmm
Unit-1 PPL PPTvvhvmmmmmmmmmmmmmmmmmmmmmm
DhruvKushwaha12
 
SSSLW 2017
SSSLW 2017SSSLW 2017
SSSLW 2017
Nobuhiro Kamiya
 
ppt sfl kel 5.pptx
ppt sfl kel 5.pptxppt sfl kel 5.pptx
ppt sfl kel 5.pptx
Bima811001
 
The CLUES database: automated search for linguistic cognates
The CLUES database: automated search for linguistic cognatesThe CLUES database: automated search for linguistic cognates
The CLUES database: automated search for linguistic cognates
Mark Planigale
 
Fasttext(Enriching Word Vectors with Subword Information) 논문 리뷰
Fasttext(Enriching Word Vectors with Subword Information) 논문 리뷰Fasttext(Enriching Word Vectors with Subword Information) 논문 리뷰
Fasttext(Enriching Word Vectors with Subword Information) 논문 리뷰
ssuserc35c0e
 
AINL 2016: Eyecioglu
AINL 2016: EyeciogluAINL 2016: Eyecioglu
AINL 2016: Eyecioglu
Lidia Pivovarova
 
Natural Language parsing.pptx
Natural Language parsing.pptxNatural Language parsing.pptx
Natural Language parsing.pptx
siddhantroy13
 
Challenges in transfer learning in nlp
Challenges in transfer learning in nlpChallenges in transfer learning in nlp
Challenges in transfer learning in nlp
LaraOlmosCamarena
 

Similar to 2023 EMNLP day_san.pptx (20)

Syntax.ppt
Syntax.pptSyntax.ppt
Syntax.ppt
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language Model
 
Syntactic Structures
Syntactic StructuresSyntactic Structures
Syntactic Structures
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
 
Amharic WSD using WordNet
Amharic WSD using WordNetAmharic WSD using WordNet
Amharic WSD using WordNet
 
NLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishNLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for English
 
Syntax
SyntaxSyntax
Syntax
 
Types of corpus linguistics Parallel ,aligned...
 Types of corpus linguistics Parallel ,aligned... Types of corpus linguistics Parallel ,aligned...
Types of corpus linguistics Parallel ,aligned...
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Unit-1 PPL PPTvvhvmmmmmmmmmmmmmmmmmmmmmm
Unit-1 PPL PPTvvhvmmmmmmmmmmmmmmmmmmmmmmUnit-1 PPL PPTvvhvmmmmmmmmmmmmmmmmmmmmmm
Unit-1 PPL PPTvvhvmmmmmmmmmmmmmmmmmmmmmm
 
SSSLW 2017
SSSLW 2017SSSLW 2017
SSSLW 2017
 
ppt sfl kel 5.pptx
ppt sfl kel 5.pptxppt sfl kel 5.pptx
ppt sfl kel 5.pptx
 
The CLUES database: automated search for linguistic cognates
The CLUES database: automated search for linguistic cognatesThe CLUES database: automated search for linguistic cognates
The CLUES database: automated search for linguistic cognates
 
Fasttext(Enriching Word Vectors with Subword Information) 논문 리뷰
Fasttext(Enriching Word Vectors with Subword Information) 논문 리뷰Fasttext(Enriching Word Vectors with Subword Information) 논문 리뷰
Fasttext(Enriching Word Vectors with Subword Information) 논문 리뷰
 
AINL 2016: Eyecioglu
AINL 2016: EyeciogluAINL 2016: Eyecioglu
AINL 2016: Eyecioglu
 
Natural Language parsing.pptx
Natural Language parsing.pptxNatural Language parsing.pptx
Natural Language parsing.pptx
 
Challenges in transfer learning in nlp
Challenges in transfer learning in nlpChallenges in transfer learning in nlp
Challenges in transfer learning in nlp
 

More from San Kim

20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in...
20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in...20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in...
20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in...
San Kim
 
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptxLongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
San Kim
 
slide-acl2022-combined_san.pptx
slide-acl2022-combined_san.pptxslide-acl2022-combined_san.pptx
slide-acl2022-combined_san.pptx
San Kim
 
Compeition-Level Code Generation with AlphaCode.pptx
Compeition-Level Code Generation with AlphaCode.pptxCompeition-Level Code Generation with AlphaCode.pptx
Compeition-Level Code Generation with AlphaCode.pptx
San Kim
 
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu...
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu...Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu...
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu...
San Kim
 
AI2 day.pptx
AI2 day.pptxAI2 day.pptx
AI2 day.pptx
San Kim
 
Temporal reasoning task
Temporal reasoning taskTemporal reasoning task
Temporal reasoning task
San Kim
 
Answering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrievalAnswering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrieval
San Kim
 
Measuring massive multitask language understanding
Measuring massive multitask language understandingMeasuring massive multitask language understanding
Measuring massive multitask language understanding
San Kim
 
Abductive commonsense reasoning
Abductive commonsense reasoningAbductive commonsense reasoning
Abductive commonsense reasoning
San Kim
 
Electra
ElectraElectra
Electra
San Kim
 
XLnet RoBERTa Reformer
XLnet RoBERTa ReformerXLnet RoBERTa Reformer
XLnet RoBERTa Reformer
San Kim
 
Transformer xl
Transformer xlTransformer xl
Transformer xl
San Kim
 
Face recognition v1
Face recognition v1Face recognition v1
Face recognition v1
San Kim
 
Gan seminar
Gan seminarGan seminar
Gan seminar
San Kim
 
Deep learning study 3
Deep learning study 3Deep learning study 3
Deep learning study 3
San Kim
 
Deep learning study 2
Deep learning study 2Deep learning study 2
Deep learning study 2
San Kim
 
Deep learning study 1
Deep learning study 1Deep learning study 1
Deep learning study 1
San Kim
 
Back propagation
Back propagationBack propagation
Back propagation
San Kim
 

More from San Kim (19)

20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in...
20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in...20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in...
20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in...
 
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptxLongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx
 
slide-acl2022-combined_san.pptx
slide-acl2022-combined_san.pptxslide-acl2022-combined_san.pptx
slide-acl2022-combined_san.pptx
 
Compeition-Level Code Generation with AlphaCode.pptx
Compeition-Level Code Generation with AlphaCode.pptxCompeition-Level Code Generation with AlphaCode.pptx
Compeition-Level Code Generation with AlphaCode.pptx
 
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu...
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu...Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu...
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu...
 
AI2 day.pptx
AI2 day.pptxAI2 day.pptx
AI2 day.pptx
 
Temporal reasoning task
Temporal reasoning taskTemporal reasoning task
Temporal reasoning task
 
Answering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrievalAnswering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrieval
 
Measuring massive multitask language understanding
Measuring massive multitask language understandingMeasuring massive multitask language understanding
Measuring massive multitask language understanding
 
Abductive commonsense reasoning
Abductive commonsense reasoningAbductive commonsense reasoning
Abductive commonsense reasoning
 
Electra
ElectraElectra
Electra
 
XLnet RoBERTa Reformer
XLnet RoBERTa ReformerXLnet RoBERTa Reformer
XLnet RoBERTa Reformer
 
Transformer xl
Transformer xlTransformer xl
Transformer xl
 
Face recognition v1
Face recognition v1Face recognition v1
Face recognition v1
 
Gan seminar
Gan seminarGan seminar
Gan seminar
 
Deep learning study 3
Deep learning study 3Deep learning study 3
Deep learning study 3
 
Deep learning study 2
Deep learning study 2Deep learning study 2
Deep learning study 2
 
Deep learning study 1
Deep learning study 1Deep learning study 1
Deep learning study 1
 
Back propagation
Back propagationBack propagation
Back propagation
 

Recently uploaded

RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 

Recently uploaded (20)

RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 

2023 EMNLP day_san.pptx

  • 1. 2023 EMNLP Day San Kim 2023.01.10
  • 2. Dictionary-Assisted Supervised Contrastive Learning Dictionary-assisted supervised contrastive learning(DASCL) • A way to leverage specialized dictionaries when finetuning pretrained language models Dictionary Examples • Opinion Lexicon • Lexicoder Sentiment Dictionary Findings • Finetuning using the DASCL objective combined with cross-entropy improves classification performance metrics across various applications • Comparisons to cross-entropy alone and Supervised contrastive learning(SCL) – Gunel et al. 2021 • Find greatest improvements in few-shot learning settings • Also find improvements when using the entire training set
  • 3. Dictionary-Assisted Supervised Contrastive Learning Keyword Simplification
  • 4. Dictionary-Assisted Supervised Contrastive Learning Keyword Simplification
  • 5. Dictionary-Assisted Supervised Contrastive Learning Contrastive Learning • The two RoBERTa networks share the same weights. • The dimension of the projection layer is arbitrary.
  • 7. Dictionary-Assisted Supervised Contrastive Learning AbuseEval, using OLID(Offensive Language Identification Dataset)-OffensEval(SemEval 2019 shared task) • Explicit (abuse) • #ThursdayThoughts- FUCK liberals. Forever. • @USER @USER SHE IS A FUCKING MESS!! I HATE HER SO MUCH • @USER Gotohell ! This is NOT Queen for a Day. I believe you less and less with every bit of bullsh*t you pull. You’re nothing but a lying Demonrat! #MAGA #Trump2020 • Implicit (abuse) • 4 out of 10 British people are basically full-on racists. 4 out of 10 voters vote for the Conservatives. Coincidence!???!??? • @USER@USER Oh you are in England? Your views on gun control stopped mattering in 1776. • @USER@USER Wonder how many children he molested • @USER Isn’t the coalition for gun control headed up by the lady who was turned down for a job because she was a bully? • Not (abusive) • @USER I miss you bitch!! • @USER Nigga we’re going next week
  • 9. Interpreting Language Models with Contrastive Explanations (BP session) Common Approach • Why did the LM predict [something]? • Why did the LM predict “barking”? • Input: Can you stop the dog from • Output: barking Contrastive explanations are more intuitive • Why did the LM predict [target] instead of [foil]? • Why did the LM predict “barking” instead of “crying”? • Input: Can you stop the dog from • Output: barking
  • 10. Interpreting Language Models with Contrastive Explanations (BP session) Contrastive explanations for language models
  • 11. Interpreting Language Models with Contrastive Explanations (BP session) • Gradient Norm • Contrastive Gradient Norm • Gradient X Input • Contrastive Gradient X Input • Input Erasure • Contrastive Input Erasure
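The contrastive variants above replace the gradient of the target logit with the gradient of the target-minus-foil logit difference. A minimal NumPy sketch with a toy linear LM head (all names and shapes here are hypothetical, not the paper's code; a real implementation backpropagates through the full LM) shows the difference:

```python
import numpy as np

# Toy setup: input token embeddings X (seq_len x d) feed a linear LM head W (vocab x d).
# With a linear head the gradients are analytic, so the sketch stays self-contained.
rng = np.random.default_rng(0)
seq_len, d, vocab = 4, 8, 10
X = rng.normal(size=(seq_len, d))          # token embeddings
W = rng.normal(size=(vocab, d))            # LM output head
target, foil = 3, 7                        # e.g. "barking" vs "crying"

# Plain gradient x input for the target logit: d(logit_target)/dX_i = W[target]
plain_gxi = X * W[target]                  # (seq_len, d)
plain_saliency = plain_gxi.sum(axis=-1)    # one score per input token

# Contrastive gradient x input: gradient of (logit_target - logit_foil)
contrastive_gxi = X * (W[target] - W[foil])
contrastive_saliency = contrastive_gxi.sum(axis=-1)

print(plain_saliency.round(3))
print(contrastive_saliency.round(3))
```

By linearity, the contrastive saliency is exactly the plain target saliency minus the foil saliency, which is why it highlights only the tokens that discriminate between target and foil.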
  • 12. Interpreting Language Models with Contrastive Explanations (BP session) • BLiMP
  • 13. Interpreting Language Models with Contrastive Explanations (BP session) BLiMP • ANAPHOR AGREEMENT: the requirement that reflexive pronouns like himself (a.k.a. anaphora) agree with their antecedents in person, number, gender, and animacy. • ARGUMENT STRUCTURE: the ability of different verbs to appear with different types of arguments. For instance, different verbs can appear with a direct object, participate in the causative alternation, or take an inanimate argument. • BINDING: the structural relationship between a pronoun and its antecedent. All paradigms illustrate aspects of Chomsky’s (1981) Principle A. Because coindexation cannot be annotated in BLiMP, Principles B and C are not illustrated. • CONTROL/RAISING: syntactic and semantic differences between various types of predicates that embed an infinitival VP. This includes control, raising, and tough-movement predicates. • DETERMINER-NOUN AGREEMENT: number agreement between demonstrative determiners (e.g., this/these) and the associated noun. • ELLIPSIS: the possibility of omitting expressions from a sentence. Because this is difficult to illustrate with sentences of equal length, our paradigms cover only special cases of noun phrase ellipsis that meet this constraint.
  • 14. Interpreting Language Models with Contrastive Explanations (BP session) BLiMP • FILLER-GAP: dependencies arising from phrasal movement in, for example, wh-questions. • IRREGULAR FORMS: irregular morphology on English past participles (e.g., broken). We are unable to evaluate models on nonexistent forms like *breaked because such forms are out of the vocabulary for some LMs. • ISLAND EFFECTS: restrictions on syntactic environments where the gap in a filler-gap dependency may occur. • NPI LICENSING: restrictions on the distribution of negative polarity items like any and ever limited to, for example, the scope of negation and only. • QUANTIFIERS: restrictions on the distribution of quantifiers. We cover two such restrictions: superlative quantifiers (e.g., at least) cannot embed under negation, and definite quantifiers and determiners cannot be subjects in existential-there constructions. • SUBJECT-VERB AGREEMENT: subjects and present tense verbs must agree in number.
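BLiMP evaluation itself is simple: a model "passes" a minimal pair when it assigns a higher probability to the acceptable sentence than to the unacceptable one. The sketch below substitutes a toy unigram scorer for the real pretrained LM (in practice the scores would be summed token log-probabilities from e.g. GPT-2; the table and names here are illustrative only):

```python
# Toy unigram log-probabilities standing in for a real LM's token scores.
UNIGRAM_LOGPROB = {"the": -1.0, "dog": -3.0, "dogs": -3.2, "barks": -4.0, "bark": -4.5}

def sentence_score(sentence):
    """Sum per-word log-probabilities; unknown words get a low default score."""
    return sum(UNIGRAM_LOGPROB.get(w, -10.0) for w in sentence.split())

def passes_pair(acceptable, unacceptable):
    """A model passes a BLiMP pair when the acceptable sentence scores higher."""
    return sentence_score(acceptable) > sentence_score(unacceptable)

pairs = [("the dog barks", "the dog bark"),
         ("the dogs bark", "the dogs barks")]
accuracy = sum(passes_pair(a, u) for a, u in pairs) / len(pairs)
print(accuracy)  # -> 0.5
```

Note that the unigram stand-in gets exactly half the agreement pairs right, since it cannot condition on the subject at all; this is precisely the kind of syntactic sensitivity BLiMP is designed to probe.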
  • 15. Interpreting Language Models with Contrastive Explanations (BP session)
  • 16. Interpreting Language Models with Contrastive Explanations (BP session) RULE • Anaphor Agreement: The gender and the number of a pronoun must agree with its antecedent. We implement the ‘coref’ rule using spaCy and NeuralCoref to extract all input tokens that are coreferent with the target token • Argument Structure: Certain arguments can only appear with certain types of verbs. For example, action verbs must often be used with animate objects. We implement the ‘main_verb’ rule using spaCy to extract the main verb of the input sentence. • Determiner-Noun Agreement: Demonstrative determiners and the associated noun must agree. We implement the ‘det_noun’ rule by generating the dependency tree using spaCy and extracting the determiner of the target noun.
  • 17. Interpreting Language Models with Contrastive Explanations (BP session) RULE • NPI Licensing: Certain negative polarity items (NPI) are only allowed to appear in certain contexts, e.g. “never” appears on its own in sentences, while the word “ever” generally must be preceded by “no”. In all of our examples with NPI licensing, the word “even” is an NPI that can appear in the acceptable example but not in the unacceptable example, so we create the npi rule that extracts this NPI. • Subject-Verb Agreement: The number of the subject and its verb in the present tense must agree. We implement the ‘subj_verb’ rule by generating the dependency tree using spaCy and extracting the subject of the target verb.
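Each of the rules above reduces to a small query over a dependency parse. The sketch below shows the 'det_noun' rule over a minimal hand-built token structure (the `Token` class and sentence are hypothetical stand-ins; the paper obtains the parse from spaCy):

```python
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    dep: str        # dependency label, e.g. "det" for determiner
    head_i: int     # index of the token's syntactic head
    i: int          # index of the token itself

def det_noun_rule(tokens, target_i):
    """Return indices of determiners whose syntactic head is the target noun."""
    return [t.i for t in tokens if t.dep == "det" and t.head_i == target_i]

# "These dogs bark" -- 'These' is the determiner attached to 'dogs'
sent = [Token("These", "det", 1, 0),
        Token("dogs", "nsubj", 2, 1),
        Token("bark", "ROOT", 2, 2)]
print(det_noun_rule(sent, target_i=1))   # -> [0]
```

The 'subj_verb' rule is the same query with `dep == "nsubj"` and the target verb as head; 'coref' differs only in that the token relation comes from a coreference resolver (NeuralCoref) rather than the dependency tree.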
  • 18. Interpreting Language Models with Contrastive Explanations (BP session) Alignment Metrics • Probes Needed • We measure the number of tokens we need to probe, based on the explanation, to find a token that is in the known evidence. • Mean Reciprocal Rank (MRR) • We calculate the average of the inverse of the rank of the first token that is part of the known evidence when the tokens are sorted in descending saliency.
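Both alignment metrics come down to ranking tokens by saliency and locating the first hit in the known evidence set. A minimal sketch (function names hypothetical):

```python
import numpy as np

def probes_needed(saliency, evidence):
    """Probe tokens in descending saliency order; return the 1-based rank
    of the first token that belongs to the known evidence set."""
    order = np.argsort(-np.asarray(saliency))   # indices, highest saliency first
    for rank, idx in enumerate(order, start=1):
        if idx in evidence:
            return rank
    return None  # no evidence token in the sequence

def reciprocal_rank(saliency, evidence):
    """Inverse rank of the first evidence token (0.0 if none is found);
    MRR is this value averaged over examples."""
    rank = probes_needed(saliency, evidence)
    return 0.0 if rank is None else 1.0 / rank

saliency = [0.1, 0.9, 0.4, 0.2]
evidence = {2}   # token index 2 is the known evidence
print(probes_needed(saliency, evidence))    # -> 2 (token 1 is probed first, then token 2)
print(reciprocal_rank(saliency, evidence))  # -> 0.5
```

Lower probes-needed and higher MRR both mean the explanation ranks the linguistically relevant tokens closer to the top.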
  • 19.-30. Interpreting Language Models with Contrastive Explanations (BP session) (slides 19-30 are figure-only result slides)