SlideShare a Scribd company logo
1 of 1
Download to read offline
Who’s to say what’s funny?
A computer using Language Models and Deep Learning,
That’s Who!
Xinru Yan & Ted Pedersen
{yanxx418,tpederse}@d.umn.edu
Department of Computer Science University of Minnesota Duluth
The Problem
• Traditional humor detection: binary classification
Our focus: learn a continuous and subjective sense of
humor from tweets submitted to the @midnight show in
response to hashtags by using Ngram Language
Models (LMs) and Deep Learning (DL) methods
• We participated in SemEval-2017 Task 6
#HashtagWars: Learning a Sense of Humor [1]:
• Tweets are in three baskets: top most funny tweet, next
nine most funny tweets and all remaining
• Two Subtasks
• A: Pairwise Comparison – a system should choose a
funnier of two tweets given a hashtag file
• B: Semi-Ranking – a system should categorize tweets
into the right baskets given a hashtag file
• Dataset
• Tweet Data: provided by the task, 106 hashtag files,
about 21,580 tokens
• News Data: We used 6.2 GB English news, about 2
million tokens [2]
Examples from #BreakUpIn5Words (Trigram LM, News Data)
Tweet @midnight LM DL
It’s not you, it’s meth. funniest funny
?
Hey, can we NOT talk? funny funny
You need your own Netflix funny not funny
Figured I’d try being happy. not funny funny
You’re a Mac, I’m PC not funny not funny
Language Models
Ngram LMs learn humor from training data and allow rank-
ing by assigning probability for each statement [3][4]
Language Model Results
We seek high accuracy for A and low distance for B.
Dataset Ngram Accuracy (A) Distance (B)
news 3 0.627 (4th) 0.872 (1st)
news 2 0.624 0.853
tweet 3 0.397 (8th) 0.967 (8th)
tweet 2 0.406 0.944
• The type and the quantity of the corpora is what
really matters –> more tweet data, less news data
• Bigram LMs performed slightly better than trigram LMs
–> Unigram and character level LMs
Deep Learning
• Humor relies on creative use of language which causes
too many OOV
• Jokes often include puns based on invented words
Barktender #DogJobs
Tinderella #UpdateAFairyTale
• Token-level LMs can not understand such puns
• Character-based CNNs (CharCNN) are not dependent
on observing tokens in training data
• Bigram and trigram LMs only use two or three preceding
words to predict the next word –> LSTMs are good at
making use of sequantial data such as text and are
designed for long-term dependencies
• Some hashtags require tweets to have more than three
words and some funny tweets are mostly made up of
common bigrams or trigrams
Complaining makes it better #AmericaIn4Words
Romantic dinners with the cats #BestWeekendIn5Words
• Ngram LMs do not include external knowledge such as
movie titles and song lyrics –> Create word embeddings
from domain specific materials
• Our plan: use Keras library to train CharCNN + LSTM
LMs on both datasets and investigate ways to include
domain knowledge word embeddings in the CharCNN +
LSTM LM
References
[1] Peter Potash, Alexey Romanov, and Anna Rumshisky.
SemEval-2017 Task 6: #HashtagWars: learning a sense of humor.
In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, BC, August 2017.
[2] EMNLP 2011 SIXTH WORKSHOP ON STATISTICAL MACHINE TRANSLATION.
http://www.statmt.org/wmt11/translation-task.html.
[3] Kenneth Heafield, Ivan Pouzyrevsky, Jonathan H. Clark, and Philipp Koehn.
Scalable modified Kneser-Ney language model estimation.
In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, August 2013.
[4] Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, and Lillian Lee.
You had me at hello: How phrasing affects memorability.
In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA, 2012.

More Related Content

Similar to Who's to say what's funny? A computer using Language Models and Deep Learning, That's Who!

Duluth at Semeval 2017 Task 6 - Language Models in Humor Detection
Duluth at Semeval 2017 Task 6 - Language Models in Humor Detection Duluth at Semeval 2017 Task 6 - Language Models in Humor Detection
Duluth at Semeval 2017 Task 6 - Language Models in Humor Detection University of Minnesota, Duluth
 
Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine ReadingIsabelle Augenstein
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Association for Computational Linguistics
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Association for Computational Linguistics
 
Challenging Reading Comprehension on Daily Conversation: Passage Completion o...
Challenging Reading Comprehension on Daily Conversation: Passage Completion o...Challenging Reading Comprehension on Daily Conversation: Passage Completion o...
Challenging Reading Comprehension on Daily Conversation: Passage Completion o...Jinho Choi
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSumit Raj
 
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...hajinouha0
 
How Anonymous Can Someone be on Twitter?
How Anonymous Can Someone be on Twitter?How Anonymous Can Someone be on Twitter?
How Anonymous Can Someone be on Twitter?George Sam
 
Classifying Non-Referential It for Question Answer Pairs
Classifying Non-Referential It for Question Answer PairsClassifying Non-Referential It for Question Answer Pairs
Classifying Non-Referential It for Question Answer PairsJinho Choi
 
April 10th of 2018 budapest presentation
April 10th of 2018 budapest presentationApril 10th of 2018 budapest presentation
April 10th of 2018 budapest presentationAhmet Bulut
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPAnuj Gupta
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...RajkiranVeluri
 
On the Origins of Memes by Means of Fringe Web Communities - Invited talk ta ...
On the Origins of Memes by Means of Fringe Web Communities - Invited talk ta ...On the Origins of Memes by Means of Fringe Web Communities - Invited talk ta ...
On the Origins of Memes by Means of Fringe Web Communities - Invited talk ta ...Savvas Zannettou
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introductionananth
 
Explore the Effects of Emoticons on Twitter Sentiment Analysis
Explore the Effects of Emoticons on Twitter Sentiment Analysis Explore the Effects of Emoticons on Twitter Sentiment Analysis
Explore the Effects of Emoticons on Twitter Sentiment Analysis csandit
 
TextMiningTwitters
TextMiningTwittersTextMiningTwitters
TextMiningTwittersLiu Chang
 

Similar to Who's to say what's funny? A computer using Language Models and Deep Learning, That's Who! (20)

Duluth at Semeval 2017 Task 6 - Language Models in Humor Detection
Duluth at Semeval 2017 Task 6 - Language Models in Humor Detection Duluth at Semeval 2017 Task 6 - Language Models in Humor Detection
Duluth at Semeval 2017 Task 6 - Language Models in Humor Detection
 
Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine Reading
 
ashu ppt final.pptx
ashu ppt final.pptxashu ppt final.pptx
ashu ppt final.pptx
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
 
Challenging Reading Comprehension on Daily Conversation: Passage Completion o...
Challenging Reading Comprehension on Daily Conversation: Passage Completion o...Challenging Reading Comprehension on Daily Conversation: Passage Completion o...
Challenging Reading Comprehension on Daily Conversation: Passage Completion o...
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
 
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
Sld-Natural-Language-Processing-for-large-volumes-of-human-text-data-Sozzi-Br...
 
How Anonymous Can Someone be on Twitter?
How Anonymous Can Someone be on Twitter?How Anonymous Can Someone be on Twitter?
How Anonymous Can Someone be on Twitter?
 
Deep learning for NLP
Deep learning for NLPDeep learning for NLP
Deep learning for NLP
 
Classifying Non-Referential It for Question Answer Pairs
Classifying Non-Referential It for Question Answer PairsClassifying Non-Referential It for Question Answer Pairs
Classifying Non-Referential It for Question Answer Pairs
 
April 10th of 2018 budapest presentation
April 10th of 2018 budapest presentationApril 10th of 2018 budapest presentation
April 10th of 2018 budapest presentation
 
Abstract
AbstractAbstract
Abstract
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLP
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
 
On the Origins of Memes by Means of Fringe Web Communities - Invited talk ta ...
On the Origins of Memes by Means of Fringe Web Communities - Invited talk ta ...On the Origins of Memes by Means of Fringe Web Communities - Invited talk ta ...
On the Origins of Memes by Means of Fringe Web Communities - Invited talk ta ...
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introduction
 
Aaai 1
Aaai 1Aaai 1
Aaai 1
 
Explore the Effects of Emoticons on Twitter Sentiment Analysis
Explore the Effects of Emoticons on Twitter Sentiment Analysis Explore the Effects of Emoticons on Twitter Sentiment Analysis
Explore the Effects of Emoticons on Twitter Sentiment Analysis
 
TextMiningTwitters
TextMiningTwittersTextMiningTwitters
TextMiningTwitters
 

More from University of Minnesota, Duluth

Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...
Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...
Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...University of Minnesota, Duluth
 
Algorithmic Bias - What is it? Why should we care? What can we do about it?
Algorithmic Bias - What is it? Why should we care? What can we do about it? Algorithmic Bias - What is it? Why should we care? What can we do about it?
Algorithmic Bias - What is it? Why should we care? What can we do about it? University of Minnesota, Duluth
 
Algorithmic Bias : What is it? Why should we care? What can we do about it?
Algorithmic Bias : What is it? Why should we care? What can we do about it?Algorithmic Bias : What is it? Why should we care? What can we do about it?
Algorithmic Bias : What is it? Why should we care? What can we do about it?University of Minnesota, Duluth
 
Duluth at Semeval 2017 Task 7 - Puns upon a Midnight Dreary, Lexical Semantic...
Duluth at Semeval 2017 Task 7 - Puns upon a Midnight Dreary, Lexical Semantic...Duluth at Semeval 2017 Task 7 - Puns upon a Midnight Dreary, Lexical Semantic...
Duluth at Semeval 2017 Task 7 - Puns upon a Midnight Dreary, Lexical Semantic...University of Minnesota, Duluth
 
Puns upon a midnight dreary, lexical semantics for the weak and weary
Puns upon a midnight dreary, lexical semantics for the weak and wearyPuns upon a midnight dreary, lexical semantics for the weak and weary
Puns upon a midnight dreary, lexical semantics for the weak and wearyUniversity of Minnesota, Duluth
 
The horizon isn't found in a dictionary : Identifying emerging word senses a...
The horizon isn't found in a  dictionary : Identifying emerging word senses a...The horizon isn't found in a  dictionary : Identifying emerging word senses a...
The horizon isn't found in a dictionary : Identifying emerging word senses a...University of Minnesota, Duluth
 
Duluth : Word Sense Discrimination in the Service of Lexicography
Duluth : Word Sense Discrimination in the Service of LexicographyDuluth : Word Sense Discrimination in the Service of Lexicography
Duluth : Word Sense Discrimination in the Service of LexicographyUniversity of Minnesota, Duluth
 
MICAI 2013 Tutorial Slides - Measuring the Similarity and Relatedness of Conc...
MICAI 2013 Tutorial Slides - Measuring the Similarity and Relatedness of Conc...MICAI 2013 Tutorial Slides - Measuring the Similarity and Relatedness of Conc...
MICAI 2013 Tutorial Slides - Measuring the Similarity and Relatedness of Conc...University of Minnesota, Duluth
 
What it's like to do a Master's thesis with me (Ted Pedersen)
What it's like to do a Master's thesis with me (Ted Pedersen)What it's like to do a Master's thesis with me (Ted Pedersen)
What it's like to do a Master's thesis with me (Ted Pedersen)University of Minnesota, Duluth
 

More from University of Minnesota, Duluth (20)

Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...
Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...
Muslims in Machine Learning workshop (NeurlPS 2021) - Automatically Identifyi...
 
Automatically Identifying Islamophobia in Social Media
Automatically Identifying Islamophobia in Social MediaAutomatically Identifying Islamophobia in Social Media
Automatically Identifying Islamophobia in Social Media
 
What Makes Hate Speech : an interactive workshop
What Makes Hate Speech : an interactive workshopWhat Makes Hate Speech : an interactive workshop
What Makes Hate Speech : an interactive workshop
 
Algorithmic Bias - What is it? Why should we care? What can we do about it?
Algorithmic Bias - What is it? Why should we care? What can we do about it? Algorithmic Bias - What is it? Why should we care? What can we do about it?
Algorithmic Bias - What is it? Why should we care? What can we do about it?
 
Algorithmic Bias : What is it? Why should we care? What can we do about it?
Algorithmic Bias : What is it? Why should we care? What can we do about it?Algorithmic Bias : What is it? Why should we care? What can we do about it?
Algorithmic Bias : What is it? Why should we care? What can we do about it?
 
Duluth at Semeval 2017 Task 7 - Puns upon a Midnight Dreary, Lexical Semantic...
Duluth at Semeval 2017 Task 7 - Puns upon a Midnight Dreary, Lexical Semantic...Duluth at Semeval 2017 Task 7 - Puns upon a Midnight Dreary, Lexical Semantic...
Duluth at Semeval 2017 Task 7 - Puns upon a Midnight Dreary, Lexical Semantic...
 
Puns upon a midnight dreary, lexical semantics for the weak and weary
Puns upon a midnight dreary, lexical semantics for the weak and wearyPuns upon a midnight dreary, lexical semantics for the weak and weary
Puns upon a midnight dreary, lexical semantics for the weak and weary
 
The horizon isn't found in a dictionary : Identifying emerging word senses a...
The horizon isn't found in a  dictionary : Identifying emerging word senses a...The horizon isn't found in a  dictionary : Identifying emerging word senses a...
The horizon isn't found in a dictionary : Identifying emerging word senses a...
 
Screening Twitter Users for Depression and PTSD
Screening Twitter Users for Depression and PTSDScreening Twitter Users for Depression and PTSD
Screening Twitter Users for Depression and PTSD
 
Duluth : Word Sense Discrimination in the Service of Lexicography
Duluth : Word Sense Discrimination in the Service of LexicographyDuluth : Word Sense Discrimination in the Service of Lexicography
Duluth : Word Sense Discrimination in the Service of Lexicography
 
Pedersen masters-thesis-oct-10-2014
Pedersen masters-thesis-oct-10-2014Pedersen masters-thesis-oct-10-2014
Pedersen masters-thesis-oct-10-2014
 
MICAI 2013 Tutorial Slides - Measuring the Similarity and Relatedness of Conc...
MICAI 2013 Tutorial Slides - Measuring the Similarity and Relatedness of Conc...MICAI 2013 Tutorial Slides - Measuring the Similarity and Relatedness of Conc...
MICAI 2013 Tutorial Slides - Measuring the Similarity and Relatedness of Conc...
 
What it's like to do a Master's thesis with me (Ted Pedersen)
What it's like to do a Master's thesis with me (Ted Pedersen)What it's like to do a Master's thesis with me (Ted Pedersen)
What it's like to do a Master's thesis with me (Ted Pedersen)
 
Pedersen naacl-2013-demo-poster-may25
Pedersen naacl-2013-demo-poster-may25Pedersen naacl-2013-demo-poster-may25
Pedersen naacl-2013-demo-poster-may25
 
Pedersen semeval-2013-poster-may24
Pedersen semeval-2013-poster-may24Pedersen semeval-2013-poster-may24
Pedersen semeval-2013-poster-may24
 
Talk at UAB, April 12, 2013
Talk at UAB, April 12, 2013Talk at UAB, April 12, 2013
Talk at UAB, April 12, 2013
 
Feb20 mayo-webinar-21feb2012
Feb20 mayo-webinar-21feb2012Feb20 mayo-webinar-21feb2012
Feb20 mayo-webinar-21feb2012
 
Ihi2012 semantic-similarity-tutorial-part1
Ihi2012 semantic-similarity-tutorial-part1Ihi2012 semantic-similarity-tutorial-part1
Ihi2012 semantic-similarity-tutorial-part1
 
Pedersen ACL Disco-2011 workshop
Pedersen ACL Disco-2011 workshopPedersen ACL Disco-2011 workshop
Pedersen ACL Disco-2011 workshop
 
Pedersen acl2011-business-meeting
Pedersen acl2011-business-meetingPedersen acl2011-business-meeting
Pedersen acl2011-business-meeting
 

Recently uploaded

BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 

Recently uploaded (20)

BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 

Who's to say what's funny? A computer using Language Models and Deep Learning, That's Who!

  • 1. Who’s to say what’s funny? A computer using Language Models and Deep Learning, That’s Who! Xinru Yan & Ted Pedersen {yanxx418,tpederse}@d.umn.edu Department of Computer Science University of Minnesota Duluth The Problem • Traditional humor detection: binary classification Our focus: learn a continuous and subjective sense of humor from tweets submitted to the @midnight show in response to hashtags by using Ngram Language Models (LMs) and Deep Learning (DL) methods • We participated in SemEval-2017 Task 6 #HashtagWars: Learning a Sense of Humor [1]: • Tweets are in three baskets: top most funny tweet, next nine most funny tweets and all remaining • Two Subtasks • A: Pairwise Comparison – a system should choose a funnier of two tweets given a hashtag file • B: Semi-Ranking – a system should categorize tweets into the right baskets given a hashtag file • Dataset • Tweet Data: provided by the task, 106 hashtag files, about 21,580 tokens • News Data: We used 6.2 GB English news, about 2 million tokens [2] Examples from #BreakUpIn5Words (Trigram LM, News Data) Tweet @midnight LM DL It’s not you, it’s meth. funniest funny ? Hey, can we NOT talk? funny funny You need your own Netflix funny not funny Figured I’d try being happy. not funny funny You’re a Mac, I’m PC not funny not funny Language Models Ngram LMs learn humor from training data and allow rank- ing by assigning probability for each statement [3][4] Language Model Results We seek high accuracy for A and low distance for B. Dataset Ngram Accuracy (A) Distance (B) news 3 0.627 (4th) 0.872 (1st) news 2 0.624 0.853 tweet 3 0.397 (8th) 0.967 (8th) tweet 2 0.406 0.944 • The type and the quantity of the corpora is what really matters –> more tweet data, less news data • Bigram LMs performed slightly better than trigram LMs –> Unigram and character level LMs Deep Learning • Humor relies on creative use of language which causes too many OOV • Jokes often include puns based on invented words Barktender #DogJobs Tinderella #UpdateAFairyTale • Token-level LMs can not understand such puns • Character-based CNNs (CharCNN) are not dependent on observing tokens in training data • Bigram and trigram LMs only use two or three preceding words to predict the next word –> LSTMs are good at making use of sequantial data such as text and are designed for long-term dependencies • Some hashtags require tweets to have more than three words and some funny tweets are mostly made up of common bigrams or trigrams Complaining makes it better #AmericaIn4Words Romantic dinners with the cats #BestWeekendIn5Words • Ngram LMs do not include external knowledge such as movie titles and song lyrics –> Create word embeddings from domain specific materials • Our plan: use Keras library to train CharCNN + LSTM LMs on both datasets and investigate ways to include domain knowledge word embeddings in the CharCNN + LSTM LM References [1] Peter Potash, Alexey Romanov, and Anna Rumshisky. SemEval-2017 Task 6: #HashtagWars: learning a sense of humor. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, BC, August 2017. [2] EMNLP 2011 SIXTH WORKSHOP ON STATISTICAL MACHINE TRANSLATION. http://www.statmt.org/wmt11/translation-task.html. [3] Kenneth Heafield, Ivan Pouzyrevsky, Jonathan H. Clark, and Philipp Koehn. Scalable modified Kneser-Ney language model estimation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, August 2013. [4] Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, and Lillian Lee. You had me at hello: How phrasing affects memorability. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA, 2012.