SlideShare a Scribd company logo
SOCIAL METAPHOR
DETECTION VIA TOPICAL
ANALYSIS
Ting-Hao (Kenneth) Huang windx@cmu.edu
Language Technologies Institute, Carnegie Mellon University2013/10/13
2/17
Selectional Preference
 http://www.flickr.com/photos/uxud/320564052
4/
 http://www.flickr.com/photos/epc/313661914/
Verb has selectional preferences to its arguments.
Can you eat money?
3/17
Research QuestionsCould we capture metaphors
in social media
by selectional preference?
If yes, how?
Is it for verb only ?
If not, why not?
Could topic model help ?
4/17
Outline
 Selectional Preference
 3-Step Framework
1. Pre-processing
2. Modeling & Detection
3. Post-processing
 Topical Analysis
 Experiment & Result
 Conclusion
5/17
Selectional Preference
 Selectional Association (SA) (Resnik, 1997)
p: predicate
c: noun class
6/17
3-step Framework
Pre-processing -
Word Extraction &
Noun Clustering
Modeling & Detection -
SA Outlier Detection
Post-processing -
SA Strength Filter
7/17
Step 1: Pre-processing (1)
 Word Extraction
 Why?
 Parsing & POS tagging is hard on noisy data
 How?
 Using lemma form
 Set minimal term frequency
 Set minimal “POS rate”
 Proportion of occurrence of certain POS
 Predicates should be more strict than the nouns
 Noun: TF > 5, POS rate >= 0.7
 Verb & Adj: TF > 50, POS rate >= 0.8
8/17
Top 100
Similar Nouns
money:
1. funds
2. cash
3. profits
4. millions
5. monies
6. dollars
7. royalties
…
Weighted
Directed Graph
for Nouns
Spectral
Clustering
Step 1: Pre-processing (2)
 Semantic Noun Clustering
9/17
Step 2: Modeling & Detection
 Selectional Association
Another Candidate
Semantic Outlier Word Detection
 “Semantic Coherence” outlier
(Inkpen et al., 2005)
 Based on pair-wise word
semanic similarity
 Very High False Positive
 The influences of “general
words”
 Semantic similarity is not
reliable
Seafood
Fruit
Money
Human
Eat
…
!
SA1
SA2
SA(n-1)
SAn
10/17
Step 3: Post-processing
 SA Strength Filtering (Shutova, et al., 2010)
 SA Strength
 Strong (e.g., filmmake)
 Weak (e.g., “light verb”, put, take, …)
 Predicates with weak selectional preference barely
“violates” their own preference.
11/17
Topical Analysis
Entertainment
Food & Cook
Finance
Politics
Entertainment
Food & Cook
Politics
Finance
12/17
DataData  Online breast cancer support community
 All the public posts from Oct 2001 to Jan 2011.
 90,242 unique users who posted 1,562,459
messages belonging to 68,158 discussion
threads. (Wang, et al., 2012; Wen, et al., 2013)
13/17
Experiment Setting
Pre-processing -
- Stanford NLP/Parser
- 55k nouns, 3k adjs, and 1.8k verbs
Modeling & Detection -
- 3 deps: nsubj, dobj, amod
- Observe negative pairs
Post-processing -
- Follow (Shutova, et al., 2010)
Topical Model: JGibbLDA, 20 topics (k = 20)
14/17
Result
 Most outliers are NOT metaphors
 Parsing Error
“…yearly breast MRI…”: amod(breast, yearly)
 Non-metaphor
“…cancer cells float around in my blood…”: dobj(float, cancer)
 Metonymy
“If John win tomorrow night, …”: dobj(win, tomorrow)
 Only very few metaphors are identified
 “…keep my head occupied …”: nsubj(occupy, head)
 “… my belly has overtaken the boobs …”: nsubj(overtake, belly)
 Topic model does NOT help much
15/17
Discussion & ConclusionCould we capture metaphors
in social media by selectional preference?
If yes, how? Is it for verb only ?
If not, why not? Could topic model help ?
Maybe not by fully-automatic approaches.
Good parsing is challenging on social media.
Outliers of SA are not always metaphors.
Topic modeling does not help much.
Maybe seed-expansion method works better.
No, it could also work for amod dependency.
16/17
Thanks!
 Acknowledgement
 Zi Yang, Prof. Teruko Mitamura, Prof. Eric Nyberg for academic
supports, and Yi-Chia Wang , Dong Nguyen for data collection.
 Supported by the Intelligence Advanced Re-search Projects
Activity (IARPA) via Department of Defense US Army Research
Laboratory contract number W911NF-12-C-0020.
 Main References
 Resnik, P. (1997). Selectional preference and sense disambiguation.
In Proceedings of the ACL’97 SIGLEX Workshop.
 Shutova, E.; Sun, L.; and Korhonen, A. (2010). Metaphor
identification using verb and noun clustering. ACL’10.
 Wang, Y.-C., Kraut, R. E., & Levine, J. M. (2012). To Stay or Leave?
The Relationship of Emotional and Informational Support to
Commitment in Online Health Support Groups. CSCW'2012.
17/17
License
 Except where otherwise noted, content on this slide is licensed under a
Creative Commons. Attribution-ShareAlike 3.0 Unported License.
 Material used
 Backgournd image of slide 2: uxud@Flickr, “Plate of money”, CC BY 2.0
http://www.flickr.com/photos/uxud/3205640524/
 Backgournd image of slide 3, 15: jasonahowie@Flickr, “Social Media apps”,
CC BY 2.0
http://www.flickr.com/photos/jasonahowie/8583949219/
 Backgournd image of slide 12: susangkomenforthecure@Flickr, “Susan G.
Komen walkers gear up and take on Day 1 for breast cancer awareness.”, CC
BY-NC-ND 2.0
http://www.flickr.com/photos/susangkomenforthecure/9623480334/
 Backgournd image of slide 16: jam_project@Flickr, “IMG_0405 [2011-08-23]”,
CC BY-NC-ND 2.0
http://www.flickr.com/photos/jam_project/6075715482/

More Related Content

Similar to Social Metaphor Detection via Topical Analysis

Y12 res meth workbook hanan
Y12 res meth workbook hananY12 res meth workbook hanan
Y12 res meth workbook hanan
hma1
 
GMU Preapplication and Competencies (NEAAHP 2011)
GMU Preapplication and Competencies (NEAAHP 2011)GMU Preapplication and Competencies (NEAAHP 2011)
GMU Preapplication and Competencies (NEAAHP 2011)
Emil Chuck
 
What have we learned from 6 years of implementing learning analytics amongst ...
What have we learned from 6 years of implementing learning analytics amongst ...What have we learned from 6 years of implementing learning analytics amongst ...
What have we learned from 6 years of implementing learning analytics amongst ...
Bart Rienties
 
Assessing Science Learning In 3 Part Harmony
Assessing Science Learning In 3 Part HarmonyAssessing Science Learning In 3 Part Harmony
Assessing Science Learning In 3 Part Harmony
heasulli
 
1 8Annotated Bib
1                                         8Annotated Bib1                                         8Annotated Bib
1 8Annotated Bib
VannaJoy20
 
Mam in Second Life
Mam in Second LifeMam in Second Life
Mam in Second Life
Mark Bell
 
Assignment Surveys and Response RatesAs you read in Chapter 1, .docx
Assignment Surveys and Response RatesAs you read in Chapter 1, .docxAssignment Surveys and Response RatesAs you read in Chapter 1, .docx
Assignment Surveys and Response RatesAs you read in Chapter 1, .docx
rock73
 
Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)
jstaaks
 
1315 estella ma_motorlearning
1315 estella ma_motorlearning1315 estella ma_motorlearning
1315 estella ma_motorlearning
Tian Stella
 
Final 2014 JACKSONVILLE UNIVERSITY faculty and student symposium schedule f...
Final 2014 JACKSONVILLE UNIVERSITY faculty and student symposium schedule   f...Final 2014 JACKSONVILLE UNIVERSITY faculty and student symposium schedule   f...
Final 2014 JACKSONVILLE UNIVERSITY faculty and student symposium schedule f...
pmilano
 
Reference Summary WorksheetReference 1 – Cross-cultural referenc.docx
Reference Summary WorksheetReference 1 – Cross-cultural referenc.docxReference Summary WorksheetReference 1 – Cross-cultural referenc.docx
Reference Summary WorksheetReference 1 – Cross-cultural referenc.docx
simisterchristen
 
Using student data to transform teaching and learning
Using student data to transform teaching and learningUsing student data to transform teaching and learning
Using student data to transform teaching and learning
Bart Rienties
 
A PROCEDURE FOR IDENTIFYING PRECURSORS TOPROBLEM BEHAVIOR.docx
A PROCEDURE FOR IDENTIFYING PRECURSORS TOPROBLEM BEHAVIOR.docxA PROCEDURE FOR IDENTIFYING PRECURSORS TOPROBLEM BEHAVIOR.docx
A PROCEDURE FOR IDENTIFYING PRECURSORS TOPROBLEM BEHAVIOR.docx
bartholomeocoombs
 
Response To Intervention - Tier One Strategies
Response To Intervention - Tier One StrategiesResponse To Intervention - Tier One Strategies
Response To Intervention - Tier One Strategies
Mike Fisher
 
Demystifying Data: Supporting Analyzing and Interpreting Data in the Classroom
Demystifying Data: Supporting Analyzing and Interpreting Data in the Classroom Demystifying Data: Supporting Analyzing and Interpreting Data in the Classroom
Demystifying Data: Supporting Analyzing and Interpreting Data in the Classroom
Jessica Henderson
 
AMait_CV_Feb2017
AMait_CV_Feb2017AMait_CV_Feb2017
AMait_CV_Feb2017
Alexander Mait, PMP, EIT
 
Polling Systems Presentation Edu
Polling Systems Presentation EduPolling Systems Presentation Edu
Polling Systems Presentation Edu
Barry Gregory
 
2014 JU Faculty and Student Symposium schedule
2014 JU Faculty and Student Symposium schedule2014 JU Faculty and Student Symposium schedule
2014 JU Faculty and Student Symposium schedule
pmilano
 
Embedded with the Scientists: The UCLA Experience
Embedded with the Scientists: The UCLA ExperienceEmbedded with the Scientists: The UCLA Experience
Embedded with the Scientists: The UCLA Experience
lmfederer
 
Journal Club - Best Practices for Scientific Computing
Journal Club - Best Practices for Scientific ComputingJournal Club - Best Practices for Scientific Computing
Journal Club - Best Practices for Scientific Computing
Bram Zandbelt
 

Similar to Social Metaphor Detection via Topical Analysis (20)

Y12 res meth workbook hanan
Y12 res meth workbook hananY12 res meth workbook hanan
Y12 res meth workbook hanan
 
GMU Preapplication and Competencies (NEAAHP 2011)
GMU Preapplication and Competencies (NEAAHP 2011)GMU Preapplication and Competencies (NEAAHP 2011)
GMU Preapplication and Competencies (NEAAHP 2011)
 
What have we learned from 6 years of implementing learning analytics amongst ...
What have we learned from 6 years of implementing learning analytics amongst ...What have we learned from 6 years of implementing learning analytics amongst ...
What have we learned from 6 years of implementing learning analytics amongst ...
 
Assessing Science Learning In 3 Part Harmony
Assessing Science Learning In 3 Part HarmonyAssessing Science Learning In 3 Part Harmony
Assessing Science Learning In 3 Part Harmony
 
1 8Annotated Bib
1                                         8Annotated Bib1                                         8Annotated Bib
1 8Annotated Bib
 
Mam in Second Life
Mam in Second LifeMam in Second Life
Mam in Second Life
 
Assignment Surveys and Response RatesAs you read in Chapter 1, .docx
Assignment Surveys and Response RatesAs you read in Chapter 1, .docxAssignment Surveys and Response RatesAs you read in Chapter 1, .docx
Assignment Surveys and Response RatesAs you read in Chapter 1, .docx
 
Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)
 
1315 estella ma_motorlearning
1315 estella ma_motorlearning1315 estella ma_motorlearning
1315 estella ma_motorlearning
 
Final 2014 JACKSONVILLE UNIVERSITY faculty and student symposium schedule f...
Final 2014 JACKSONVILLE UNIVERSITY faculty and student symposium schedule   f...Final 2014 JACKSONVILLE UNIVERSITY faculty and student symposium schedule   f...
Final 2014 JACKSONVILLE UNIVERSITY faculty and student symposium schedule f...
 
Reference Summary WorksheetReference 1 – Cross-cultural referenc.docx
Reference Summary WorksheetReference 1 – Cross-cultural referenc.docxReference Summary WorksheetReference 1 – Cross-cultural referenc.docx
Reference Summary WorksheetReference 1 – Cross-cultural referenc.docx
 
Using student data to transform teaching and learning
Using student data to transform teaching and learningUsing student data to transform teaching and learning
Using student data to transform teaching and learning
 
A PROCEDURE FOR IDENTIFYING PRECURSORS TOPROBLEM BEHAVIOR.docx
A PROCEDURE FOR IDENTIFYING PRECURSORS TOPROBLEM BEHAVIOR.docxA PROCEDURE FOR IDENTIFYING PRECURSORS TOPROBLEM BEHAVIOR.docx
A PROCEDURE FOR IDENTIFYING PRECURSORS TOPROBLEM BEHAVIOR.docx
 
Response To Intervention - Tier One Strategies
Response To Intervention - Tier One StrategiesResponse To Intervention - Tier One Strategies
Response To Intervention - Tier One Strategies
 
Demystifying Data: Supporting Analyzing and Interpreting Data in the Classroom
Demystifying Data: Supporting Analyzing and Interpreting Data in the Classroom Demystifying Data: Supporting Analyzing and Interpreting Data in the Classroom
Demystifying Data: Supporting Analyzing and Interpreting Data in the Classroom
 
AMait_CV_Feb2017
AMait_CV_Feb2017AMait_CV_Feb2017
AMait_CV_Feb2017
 
Polling Systems Presentation Edu
Polling Systems Presentation EduPolling Systems Presentation Edu
Polling Systems Presentation Edu
 
2014 JU Faculty and Student Symposium schedule
2014 JU Faculty and Student Symposium schedule2014 JU Faculty and Student Symposium schedule
2014 JU Faculty and Student Symposium schedule
 
Embedded with the Scientists: The UCLA Experience
Embedded with the Scientists: The UCLA ExperienceEmbedded with the Scientists: The UCLA Experience
Embedded with the Scientists: The UCLA Experience
 
Journal Club - Best Practices for Scientific Computing
Journal Club - Best Practices for Scientific ComputingJournal Club - Best Practices for Scientific Computing
Journal Club - Best Practices for Scientific Computing
 

More from Ting-Hao Huang

A Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over TimeA Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
Ting-Hao Huang
 
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Ting-Hao Huang
 
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
Ting-Hao Huang
 
Real-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity ExtractionReal-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity Extraction
Ting-Hao Huang
 
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over TimeA Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
Ting-Hao Huang
 
Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)
Ting-Hao Huang
 
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIsGuardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Ting-Hao Huang
 

More from Ting-Hao Huang (7)

A Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over TimeA Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
 
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
 
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
 
Real-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity ExtractionReal-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity Extraction
 
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over TimeA Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
 
Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)
 
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIsGuardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
 

Recently uploaded

20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
David Brossard
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
fredae14
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 

Recently uploaded (20)

20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 

Social Metaphor Detection via Topical Analysis

  • 1. SOCIAL METAPHOR DETECTION VIA TOPICAL ANALYSIS Ting-Hao (Kenneth) Huang windx@cmu.edu Language Technologies Institute, Carnegie Mellon University2013/10/13
  • 2. 2/17 Selectional Preference  http://www.flickr.com/photos/uxud/320564052 4/  http://www.flickr.com/photos/epc/313661914/ Verb has selectional preferences to its arguments. Can you eat money?
  • 3. 3/17 Research QuestionsCould we capture metaphors in social media by selectional preference? If yes, how? Is it for verb only ? If not, why not? Could topic model help ?
  • 4. 4/17 Outline  Selectional Preference  3-Step Framework 1. Pre-processing 2. Modeling & Detection 3. Post-processing  Topical Analysis  Experiment & Result  Conclusion
  • 5. 5/17 Selectional Preference  Selectional Association (SA) (Resnik, 1997) p: predicate c: noun class
  • 6. 6/17 3-step Framework Pre-processing - Word Extraction & Noun Clustering Modeling & Detection - SA Outlier Detection Post-processing - SA Strength Filter
  • 7. 7/17 Step 1: Pre-processing (1)  Word Extraction  Why?  Parsing & POS tagging is hard on noisy data  How?  Using lemma form  Set minimal term frequency  Set minimal “POS rate”  Proportion of occurrence of certain POS  Predicates should be more strict than the nouns  Noun: TF > 5, POS rate >= 0.7  Verb & Adj: TF > 50, POS rate >= 0.8
  • 8. 8/17 Top 100 Similar Nouns money: 1. funds 2. cash 3. profits 4. millions 5. monies 6. dollars 7. royalties … Weighted Directed Graph for Nouns Spectral Clustering Step 1: Pre-processing (2)  Semantic Noun Clustering
  • 9. 9/17 Step 2: Modeling & Detection  Selectional Association Another Candidate Semantic Outlier Word Detection  “Semantic Coherence” outlier (Inkpen et al., 2005)  Based on pair-wise word semanic similarity  Very High False Positive  The influences of “general words”  Semantic similarity is not reliable Seafood Fruit Money Human Eat … ! SA1 SA2 SA(n-1) SAn
  • 10. 10/17 Step 3: Post-processing  SA Strength Filtering (Shutova, et al., 2010)  SA Strength  Strong (e.g., filmmake)  Weak (e.g., “light verb”, put, take, …)  Predicates with weak selectional preference barely “violates” their own preference.
  • 11. 11/17 Topical Analysis Entertainment Food & Cook Finance Politics Entertainment Food & Cook Politics Finance
  • 12. 12/17 DataData  Online breast cancer support community  All the public posts from Oct 2001 to Jan 2011.  90,242 unique users who posted 1,562,459 messages belonging to 68,158 discussion threads. (Wang, et al., 2012; Wen, et al., 2013)
  • 13. 13/17 Experiment Setting Pre-processing - - Stanford NLP/Parser - 55k nouns, 3k adjs, and 1.8k verbs Modeling & Detection - - 3 deps: nsubj, dobj, amod - Observe negative pairs Post-processing - - Follow (Shutova, et al., 2010) Topical Model: JGibbLDA, 20 topics (k = 20)
  • 14. 14/17 Result  Most outliers are NOT metaphors  Parsing Error “…yearly breast MRI…”: amod(breast, yearly)  Non-metaphor “…cancer cells float around in my blood…”: dobj(float, cancer)  Metonymy “If John win tomorrow night, …”: dobj(win, tomorrow)  Only very few metaphors are identified  “…keep my head occupied …”: nsubj(occupy, head)  “… my belly has overtaken the boobs …”: nsubj(overtake, belly)  Topic model does NOT help much
  • 15. 15/17 Discussion & ConclusionCould we capture metaphors in social media by selectional preference? If yes, how? Is it for verb only ? If not, why not? Could topic model help ? Maybe not by fully-automatic approaches. Good parsing is challenging on social media. Outliers of SA are not always metaphors. Topic modeling does not help much. Maybe seed-expansion method works better. No, it could also work for amod dependency.
  • 16. 16/17 Thanks!  Acknowledgement  Zi Yang, Prof. Teruko Mitamura, Prof. Eric Nyberg for academic supports, and Yi-Chia Wang , Dong Nguyen for data collection.  Supported by the Intelligence Advanced Re-search Projects Activity (IARPA) via Department of Defense US Army Research Laboratory contract number W911NF-12-C-0020.  Main References  Resnik, P. (1997). Selectional preference and sense disambiguation. In Proceedings of the ACL’97 SIGLEX Workshop.  Shutova, E.; Sun, L.; and Korhonen, A. (2010). Metaphor identification using verb and noun clustering. ACL’10.  Wang, Y.-C., Kraut, R. E., & Levine, J. M. (2012). To Stay or Leave? The Relationship of Emotional and Informational Support to Commitment in Online Health Support Groups. CSCW'2012.
  • 17. 17/17 License  Except where otherwise noted, content on this slide is licensed under a Creative Commons. Attribution-ShareAlike 3.0 Unported License.  Material used  Backgournd image of slide 2: uxud@Flickr, “Plate of money”, CC BY 2.0 http://www.flickr.com/photos/uxud/3205640524/  Backgournd image of slide 3, 15: jasonahowie@Flickr, “Social Media apps”, CC BY 2.0 http://www.flickr.com/photos/jasonahowie/8583949219/  Backgournd image of slide 12: susangkomenforthecure@Flickr, “Susan G. Komen walkers gear up and take on Day 1 for breast cancer awareness.”, CC BY-NC-ND 2.0 http://www.flickr.com/photos/susangkomenforthecure/9623480334/  Backgournd image of slide 16: jam_project@Flickr, “IMG_0405 [2011-08-23]”, CC BY-NC-ND 2.0 http://www.flickr.com/photos/jam_project/6075715482/