Investigating Images Related to Twitter Trending Topics
MUSTAFA ILKER SARAC
20801528

UNDERSTANDING AND CLASSIFYING IMAGE TWEETS
ACM-MM 2013

CS531 - Mustafa Ilker SARAC

1/13/2014
Content

• Introduction
• Motivation
• Image-Tweets
• Image and Text Relation
• Visual/Non-Visual Classification
• Experiments
• Initial Results

Introduction

• Image-tweets
• Correlation between a tweet's image and text
• 50% of all posts are image-tweets
• Image-tweets are retweeted more and survive longer

Motivation

• Questions to ask
  - What types of images do users embed?
  - Do the images differ distinctly from images on image/photo-sharing websites like Flickr?
  - Do the textual contents of image-tweets differ from those of text-only posts?
• Contributions
  - Corpus
  - Annotated subset
  - A classifier that distinguishes two subclasses of image-tweets:
    - Visual
    - Non-visual

Image-Tweets

• Corpus
  - Text-only tweets and image-tweets from Weibo
  - 7 months in 2012
  - ~57M tweets
  - Manually annotated ~5K subset

Image-Tweets

• Image characteristics
  - Images are post-processed by Weibo
  - 45.1% of the corpus are image-tweets
  - Images vary in quality and topic
    - 70% of the annotated corpus are natural photographs

Image-Tweets

• Image-tweets vs. text-only tweets: When? What? Why?
  - When: more image-tweets are posted during the daytime
  - What: LDA applied to a subset (~1M tweets) of the corpus
    - k=50 latent topics are learned
  - Why: daily chatter or information sharing

Image and Text Relation

• 99% of image-tweets have text
• Status (event, time, location)
• Logico-semantic

Image and Text Relation

• Visually relevant image-tweets
  - At least one noun or verb corresponds to part of the image
• Non-visual image-tweets
  - Image and text have no visual correspondence
  - Hard to distinguish by looking at the images alone
  - May exhibit emotional relevance

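The visual-relevance rule above can be restated as a small check. This is only a sketch: in the slides the labels come from human annotators, and the part-of-speech tags, the `image_concepts` list, and the function name `is_visually_relevant` are assumptions made for illustration.

```python
def is_visually_relevant(tagged_tokens, image_concepts):
    """Return True if any noun or verb in the text names something shown in the image."""
    # Keep only nouns and verbs, as in the rule on the slide.
    content_words = {word.lower() for word, tag in tagged_tokens
                     if tag in {"NOUN", "VERB"}}
    depicted = {concept.lower() for concept in image_concepts}
    return bool(content_words & depicted)


# Hypothetical tagged tweets and hypothetical lists of things visible in each image.
tweet_a = [("My", "PRON"), ("dog", "NOUN"), ("sleeping", "VERB")]
tweet_b = [("Feeling", "VERB"), ("tired", "ADJ"), ("today", "NOUN")]
relevant_a = is_visually_relevant(tweet_a, ["dog", "sofa"])
relevant_b = is_visually_relevant(tweet_b, ["sunset"])
```
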
Visual/Non-Visual Classification

• Dataset construction
  - Crowdsourcing to label a random subset of the image-tweets as
    - Visual
    - Non-visual
  - Each image is annotated by 3 different subjects
  - 4811 image-tweets annotated
    - 3206 (2/3) visual
    - 1605 (1/3) non-visual
• 3 major types of features are used
  - Text
  - Image
  - Context

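With three annotators per image, one plausible way to turn the votes into a single label is a majority vote. The slides do not say how the votes were aggregated, so the sketch below is an assumption; the tweet IDs and votes are made up. The class shares at the end use the counts reported on the slide.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by most annotators (three votes over two labels cannot tie)."""
    label, _count = Counter(votes).most_common(1)[0]
    return label


# Hypothetical annotations: three votes per image-tweet.
annotations = {
    "tweet_a": ["visual", "visual", "non-visual"],
    "tweet_b": ["non-visual", "non-visual", "visual"],
    "tweet_c": ["visual", "visual", "visual"],
}
labels = {tweet_id: majority_label(votes) for tweet_id, votes in annotations.items()}

# Class balance from the counts on the slide.
visual, non_visual = 3206, 1605
total = visual + non_visual
visual_share = visual / total
non_visual_share = non_visual / total
```

The shares come out close to the 2/3 and 1/3 split quoted on the slide.
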
Visual/Non-Visual Classification

• Text features
  - Binary word features
  - Previously learned topics from LDA
  - Part-of-speech (POS) density features
  - Named entities
  - Microblog-specific features
    - @mentions
    - #hashtags
    - Geolocation
    - URLs

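A minimal sketch of the binary word features and a few of the microblog-specific features listed above. The regular expressions, the feature names, and the simple `\w+` tokenizer are assumptions; Weibo text would need a proper Chinese word segmenter, and the topic, POS, named-entity, and geolocation features are not covered here.

```python
import re

# Simple patterns for microblog markers (illustrative, not the paper's exact rules).
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")


def text_features(text, vocabulary):
    """Build binary word features plus presence flags for mentions, hashtags, and URLs."""
    urls = URL_RE.findall(text)
    stripped = URL_RE.sub(" ", text)  # keep URL fragments out of the word features
    tokens = set(re.findall(r"\w+", stripped.lower()))
    features = {f"word={word}": int(word in tokens) for word in vocabulary}
    features["has_mention"] = int(bool(MENTION_RE.search(stripped)))
    features["has_hashtag"] = int(bool(HASHTAG_RE.search(stripped)))
    features["has_url"] = int(bool(urls))
    return features


example = text_features(
    "Sunset at the beach with @anna #holiday http://t.cn/abc",
    vocabulary=["sunset", "beach", "cat"],
)
```
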
Visual/Non-Visual Classification

• Image features
  - Face detection
  - SIFT features with a bag-of-visual-words representation
    - LDA applied with k=35
• Context features
  - Retweets
  - Comments
  - Follower ratio
  - Posting time, etc.

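The bag-of-visual-words step can be sketched as follows: each local descriptor is assigned to its nearest codeword, and the image is represented by the normalized histogram of assignments. The 2-D toy descriptors and the tiny codebook are assumptions for readability; real SIFT descriptors are 128-dimensional and the codebook is usually learned by clustering, which is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bovw_histogram(descriptors, codebook):
    """Count nearest-codeword assignments and normalize the counts to sum to 1."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda i: squared_distance(descriptor, codebook[i]))
        counts[nearest] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else counts


# Toy codebook of three visual words and four descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.0, 6.0)]
hist = bovw_histogram(descriptors, codebook)
```
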
Experiment

• 10-fold cross-validation with Naïve Bayes is performed
• Macro-averaged F1 score is computed
• Baseline uses only words as features
  - F1 = 64.8
• Each feature is added individually to observe its impact
• With all positive features combined
  - F1 = 70.5

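For reference, macro-averaged F1 is the unweighted mean of the per-class F1 scores, so the smaller non-visual class counts as much as the visual class. The sketch below computes it on made-up predictions and also shows one simple way to form 10 folds; the helper names and the interleaved fold assignment are assumptions, and the Naïve Bayes classifier itself is not shown.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)


def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; every item is in exactly one test fold."""
    folds = [list(range(start, n_items, k)) for start in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Made-up labels and predictions for six image-tweets.
y_true = ["visual", "visual", "visual", "visual", "non-visual", "non-visual"]
y_pred = ["visual", "visual", "visual", "non-visual", "non-visual", "visual"]
score = macro_f1(y_true, y_pred)  # per-class F1 is 0.75 and 0.5
splits = list(k_fold_indices(25, k=10))
```
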
Experiment

[Figure-only slide; its content is not recoverable from the extracted text]

Proposed Work

• Re-rank images of image-tweets returned by Twitter search
• Select good images to represent Trending Topics
• Twitter was scraped and some initial results were obtained using
  - Retweets and favorites as contextual features
  - SIFT as image features to compare images

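As a rough illustration of re-ranking by contextual features, the sketch below orders search results by a score built from retweet and favorite counts. The slides do not give a scoring formula, so the log-scaled sum, the weights, and the field names are all assumptions; SIFT-based image comparison is not included.

```python
import math


def engagement_score(retweets, favorites, w_retweet=1.0, w_favorite=1.0):
    """Hypothetical score: log-scaled counts so a few viral tweets do not dominate."""
    return w_retweet * math.log1p(retweets) + w_favorite * math.log1p(favorites)


def rerank(results):
    """Return results sorted from highest to lowest engagement score."""
    return sorted(
        results,
        key=lambda r: engagement_score(r["retweets"], r["favorites"]),
        reverse=True,
    )


# Made-up search results for one trending topic.
results = [
    {"id": "a", "retweets": 2, "favorites": 1},
    {"id": "b", "retweets": 150, "favorites": 40},
    {"id": "c", "retweets": 10, "favorites": 0},
]
ranked_ids = [r["id"] for r in rerank(results)]
```
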
Initial Results

[Figure-only slide; its content is not recoverable from the extracted text]

Thank You

QUESTIONS?



Editor's Notes

  1. What is the difference