Based on the paper "Understanding and Classifying Image Tweets" (ACM-MM 2013).
Disclaimer: I am not an author of this paper. I used it as the basis for my course project proposal.

    1. Investigating Images Related to Twitter Trending Topics
       Mustafa Ilker SARAC, 20801528
       Understanding and Classifying Image Tweets, ACM-MM 2013
       CS531 - Mustafa Ilker SARAC, 1/13/2014
    2. Content
         • Introduction
         • Motivation
         • Image-Tweets
         • Image and Text Relation
         • Visual/Non-Visual Classification
         • Experiments
         • Initial Results
    3. Introduction
         • Image-tweets
         • Correlation between a tweet's image and text
         • 50% of all posts are image-tweets
         • Image-tweets are retweeted more and survive longer
    4. Motivation
         • Questions to ask
             • What types of images do users embed?
             • Do the images distinctly differ from images on image/photo-sharing websites like Flickr?
             • Do the textual contents of image-tweets differ from posts that are text-only?
         • Contributions
             • Corpus
             • Annotated subset
             • Built a classifier to distinguish two subclasses of image-tweets: visual and non-visual
    5. Image-Tweets
         • Corpus
             • Text-only and image-tweets from Weibo
             • 7 months in 2012
             • ~57M tweets
             • Manually annotated ~5K subset
    6. Image-Tweets
         • Image Characteristics
             • Images are post-processed by Weibo
             • 45.1% of the corpus are image-tweets
             • Images vary in quality and topic
             • 70% of the annotated corpus are natural photographs
    7. Image-Tweets
         • Image-tweets vs. text-only: When? What? Why?
             • When? More image-tweets are posted during daytime
             • What? LDA applied to a ~1M subset of the corpus; k=50 latent topics are learned
             • Why? Daily chatter or information sharing
    8. Image and Text Relation
         • 99% of image-tweets have text
         • Status (event, time, location)
         • Logico-semantic
    9. Image and Text Relation
         • Visually-relevant image-tweets
             • At least one noun or verb corresponds to part of the image
         • Non-visual image-tweets
             • Image and text have no visual correspondence
             • Hard to distinguish by just looking at the images
             • May exhibit emotional relevance
    10. Visual/Non-Visual Classification
         • Dataset Construction
             • Crowdsourcing to label a random subset of the image-tweets as visual or non-visual
             • Each image is annotated by 3 different subjects
             • 4811 image-tweets annotated: 3206 (2/3) visual, 1605 (1/3) non-visual
         • 3 major types of features are used
             • Text
             • Image
             • Context
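
The slide says each image was labeled by 3 subjects but does not say how the three labels were merged into one. A minimal sketch, assuming a simple majority vote (the function name and the sample tweet ids are hypothetical):

```python
from collections import Counter

def majority_label(votes):
    """Return the label most annotators chose, or None if the top two tie.

    Assumption: majority voting. The slides do not state the merge rule.
    """
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no clear majority
    return counts[0][0]

# Hypothetical annotations: three votes per image-tweet.
annotations = {
    "tweet_a": ["visual", "visual", "non-visual"],
    "tweet_b": ["non-visual", "non-visual", "non-visual"],
}
labels = {tweet_id: majority_label(v) for tweet_id, v in annotations.items()}
```

With two classes and an odd number of annotators a tie cannot occur, so the `None` branch only matters if an annotator skips an item.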
    11. Visual/Non-Visual Classification
         • Text Features
             • Binary word features
             • Previously learned topics from LDA
             • Part-of-speech (POS) density features
             • Named entities
             • Microblog-specific features
                 • @mentions
                 • #hashtags
                 • Geolocation
                 • URLs
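
As a rough illustration of the binary word features and the microblog-specific features listed above, the sketch below builds a feature dictionary from raw text. The regular expressions assume Twitter-style `@name` and `#tag` markup (Weibo's conventions differ), and the function name, vocabulary, and sample text are hypothetical.

```python
import re

# Assumed Twitter-style patterns; not taken from the paper.
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")

def text_features(text, vocabulary):
    """Binary word-presence features plus mention/hashtag/URL indicators."""
    has_url = bool(URL_RE.search(text))
    # Remove URLs first so a '#' or '@' inside a link is not miscounted.
    stripped = URL_RE.sub(" ", text)
    tokens = set(re.findall(r"\w+", stripped.lower()))
    features = {f"word={w}": (w in tokens) for w in vocabulary}
    features["has_mention"] = bool(MENTION_RE.search(stripped))
    features["has_hashtag"] = bool(HASHTAG_RE.search(stripped))
    features["has_url"] = has_url
    return features

example = text_features(
    "Look at this sunset #nofilter @alice http://t.co/abc",
    vocabulary=["sunset", "dog"],
)
```

Topic, POS density, named-entity, and geolocation features need external models or metadata and are left out of this sketch.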
    12. Visual/Non-Visual Classification
         • Image Features
             • Face detection
             • SIFT features with bag-of-visual-words representation
             • Applied LDA with k=35
         • Context Features
             • Retweets
             • Comments
             • Follower ratio
             • Posting time, etc.
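
The bag-of-visual-words step can be sketched as follows: each local descriptor is assigned to its nearest codebook entry ("visual word"), and the image is represented by the normalized word counts. This is an illustration under stated assumptions, not the paper's pipeline: real SIFT descriptors are 128-dimensional and the codebook is typically learned by clustering, whereas the toy codebook and 2-D descriptors below are made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bovw_histogram(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda i: squared_distance(d, codebook[i]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)  # image with no descriptors
    return [c / total for c in counts]

# Toy data (hypothetical): 3 visual words, 4 descriptors from one image.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.8], [0.2, 0.1]]
hist = bovw_histogram(descriptors, codebook)  # [0.5, 0.5, 0.0]
```

The resulting word counts play the role that term counts play for text, which is what makes it possible to run a topic model such as LDA over images.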
    13. Experiment
         • 10-fold cross-validation with Naïve Bayes is performed
         • Macro-averaged F1 score is computed
         • The baseline uses only words as features: F1 = 64.8
         • Each feature is added individually to observe its impact
         • Combining all features with a positive impact: F1 = 70.5
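
For reference, macro-averaged F1 is the unweighted mean of the per-class F1 scores, so the smaller non-visual class counts as much as the visual class. The sketch below computes it from scratch and adds a simple fold splitter; the splitter is unshuffled and unstratified, which is an assumption made for brevity, and the sample labels are invented.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; fold i tests items i, i+k, i+2k, ..."""
    for i in range(k):
        test = list(range(i, n, k))
        train = [j for j in range(n) if j % k != i]
        yield train, test

# Invented labels for illustration only.
y_true = ["visual", "visual", "visual", "visual", "non-visual", "non-visual"]
y_pred = ["visual", "visual", "visual", "non-visual", "non-visual", "visual"]
score = macro_f1(y_true, y_pred)  # (0.75 + 0.5) / 2 = 0.625
```

The slide reports F1 on a 0 to 100 scale, while this function returns a value between 0 and 1.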
    14. Experiment
    15. Proposed Work
         • Re-rank images of image-tweets returned by Twitter search
         • Select good images to represent Trending Topics
         • Twitter was scraped and some initial results were obtained using:
             • Retweets and favorites as contextual features
             • SIFT as image features to compare images
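
The slides do not give the scoring function used for re-ranking. As one possible reading of the contextual part, the sketch below orders search results by a log-scaled sum of retweets and favorites; the formula, the field names, and the sample records are all assumptions, and the SIFT-based image comparison is not covered.

```python
import math

def engagement_score(retweets, favorites):
    """Hypothetical score: log scaling dampens very large counts."""
    return math.log1p(retweets) + math.log1p(favorites)

def rerank(tweets):
    """Return tweets sorted by descending engagement score."""
    return sorted(
        tweets,
        key=lambda t: engagement_score(t["retweets"], t["favorites"]),
        reverse=True,
    )

# Invented search results for one trending topic.
results = [
    {"id": "a", "retweets": 2, "favorites": 0},
    {"id": "b", "retweets": 40, "favorites": 15},
    {"id": "c", "retweets": 5, "favorites": 30},
]
ranked_ids = [t["id"] for t in rerank(results)]  # ["b", "c", "a"]
```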
    16. Initial Results
    17. Thank You
         • Questions?
