CERTH @ MediaEval 2012 Social Event Detection Task

Published in: Technology

  1. CERTH @ MediaEval 2012 Social Event Detection Task
     Symeon Papadopoulos, Georgios Petkos, Manos Schinas, Yiannis Kompatsiaris
     Pisa, 4-5 October 2012
  2. The problem
     • Identify social events in tagged photo collections:
       – Challenge 1: Indignados protest @ Madrid
       – Challenge 2: Soccer matches @ Madrid, Hamburg
       – Challenge 3: Technical events @ Germany
     • Alternative formulation:
       – Represent a collection of photos as a graph, where items with a high probability of belonging to the same event are connected.
       – Each event forms a dense sub-graph in it.
       – This points to community detection as a method to address the problem.
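
The graph formulation above can be sketched as follows. This is a minimal illustration, not the authors' code: `same_event_prob` stands in for any pairwise scoring function, and the 0.5 threshold is an assumption. The `density` helper only illustrates what "dense sub-graph" means for a set of nodes.

```python
from itertools import combinations


def build_graph(items, same_event_prob, threshold=0.5):
    """Connect every pair whose same-event probability reaches the threshold."""
    adj = {item: set() for item in items}
    for u, v in combinations(items, 2):
        if same_event_prob(u, v) >= threshold:  # threshold is an assumed value
            adj[u].add(v)
            adj[v].add(u)
    return adj


def density(adj, nodes):
    """Fraction of possible edges present among `nodes` (1.0 means a clique)."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return edges / (len(nodes) * (len(nodes) - 1) / 2)
```

Photos of one event should then form a node set whose density is close to 1.0, while a set mixing several events scores lower.
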
  3. Approach: Step 1, Step 2, Step 3
  4. Graph Creation (1)
     • Graph creation is based on a “Same Class” model:
       – A classifier that predicts whether two images belong to the same event or not
       – Support Vector Machine classifier trained with the data of the 2011 challenge
       – Input features: dissimilarities across user, title, tags, description, time taken, GIST, SURF/VLAD
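
As a rough sketch of the pairwise input to such a classifier, the snippet below computes a small dissimilarity vector for two photos. The `Photo` fields, the use of Jaccard distance for title words and tags, and the hour-based time difference are illustrative assumptions; the slides do not say how each dissimilarity is computed, and the description and visual features (GIST, SURF/VLAD) are left out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Photo:
    # Hypothetical record; the field names are assumptions, not the authors' schema.
    user: str
    title: str
    tags: frozenset
    taken: datetime


def jaccard_distance(a, b):
    """1 minus (intersection size / union size); 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pair_features(p, q):
    """Dissimilarity vector for a photo pair, usable as input to a binary classifier."""
    return [
        0.0 if p.user == q.user else 1.0,                    # different uploader
        jaccard_distance(p.title.lower().split(), q.title.lower().split()),
        jaccard_distance(p.tags, q.tags),
        abs((p.taken - q.taken).total_seconds()) / 3600.0,   # hours apart
    ]
```

A vector like this, labelled "same event" or "different event", would be one training example for the SVM mentioned on the slide.
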
  5. Graph Creation (2)
     • Use the same class model to connect the items of the collection that belong to the same event
     • Retrieve candidate neighbours (~350) to reduce computational cost:
       – 50 with respect to textual features
       – 150 with respect to time
       – 50 with respect to location (when it exists)
       – 100 with respect to visual features
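
The candidate retrieval step could look roughly like this. It is a brute-force sketch under assumptions: the `distances` functions are hypothetical placeholders for per-modality dissimilarities, a return value of `None` marks a missing modality (for example, no geotag), and a real system would likely use an index rather than scanning every item.

```python
import heapq


def candidate_neighbours(query, items, distances, budgets):
    """Union of the k nearest items per modality.

    distances: dict mapping modality name -> function(query, item) returning a
               float, or None when the modality is missing for that pair.
    budgets:   dict mapping modality name -> number of neighbours to keep.
    """
    candidates = set()
    for modality, k in budgets.items():
        dist = distances[modality]
        scored = []
        for item in items:
            if item == query:
                continue
            d = dist(query, item)
            if d is not None:  # skip pairs where the modality is unavailable
                scored.append((d, item))
        nearest = heapq.nsmallest(k, scored, key=lambda pair: pair[0])
        candidates.update(item for _, item in nearest)
    return candidates


# Per-modality budgets quoted on the slide (they sum to 350).
BUDGETS = {"text": 50, "time": 150, "location": 50, "visual": 100}
```

Only the returned candidates are then scored with the same class model, instead of all pairs in the collection.
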
  6. Event Partitioning and Expansion (1)
     • Event partitioning
       – The nodes of the graph are clustered into candidate events by using the Structural Clustering Algorithm for Networks (SCAN).
       – The items clustered together by SCAN are used to obtain an aggregate representation of each candidate social event.
       – Split the candidate events that exceed a predefined time range into shorter events.
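
To give a feel for SCAN, the sketch below computes its structural similarity between two adjacent nodes and finds the "core" nodes from which clusters are grown. It covers only these two building blocks, not the full clustering procedure, and the `eps` and `mu` values used in any call are illustrative rather than the settings used in the runs.

```python
import math


def structural_similarity(adj, u, v):
    """Shared closed neighbourhood of u and v, normalised by the geometric mean."""
    gu = adj[u] | {u}  # closed neighbourhood: neighbours plus the node itself
    gv = adj[v] | {v}
    return len(gu & gv) / math.sqrt(len(gu) * len(gv))


def core_nodes(adj, eps, mu):
    """Nodes with at least `mu` members of their closed neighbourhood at similarity >= eps."""
    cores = set()
    for u in adj:
        similar = [v for v in adj[u] | {u}
                   if structural_similarity(adj, u, v) >= eps]
        if len(similar) >= mu:
            cores.add(u)
    return cores
```

In a triangle with one pendant node attached, the three triangle nodes come out as cores for `eps=0.8, mu=3`, while the pendant node does not.
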
  7. Event Partitioning and Expansion (2)
     • Expansion of the candidate events set
       – Each image that does not belong to any event forms a single-item event.
       – Merge these single-item events into larger clusters by checking location and time.
       – Add the new events to the set of candidate events.
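
One simple way to merge single-item events by location and time is a greedy pass over the photos in time order. The slides do not specify the merging rule, so the strategy, the tuple layout, and the `max_hours` and `max_km` thresholds below are all assumptions made for illustration.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def merge_singletons(photos, max_hours=12.0, max_km=1.0):
    """Greedily group (id, timestamp_hours, (lat, lon)) tuples.

    A photo joins the first existing cluster whose most recent member is close
    in both time and space; otherwise it starts a new cluster.
    """
    clusters = []
    for photo in sorted(photos, key=lambda p: p[1]):
        for cluster in clusters:
            _, t, loc = cluster[-1]
            if photo[1] - t <= max_hours and haversine_km(photo[2], loc) <= max_km:
                cluster.append(photo)
                break
        else:  # no cluster was close enough
            clusters.append([photo])
    return clusters
```

Clusters that end up with more than one photo would be added to the candidate event set.
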
  8. Event Filtering (1)
     • Filter in two ways:
       – By using geo-location (if available)
       – By using tag-based models
     • Geo-location filtering
       – Discard events that are not contained in the bounding box of the specific challenge.
       – 30% of candidate events are discarded.
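
A bounding-box filter of this kind can be sketched as below. Two points are assumptions: an event counts as "contained" only if all of its geotagged photos fall inside the box, and events without any geotag are kept so that the tag-based filter can decide on them. The Madrid box in the usage note is a rough illustrative rectangle, not the box used for the task.

```python
def in_bbox(point, bbox):
    """True if (lat, lon) lies inside bbox = (south, west, north, east)."""
    lat, lon = point
    south, west, north, east = bbox
    return south <= lat <= north and west <= lon <= east


def filter_events_by_bbox(events, bbox):
    """Keep events whose geotagged photos all fall inside the bounding box.

    events: dict mapping event id -> list of (lat, lon) points; the list is
            empty when no photo of the event carries a geotag.
    """
    kept = {}
    for event_id, points in events.items():
        # all() is vacuously true for an empty list, so untagged events pass.
        if all(in_bbox(p, bbox) for p in points):
            kept[event_id] = points
    return kept
```

For example, with `MADRID = (40.3, -3.9, 40.6, -3.5)`, an event photographed in Hamburg would be discarded and an event with no geotags would pass through.
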
  9. Event Filtering (2)
     • Tag-based filtering
       – Build term models by finding the 500 dominant terms for the specific locations and event types.
       – Collect images from Flickr that are relevant to the location or the type of event of interest:
       – Images for Madrid, Hamburg and Germany
       – Images for indignados, soccer and technical events
  10. Event Filtering (3)
     • Tag-based filtering
       – Probability of appearance
       – We compute the ratio of the probability of appearance in the focus set over the probability of appearance in the reference set.
       – Keep the 500 terms with the highest ratio.
       – Jaccard similarity between a tag model and an event's terms
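
The ratio-based term model and the Jaccard comparison can be sketched as follows. This is an illustration under assumptions: a term's probability is taken as the fraction of images whose tags contain it, and additive smoothing (the `smoothing` parameter) is introduced here only to avoid division by zero for terms missing from the reference set; the slides do not say how that case is handled.

```python
from collections import Counter


def term_model(focus_docs, reference_docs, top_k=500, smoothing=1.0):
    """Keep the top_k terms ranked by P(term | focus) / P(term | reference).

    Each doc is an iterable of tags for one image.
    """
    focus = Counter(t for doc in focus_docs for t in set(doc))
    ref = Counter(t for doc in reference_docs for t in set(doc))
    n_focus, n_ref = len(focus_docs), len(reference_docs)

    def ratio(term):
        p_focus = (focus[term] + smoothing) / (n_focus + smoothing)
        p_ref = (ref[term] + smoothing) / (n_ref + smoothing)
        return p_focus / p_ref

    # Highest ratio first; ties broken alphabetically for determinism.
    ranked = sorted(focus, key=lambda t: (-ratio(t), t))
    return set(ranked[:top_k])


def jaccard(a, b):
    """Intersection size over union size; 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

A candidate event would then be kept when the Jaccard similarity between its terms and the relevant tag model is high enough; the cut-off value is not given in the slides.
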
  11. Evaluation
     Notation:
       – Run 1: Same class model trained with 10,000 pairs of images.
       – Run 2: Same class model trained with 30,000 pairs of images.
       – Run 3: Same class model of run 1 with a post-processing step.
  12. Discussion (1)
     • Moving from a smaller (run 1) to a larger (run 2) training dataset does not seem to improve performance on most measures, which suggests overfitting.
     • The method fails in challenge 1 because these events are different from those of the training dataset.
     • A good tag model has to be used for classification in the post-filtering step.
  13. Discussion (2)
     • Future actions:
       – Train the same class model with a richer set of data.
       – Explore different graph construction strategies and community detection algorithms.
     • Ways to improve:
       – Better topic classification methods
       – More sophisticated methods for location estimation
  14. Questions
