CERTH @ MediaEval 2012 Social Event Detection Task


  • Note: if a photo cannot be matched with any city, do not filter it out (bias towards higher recall).

    1. CERTH @ MediaEval 2012 Social Event Detection Task
       Manos Schinas, Georgios Petkos, Symeon Papadopoulos, Yiannis Kompatsiaris
       Pisa, 4-5 October 2012
    2. The problem
       • Identify social events in collections of tagged photos:
         – Challenge 1: Technical events @ Germany
         – Challenge 2: Soccer matches @ Madrid, Hamburg
         – Challenge 3: Indignados protest @ Madrid
       • Alternative formulation:
         – Represent a collection of photos as a graph, where items that are likely to belong to the same event are connected.
         – Each event forms a dense sub-graph of that graph.
         – This points to community detection as a method for addressing the problem.
    3. Approach (three-step diagram: Step 1, Step 2, Step 3)
    4. Graph Creation (1)
       • Graph creation relies on a “Same Class” model:
         – A classifier that predicts whether two images belong to the same event or not
         – Support Vector Machine classifier trained on the data of the 2011 challenge
         – Input features: dissimilarities across user, title, tags, description, time taken, GIST, SURF/VLAD
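The slide lists the feature types but not how each dissimilarity is computed. The following is a minimal sketch of a pairwise feature vector for three of them, assuming a 0/1 mismatch for the user, Jaccard distance for tags, and an absolute difference in hours for the time taken. The `Photo` class, the function names and these particular distance choices are illustrative assumptions, not the authors' definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    """Hypothetical photo record; field names are assumptions."""
    user: str
    tags: frozenset
    taken: float  # capture time in seconds since the epoch


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty tag sets count as fully dissimilar."""
    union = a | b
    if not union:
        return 1.0  # assumption: no tags means no evidence of similarity
    return 1.0 - len(a & b) / len(union)


def pair_features(p, q):
    """Dissimilarity vector for a pair of photos (user, tags, time in hours)."""
    return [
        0.0 if p.user == q.user else 1.0,
        jaccard_distance(p.tags, q.tags),
        abs(p.taken - q.taken) / 3600.0,
    ]
```

A vector of this kind, extended with the remaining features (title, description, GIST, SURF/VLAD), is what a pairwise classifier would take as input.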
    5. Graph Creation (2)
       • Use the same class model to connect the items of the collection that belong to the same event
       • Retrieve candidate neighbours (~350) for each item to reduce computational cost:
         – 50 with respect to textual features
         – 150 with respect to time
         – 50 with respect to location (when it exists)
         – 100 with respect to visual features
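The candidate-neighbour idea can be sketched as follows, restricted to the time modality for brevity. The pairwise classifier is only called on the `k` photos closest in time to each item, instead of on all pairs. The `same_event` callable stands in for the trained same class model and is a hypothetical placeholder; the brute-force neighbour search is for clarity only, and a real system would presumably use an index.

```python
def nearest_in_time(times, i, k):
    """Indices of the k photos closest in time to photo i (excluding i)."""
    others = [j for j in range(len(times)) if j != i]
    others.sort(key=lambda j: abs(times[j] - times[i]))
    return others[:k]


def build_edges(times, same_event, k):
    """Undirected edges between photos that the classifier puts in one event.

    same_event(i, j) -> bool is a stand-in for the same class model.
    """
    edges = set()
    for i in range(len(times)):
        for j in nearest_in_time(times, i, k):
            if same_event(i, j):
                edges.add((min(i, j), max(i, j)))
    return edges
```

With several modalities, the candidate sets retrieved per modality would be merged before the classifier is applied.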
    6. Event Partitioning and Expansion (1)
       • Event partitioning
         – The nodes of the graph are clustered into candidate events using the Structural Clustering Algorithm for Networks (SCAN).
         – The items clustered together by SCAN are used to obtain an aggregate representation of each candidate social event.
         – Candidate events that exceed a predefined time range are split into shorter events.
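The slide does not say how an over-long candidate event is split. One simple possibility, shown here purely as an assumption, is to sort the capture times and start a new sub-event whenever the next photo would push the span of the current one past the allowed range. The name `split_by_time` and the parameter `max_span` are hypothetical.

```python
def split_by_time(timestamps, max_span):
    """Split capture times into groups whose span never exceeds max_span.

    Greedy rule (an assumption, not the authors' stated method): a photo
    joins the current group if it lies within max_span of the group's
    earliest photo; otherwise it opens a new group.
    """
    groups = []
    for t in sorted(timestamps):
        if groups and t - groups[-1][0] <= max_span:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups
```

For example, with `max_span=10` the times `[0, 5, 30, 8, 31]` fall into two groups, one around 0 to 8 and one around 30 to 31.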
    7. Event Partitioning and Expansion (2)
       • Expansion of the set of candidate events
         – Each image that does not belong to any event forms a single-item event.
         – These single-item events are merged into larger clusters by checking location and time.
         – The new events are added to the set of candidate events.
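The merging criterion is only described as "checking location and time". A minimal greedy sketch under that description: each single-item event joins the first cluster whose seed photo is within a distance and a time threshold, and otherwise starts a new cluster. The thresholds `max_km` and `max_seconds`, the tuple layout and the greedy strategy are all assumptions made for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def merge_singles(singles, max_km, max_seconds):
    """Greedily cluster (lat, lon, timestamp) items by place and time."""
    clusters = []
    for item in singles:
        for cluster in clusters:
            lat, lon, ts = cluster[0]  # compare against the cluster's seed
            if (abs(item[2] - ts) <= max_seconds
                    and haversine_km(lat, lon, item[0], item[1]) <= max_km):
                cluster.append(item)
                break
        else:
            clusters.append([item])  # no cluster matched: start a new one
    return clusters
```

Comparing against the seed only keeps the sketch short; the result depends on input order, which a production version would likely want to avoid.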
    8. Event Filtering (1)
       • Filter in two ways:
         – By using geo-location (if it exists)
         – By using tag-based models
       • Geo-location filtering
         – Discard events that are not contained in the bounding box of the specific challenge
         – 30% of candidate events are discarded
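A bounding-box filter of this kind can be sketched in a few lines. Events without a known location are kept, in line with the note about biasing towards recall. The box coordinates below are a rough illustrative rectangle around Madrid, not the bounds used in the challenge, and the function names are hypothetical.

```python
# (south, west, north, east) in degrees; illustrative values only.
MADRID_BBOX = (40.3, -3.9, 40.6, -3.5)


def in_bbox(lat, lon, bbox):
    """True if the point lies inside the box (no antimeridian handling)."""
    south, west, north, east = bbox
    return south <= lat <= north and west <= lon <= east


def keep_event(location, bbox):
    """Keep events inside the box, and events whose location is unknown."""
    if location is None:
        return True  # no geo-location: do not discard (favours recall)
    return in_bbox(location[0], location[1], bbox)
```

An event centred in Hamburg would be discarded for the Madrid challenges, while an event with `location=None` would pass through to the tag-based filter.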
    9. Event Filtering (2)
       • Tag-based filtering
         – Build term models by finding the 500 dominant terms for the specific locations and event types.
         – We collect images from Flickr that are relevant to the location or the type of event of interest:
           – Images for Madrid, Hamburg and Germany
           – Images for indignados, soccer and technical events
    10. Event Filtering (3)
        • Tag-based filtering
          – For each term, we compute the ratio of its probability of appearance in the focus set over its probability of appearance in the reference set.
          – Keep the 500 terms with the highest ratio
          – Compute the Jaccard similarity between a tag model and an event's terms
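The term-model construction and the Jaccard comparison can be sketched as below. Probability of appearance is taken here as the fraction of images whose tags contain the term. The additive `smoothing` on the reference probability is an assumption added to avoid division by zero for terms absent from the reference set; the slides do not say how that case is handled. Function names are hypothetical.

```python
from collections import Counter


def term_model(focus_docs, reference_docs, top_k, smoothing=1.0):
    """Top-k terms by P(term | focus) / P(term | reference).

    Each doc is an iterable of terms; focus_docs must be non-empty.
    """
    focus = Counter(t for doc in focus_docs for t in set(doc))
    ref = Counter(t for doc in reference_docs for t in set(doc))
    n_focus, n_ref = len(focus_docs), len(reference_docs)

    def ratio(term):
        p_focus = focus[term] / n_focus
        p_ref = (ref[term] + smoothing) / (n_ref + smoothing)  # smoothed
        return p_focus / p_ref

    # Highest ratio first; ties broken alphabetically for determinism.
    ranked = sorted(focus, key=lambda t: (-ratio(t), t))
    return set(ranked[:top_k])


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

An event would then be kept when `jaccard(model, event_terms)` exceeds some threshold; the threshold itself is not given in the slides.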
    11. Evaluation
        • Notation
          – Run 1: Same class model trained with 10000 pairs of images.
          – Run 2: Same class model trained with 30000 pairs of images.
          – Run 3: Same class model of run 1 with post-processing step.
    12. Discussion (1)
        • Moving from a smaller (run 1) to a larger (run 2) training dataset does not seem to improve most of the performance measures, which points to overfitting
        • The method fails in challenge 1 because these events differ from those in the training dataset
        • A good tag model has to be used for classification in the post-filtering step
    13. Discussion (2)
        • Future actions:
          – Train the same class model with a richer set of data
          – Explore different graph construction strategies and community detection algorithms
        • Ways to improve:
          – Better topic classification methods
          – More sophisticated methods for location estimation
    14. Questions