1. Social Event Detection (SED): Challenges, Dataset and Evaluation. Raphaël Troncy <raphael.troncy@eurecom.fr>, Vasileios Mezaris <bmezaris@iti.gr>, Symeon Papadopoulos <papadop@iti.gr>, Benoit Huet <benoit.huet@eurecom.fr>, Ioannis Kompatsiaris <ikom@iti.gr>
2. What are Events? Events are observable occurrences grouping People, Places and Time. Experiences documented by Media. 01/09/2011 - Social Event Detection (SED) Task - MediaEval 2011, Pisa, Italy - 2
3. Two challenges (type and venue). Challenge 1: Find all soccer events taking place in Barcelona (Spain) and Rome (Italy) in the collection. For each event, provide all photos associated with it. Challenge 2: Find all events that took place in May 2009 in the venue named Paradiso (in Amsterdam, NL) and in the Parc del Forum (in Barcelona, Spain). For each event, provide all photos associated with it.
4. Dataset Construction. Collect 73,645 Flickr photos (May 2009): 98.3% geo-tagged in 5 cities (Amsterdam, Barcelona, London, Paris and Rome); 1.7% (1,294) non geo-tagged, from EventMedia. Altered metadata: geo-tags removed for 80% of the photos (chosen at random); 14,465 photos still geo-tagged. Provide only metadata … but the real media were available to participants on request (2.43 GB). 1% (697) of the photos disappeared in June-July 2011. Who has studied the volatility of Flickr photos?
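The metadata alteration described on this slide (removing geo-tags from a random 80% of the geo-tagged photos) could be sketched as follows. This is only an illustration: the dictionary layout with a "geo" key, the function name strip_geotags and the fixed seed are assumptions, not the organisers' actual procedure.

```python
import random

def strip_geotags(photos, fraction=0.8, seed=0):
    """Return copies of the photo records with the geo-tag removed
    from a random `fraction` of the geo-tagged ones.

    `photos` is a list of dicts with a "geo" key (hypothetical layout);
    photos that already lack a geo-tag are left untouched.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    geotagged = [i for i, p in enumerate(photos) if p.get("geo") is not None]
    n_strip = round(len(geotagged) * fraction)
    to_strip = set(rng.sample(geotagged, n_strip))
    return [dict(p, geo=None) if i in to_strip else dict(p)
            for i, p in enumerate(photos)]

# Toy collection: 100 geo-tagged photos and 5 without a geo-tag.
photos = [{"id": i, "geo": (41.38, 2.17)} for i in range(100)]
photos += [{"id": 100 + i, "geo": None} for i in range(5)]
altered = strip_geotags(photos)
still_tagged = sum(1 for p in altered if p["geo"] is not None)
```

On the toy collection, 20 of the 100 geo-tagged photos keep their location, mirroring the 80% removal on the slide.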
5. Ground Truth and Evaluation Measures. Ground truth: use EventMedia and machine tags (lastfm:event=xxx); manual inspection of photos from Amsterdam and Barcelona; discussion of the corner cases (14 photos discussed for challenge 1; no time for discussion for challenge 2, which had a single assessor). Evaluation measures: F-score (harmonic mean of Precision and Recall); Normalized Mutual Information (NMI), which jointly considers the goodness of the photos retrieved and their correct assignment to different events.
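A minimal sketch of the two evaluation measures, for readers who want to reproduce them on toy data. The slide does not state which NMI normalisation the task used; the version below divides by the mean of the two entropies, which is one common choice and is an assumption here. The function names f_score and nmi are illustrative.

```python
import math
from collections import Counter

def f_score(retrieved, relevant):
    """Harmonic mean of precision and recall over two sets of photo ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)

def nmi(labels_true, labels_pred):
    """Normalized Mutual Information between two event assignments.

    Each list gives, per photo, the event label it is assigned to.
    Normalisation by the mean entropy is an assumption of this sketch.
    """
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in count_joint.items()
    )

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_true, h_pred = entropy(count_true), entropy(count_pred)
    if h_true == 0 or h_pred == 0:
        # Degenerate case: at least one side puts every photo in one event.
        return 1.0 if h_true == h_pred else 0.0
    return 2 * mutual_info / (h_true + h_pred)
```

The F-score only looks at which photos were retrieved, while NMI also rewards grouping them into the right events, which is why the slide lists both.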
6. Who Has Participated? 18 teams registered. 7 teams crossed the finish line. Everybody is present at the workshop!
7. Quick Summary of Approaches. Up to 5 non-constrained runs per challenge. All participants use background knowledge: Last.fm (all), Fbleague (EURECOM), PlayerHistory (QMUL); DBpedia, Freebase, Geonames, WordNet. Classification vs. information retrieval approach: city classifier, topic/venue classifier (CERTH); linear SVM (QMUL); Latent Dirichlet Allocation (LIA); hybrid (ANU). Image processing: ITI, EURECOM and ANU.
12. Conclusion. It was an easy task … BUT people had fun. Looking at next year's SED: Dataset: bigger, more diverse, training vs. test sets. Media: photos and videos. Metadata: include some social network relationships, participation at events. Challenges: detect personal events. Evaluation measures: event granularity? Time/CPU? …
14. What are Events? Events are observable occurrences grouping People, Places and Time. Experiences documented by Media … and announced on the WEB! 02/09/2011 - Social Event Detection (SED) Task - MediaEval 2011, Pisa, Italy - 14
15. Approach. Get background knowledge about occurrences of past events. Information retrieval approach: event information model // metadata + photo query. Diagram labels: Query, Event Occurrences, Matching.
16. Which Prior Knowledge? Challenge 1: 6 past football games in Barcelona and Rome. Challenge 2: 68 past events recorded in Paradiso and Parc del Forum.
17. Event Model and Photo Query. Event: E = {title, geo, time}. Photo: P = {text, geo, time}.
18. Matching Process. Given a photo P and an event E: p(P|E) = p(P.text|E.title) p(P.geo|E.geo) p(P.time|E.time), where δ is the Dirac delta function and N is used for scaling (it varies depending on the run).
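The factorised score on this slide can be illustrated with a small sketch. The slide gives only the product form; the definitions of the three factors were not preserved, so every component below is a hypothetical stand-in: p_text is a title-token overlap, p_geo an indicator on a coordinate box (returning 1 when the photo has no geo-tag), and p_time a delta-like indicator that treats N as a window in days. How δ and N actually enter the authors' model is not recoverable from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple

@dataclass
class Event:
    """E = {title, geo, time}"""
    title: str
    geo: Tuple[float, float]
    time: datetime

@dataclass
class Photo:
    """P = {text, geo, time}; geo may be missing."""
    text: str
    geo: Optional[Tuple[float, float]]
    time: datetime

def p_text(photo, event):
    # Hypothetical: fraction of event-title tokens found in the photo text.
    title_tokens = set(event.title.lower().split())
    photo_tokens = set(photo.text.lower().split())
    if not title_tokens:
        return 0.0
    return len(title_tokens & photo_tokens) / len(title_tokens)

def p_geo(photo, event, max_deg=0.01):
    # Hypothetical: a missing geo-tag is treated as uninformative (factor 1).
    if photo.geo is None:
        return 1.0
    close = (abs(photo.geo[0] - event.geo[0]) <= max_deg
             and abs(photo.geo[1] - event.geo[1]) <= max_deg)
    return 1.0 if close else 0.0

def p_time(photo, event, n_days):
    # Hypothetical: indicator that the photo falls within N days of the event.
    return 1.0 if abs(photo.time - event.time) <= timedelta(days=n_days) else 0.0

def match_score(photo, event, n_days=3):
    """p(P|E) as the product of the text, geo and time factors."""
    return p_text(photo, event) * p_geo(photo, event) * p_time(photo, event, n_days)

game = Event("Barcelona Madrid", (41.38, 2.12), datetime(2009, 5, 2, 20))
same_night = Photo("barcelona madrid camp nou", None, datetime(2009, 5, 3, 1))
week_later = Photo("barcelona madrid camp nou", None, datetime(2009, 5, 10, 1))
```

Because the score is a product, a single zero factor (here, the time window for week_later) rejects the photo regardless of how well its text matches.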
19. Visual Pruning and Owner Refinement. Visual pruning: are photos taken at the event visually similar? Low-level features used: color moments, Gabor texture, edge histogram. L1 distance on the K-nearest neighbors; photos sorted according to the distance. Experimentally, we remove the 5% of photos that are farthest from the center in the visual feature space. Owner refinement: confidence that a media sharer attended an event; an effective way to deal with photos without any textual description.
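The pruning step can be sketched roughly as below. This is a simplification under stated assumptions: feature vectors are plain lists of floats, the "center" is taken to be the per-dimension mean, and the K-nearest-neighbor part of the slide is omitted because its details are not given. The function names are illustrative.

```python
def l1_distance(a, b):
    """L1 (Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def prune_outliers(features, drop_fraction=0.05):
    """Return the indices of the photos kept after dropping the
    `drop_fraction` of photos farthest from the center.

    The center is assumed to be the mean feature vector.
    """
    if not features:
        return []
    dim = len(features[0])
    center = [sum(f[d] for f in features) / len(features) for d in range(dim)]
    # Sort photo indices from closest to farthest from the center.
    ranked = sorted(range(len(features)),
                    key=lambda i: l1_distance(features[i], center))
    n_drop = int(len(features) * drop_fraction)
    return sorted(ranked[:len(ranked) - n_drop])

# Toy example: 19 similar photos and one visual outlier (index 19).
features = [[0.0, 0.0] for _ in range(19)] + [[100.0, 100.0]]
kept = prune_outliers(features)
```

With 20 photos and a 5% cut, exactly one photo is removed, and in the toy data it is the outlier.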
20. Challenge 1. Run 1: basic Event Identification Model (N=3). Run 2: run 1 + Owner Refinement. Photos for 2 games, while we had knowledge of 6.
21. Challenge 2. Run 1 / Run 3: basic Event Identification Model with N=1 / N=3. Run 2 / Run 4: run 1 / run 3 + Owner Refinement. Run 5: run 3 + Visual Pruning + Owner Refinement.
23. Challenge 2 Results - Parc del Forum
24. Conclusion. Event information model using background knowledge: dedicated resources for sport events; general event directories for popular venues. Querying photos with occurrences of past events. Importance of time for structuring a media collection. The way we used visual analysis didn't add any value.