Social Event Detection at MediaEval 2012: Challenges, Dataset and Evaluation
Raphaël Troncy <firstname.lastname@example.org>
Vasileios Mezaris <email@example.com>
Symeon Papadopoulos <firstname.lastname@example.org>
Emmanouil Schinas <email@example.com>
Ioannis Kompatsiaris <firstname.lastname@example.org>
What are Events?
Events are observable occurrences grouping people, places, time and experiences, documented by media.
04/10/2012 - Social Event Detection (SED) Task - MediaEval 2012, Pisa, Italy
SED: bigger, longer, harder
In 2011:
- 2 challenges
- 73k photos (2.43 GB)
- No training dataset
- 18 teams interested, 7 teams submitted runs
- Considered easy: F-measure = 85% (challenge 1), 69% (challenge 2)
In 2012:
- 3 challenges, 1 carried over from SED 2011
- 167k photos (5.5 GB), with a CC licence check
- Training dataset = the SED 2011 data
- 21 teams interested from 15 countries, 5 teams submitted runs
- Much harder!
Three challenges (type and venue)
1. Find all technical events that took place in Germany in the test collection.
2. Find all soccer events taking place in Hamburg (Germany) and Madrid (Spain) in the collection.
3. Find all demonstration and protest events of the Indignados movement occurring in public places in Madrid in the collection.
For each event, we provided relevant and non-relevant example photos.
Task = detect events and provide all illustrating photos.
Dataset Construction
- Collected 167,332 Flickr photos (Jan 2009 - Dec 2011) from 4,422 unique Flickr users, all under a CC licence
- All geo-tagged, in 5 cities: Barcelona (72,255), Cologne (15,850), Hannover (2,823), Hamburg (16,958), Madrid (59,043), plus 0.22% (403) from EventMedia
- Altered metadata: geo-tags removed for 80% of the photos (at random); 33,466 photos remain geo-tagged
- Only metadata was provided ... but the actual media (5.5 GB) were available to participants on request
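The geo-tag removal step above can be sketched as follows. This is an illustrative reconstruction, not the organizers' actual script; the field names (`lat`, `lon`) and the fixed seed are assumptions. Removing 80% of 167,332 geo-tags at random is consistent with the 33,466 photos that remained geo-tagged.

```python
import random

def anonymize_geotags(photos, drop_fraction=0.8, seed=42):
    """Strip geo-tags from a random fraction of the photo metadata,
    as was done for the SED 2012 test collection (80% of the photos).
    Field names ('lat', 'lon') are illustrative, not the real schema."""
    photos = [dict(p) for p in photos]  # work on copies, keep originals intact
    rng = random.Random(seed)
    to_strip = rng.sample(range(len(photos)), k=round(len(photos) * drop_fraction))
    for i in to_strip:
        photos[i].pop("lat", None)
        photos[i].pop("lon", None)
    return photos
```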
Ground Truth and Evaluation Measures
CrEve annotation tool: http://www.clusttour.gr/creve/
- For each of the 6 collections, review all photos and associate them to events (which have to be created)
- Search by text, geo-coordinates, date and user
- Review annotations made by others
- Use EventMedia and machine tags (upcoming:event=xxx)
Evaluation Measures:
- F-score: the harmonic mean of Precision and Recall
- Normalized Mutual Information (NMI): jointly considers the goodness of the photos retrieved and their correct assignment to the different events
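The two measures can be sketched in a few lines. This is a minimal illustration assuming a flat, non-overlapping assignment of photos to events and an arithmetic-mean normalization for NMI; the task's official scorer may differ in these details (e.g. how unassigned photos are handled).

```python
import math
from collections import Counter

def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def nmi(pred_labels, true_labels):
    """Normalized Mutual Information between a predicted clustering of
    photos into events and the ground-truth event assignment.
    Normalization here: arithmetic mean of the two entropies (an assumption)."""
    n = len(pred_labels)
    pred_counts = Counter(pred_labels)
    true_counts = Counter(true_labels)
    joint = Counter(zip(pred_labels, true_labels))
    mi = sum((n_pt / n) * math.log((n * n_pt) / (pred_counts[p] * true_counts[t]))
             for (p, t), n_pt in joint.items())
    h_pred = -sum((c / n) * math.log(c / n) for c in pred_counts.values())
    h_true = -sum((c / n) * math.log(c / n) for c in true_counts.values())
    denom = (h_pred + h_true) / 2
    return mi / denom if denom > 0 else 1.0
```

A perfect clustering scores NMI = 1, while collapsing all photos into a single event scores 0, which is why NMI complements the per-event F-score.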
Who Has Participated?
- 21 teams registered (18 in 2011)
- 5 teams crossed the finish line (7 in 2011, with 2 overlapping teams)
- One participant missing at the workshop!
Quick Summary of Approaches
2011: all but one participant used background knowledge
- Last.fm (all), Fbleague (EURECOM), PlayerHistory (QMUL)
- DBpedia, Freebase, Geonames, WordNet
2012: all but two participants used a generic approach
- IR approach: query matching clusters (metadata, temporal, spatial): MISIMIS
- Classification approach: topic detection with LDA, city classification with TF-IDF, event detection using peaks in the timeline for the query topics: AUTH-ISSEL
- Learning model using the training data and SVM: CERTH-ITI
- Background knowledge: QMUL, DISI
2012: no approach was fully automatic
- Manual selection of some parameters (e.g. topics)
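The peak-based event detection mentioned above can be sketched as: bucket photos by day and flag days whose count exceeds the mean by a multiple of the standard deviation. This is a hypothetical simplification for illustration, not AUTH-ISSEL's actual implementation; the threshold rule and the parameter `k` are assumptions.

```python
import statistics
from collections import Counter

def peak_days(photo_days, k=1.0):
    """Return the days whose photo count exceeds mean + k * stdev,
    as candidate event days. A simplified stand-in for peak-based
    event detection over a photo timeline; k is an assumed parameter."""
    counts = Counter(photo_days)
    values = list(counts.values())
    threshold = statistics.mean(values) + k * statistics.pstdev(values)
    return sorted(day for day, c in counts.items() if c > threshold)
```

In practice a team would first restrict the timeline to photos matching the query topic (e.g. soccer in Hamburg), then inspect the flagged days, which is where the manual parameter selection noted above comes in.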
Conclusion
Lessons learned:
- Clear winner for all tasks: a generic approach, but with manual selection of the topics
- Background knowledge is still useful when used well
Looking at next year's SED:
- Shlomo Geva (Queensland University of Technology) + Philipp Cimiano (University of Bielefeld)
- Dataset: bigger, more diverse
- Media: photos and videos? (at least 10% videos?)
- Metadata: include some social network relationships, participation at events
- Evaluation measures: event granularity? Time/CPU?