Event Detection via LDA for the MediaEval2012 SED Task


Transcript

  • 1. MediaEval2012 Social Event Detection Task
    Event Detection via LDA for the MediaEval2012 SED Task
    Konstantinos N. Vavliakis, Fani A. Tzima, Pericles A. Mitkas
    Intelligent Systems and Software Engineering Labgroup, http://issel.ee.auth.gr
    Information Technologies Institutes, Centre for Research and Technology - Hellas
    Electrical and Computer Engineering Department, Aristotle University of Thessaloniki
    Thursday, 4 October 2012
  • 2. Social Event Detection at MediaEval 2012
    Goal: Discover social events
    3 Challenges:
      1. Find technical events in Germany
      2. Find all soccer events in Hamburg (Germany) and Madrid (Spain)
      3. Find demonstration and protest events of the Indignados movement in Madrid
  • 3. Methodology
    Pipeline: Pre-processing -> City Classifier -> Topic Identification -> Event Detection -> Event Optimization
    Pre-processing:
      - Clean Text (remove stop words/html tags)
      - Translate (using Google Translate)
      - Stemming (Porter stemmer)
    City Classifier:
      - City Classifier (tf-idf for each city)
    Topic Identification:
      - Manually Create Topics
      - Identify Topics (per city, using LDA)
      - Select Relevant Topics
    Event Detection:
      - Identify Events (by detecting peaks)
    Event Optimization:
      - Merge Events (of consecutive days)
      - Split Events (by location)
  • 4. Preprocessing
    - Clean text by removing html tags and stop words
    - Translate non-English words
    - Perform stemming using the Porter Stemmer
    E.g.:
    Title                                                     | Cleaned Title                         | English Title                         | Stemmed
    i-wall                                                    | wall                                  | wall                                  | wall
    2009...Pallasso trist // Sad Clown                        | pallasso trist sad clown              | clown sad sad clown                   | clown sad sad clown
    Conjunt Monumental de Sant Pere de Terrassa               | conjunt monumental sant pere terrassa | set monumental sant pere terrassa     | set monument sant pere terrassa
    Seagull in the port                                       | seagull port                          | seagull port                          | seagul port
    Winter doesnt affect the small land of the gnomes - 9/365 | winter doesn affect small land gnomes | winter doesn affect small land gnomes | winter doesn affect small land gnome
    Jan-09                                                    | january                               | january                               | januari
    Tidy chaos - 3/365                                        | tidy chaos                            | tidy chaos                            | tidi chao
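The cleaning step on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the stop-word list and the rule that drops one-letter tokens are assumptions, and translation and stemming are left out because they need external tools (the slides name Google Translate and the Porter stemmer).

```python
import re
from html import unescape

# Assumption: a tiny illustrative stop-word list; the slides do not say which list was used.
STOP_WORDS = {"a", "an", "and", "de", "in", "of", "the"}

TAG_RE = re.compile(r"<[^>]+>")      # anything that looks like an HTML tag
TOKEN_RE = re.compile(r"[^\W\d_]+")  # runs of letters, including accented ones


def clean_title(title, stop_words=STOP_WORDS):
    """Strip HTML tags, lowercase, keep letter-only tokens, drop stop words."""
    text = unescape(TAG_RE.sub(" ", title)).lower()
    # Assumption: one-letter tokens (e.g. the "i" in "i-wall") are treated as noise.
    return [t for t in TOKEN_RE.findall(text) if len(t) > 1 and t not in stop_words]


print(clean_title("i-wall"))
print(clean_title("<b>Seagull</b> in the port"))
print(clean_title("Tidy chaos - 3/365"))
```

A stemmer could then be applied to each returned token; it is kept out of the sketch so that the block runs with the standard library only.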
  • 5. City Classification
    5 cities
    - TF-IDF values of the terms for each city
    - Classified photos according to maximum TF-IDF aggregated value
    Users:
    - Users can not be in more than 2 cities in the same day
    - User statistics
    Results:
    - 4149 non classified photos
    - Very good results for city classification, excellent at country level
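One way to read "maximum TF-IDF aggregated value" is sketched below, under stated assumptions: each city's known terms are treated as a single document, a photo's score for a city is the sum of that city's TF-IDF weights over the photo's terms, and a photo with no positive score stays unclassified. The function names and the sample terms are hypothetical, and the per-user rule from the slide is not modelled.

```python
import math
from collections import Counter


def city_tfidf(city_terms):
    """city_terms: dict mapping city -> list of terms seen in that city's photos."""
    n_cities = len(city_terms)
    # Document frequency: in how many cities does each term occur?
    df = Counter()
    for terms in city_terms.values():
        df.update(set(terms))
    weights = {}
    for city, terms in city_terms.items():
        tf = Counter(terms)
        total = len(terms)
        weights[city] = {
            term: (count / total) * math.log(n_cities / df[term])
            for term, count in tf.items()
        }
    return weights


def classify(photo_terms, weights):
    """Return the city with the highest summed TF-IDF, or None if no term is informative."""
    scores = {
        city: sum(w.get(term, 0.0) for term in photo_terms)
        for city, w in weights.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical sample data for illustration only.
weights = city_tfidf({
    "madrid": ["sol", "puerta", "bernabeu", "photo"],
    "hamburg": ["hafen", "alster", "photo"],
})
print(classify(["puerta", "sol"], weights))
print(classify(["photo"], weights))
```

Terms that occur in every city get weight zero, so a photo described only by such terms is left unclassified, which is one plausible source of non-classified photos.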
  • 6. Topic Identification
    Manually Create Topics
    Photos of a City -> Extract Topics using LDA with Gibbs Sampling -> Select Relevant Topics
    Concept       | Participation in Topic
    sol           | 0.1544
    spanish       | 0.1116
    revolution    | 0.1050
    acampada      | 0.0983
    puerta        | 0.0262
    mayo          | 0.0243
    manifestación | 0.0217
    ….
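For readers unfamiliar with LDA fitted by Gibbs sampling, a small collapsed Gibbs sampler is sketched below. It is a generic textbook version, not the implementation used for these slides; the hyperparameters `alpha` and `beta`, the iteration count and the toy documents are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA. docs is a list of token lists.

    Returns (theta, phi): per-document topic shares and per-topic word shares.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # times word v is in topic k
    topic_total = [0] * n_topics
    assignment = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assignment.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = index[w], assignment[d][i]
                # Remove the token, then resample its topic from the conditional.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignment[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        {vocab[v]: (topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)}
        for k in range(n_topics)
    ]
    return theta, phi


# Toy documents (hypothetical), one token list per photo.
docs = [["sol", "acampada", "sol"], ["goal", "match", "goal"], ["sol", "puerta"]]
theta, phi = lda_gibbs(docs, n_topics=2, n_iter=50)
```

In this sketch `phi[k]` plays the role of the concept/participation table on the slide (word shares within one topic), while `theta[d]` gives the participation of photo `d` in each topic.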
  • 7. Topic Selection
    Manually Create Topics
    Photos of a City -> Extract Topics using LDA with Gibbs Sampling -> Select Relevant Topics
    - Each photo belongs to many topics
    - Select photos containing “indignados” or “acampa” and sum their values per topic
    E.g.:
    PhotoID    | Topic | Participation in Topic
    5776147261 | 7     | 0.72
    5776147261 | 14    | 0.12
    5776147261 | 21    | 0.08
    5776147261 | 6     | 0.02
    5776147261 | 25    | 0.01
    ….
    Topic | Sum
    18    | 456.58
    49    | 223.47
    0     | 27.13
    1     | 24.17
    22    | 23.39
    ….
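The selection rule on this slide (keep photos whose text contains a keyword, then sum their topic participation per topic) can be sketched as below. The function name, the data layout and the second photo are hypothetical; only photo 5776147261 and its participation values come from the slide, and its text is invented for the example.

```python
from collections import defaultdict


def rank_topics(photo_text, photo_topics, keywords):
    """Sum topic participation over photos whose text contains any keyword.

    photo_text:   dict photo_id -> text
    photo_topics: dict photo_id -> {topic_id: participation}
    Returns (topic_id, summed participation) pairs, largest sum first.
    """
    sums = defaultdict(float)
    for photo_id, text in photo_text.items():
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            for topic, share in photo_topics.get(photo_id, {}).items():
                sums[topic] += share
    return sorted(sums.items(), key=lambda item: item[1], reverse=True)


photo_text = {
    "5776147261": "acampada sol indignados",  # text is an assumption
    "p2": "seagull port",                     # hypothetical non-matching photo
}
photo_topics = {
    "5776147261": {7: 0.72, 14: 0.12, 21: 0.08, 6: 0.02, 25: 0.01},
    "p2": {7: 0.9},
}
ranking = rank_topics(photo_text, photo_topics, ["indignados", "acampa"])
print(ranking[:3])
```

The topics with the largest sums would then be the candidates for "relevant" topics; how many to keep is a separate choice that the slide does not spell out.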
  • 8. Event Detection & Optimization
    Event Detection
    - Find photos of selected topics
    - Count photos per day
    - If higher than a threshold add them to a new event
    Event Optimization
    - Merge events happening in consecutive days
    - Split events by geolocation distance
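The counting, thresholding and merging steps above can be sketched as follows. This is an illustration under assumptions: the threshold value is arbitrary, an event is represented simply as a list of days, and the split by geolocation distance is not covered.

```python
from collections import Counter
from datetime import date, timedelta


def detect_events(photo_days, threshold):
    """photo_days: one date per photo of the selected topics.

    A day is a peak when its photo count is higher than the threshold;
    peaks on consecutive days are merged into one event.
    Returns a list of events, each a list of dates in ascending order.
    """
    counts = Counter(photo_days)
    peak_days = sorted(day for day, n in counts.items() if n > threshold)
    events = []
    for day in peak_days:
        if events and day - events[-1][-1] == timedelta(days=1):
            events[-1].append(day)   # continues the previous event
        else:
            events.append([day])     # starts a new event
    return events


# Hypothetical photo dates.
days = [date(2011, 5, 15)] * 3 + [date(2011, 5, 16)] * 2 + [date(2011, 5, 20)]
print(detect_events(days, threshold=1))
```

With these sample dates, 15 and 16 May exceed the threshold and are merged into one two-day event, while 20 May has a single photo and is ignored.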
  • 9. Results - C1: Technical events in Germany
    [Bar chart: Precision, Recall, F-Measure and NMI for five runs: Selected/Total Topics 2/50, 6/50 and 8/50, and two Manual Topic runs]
    Data labels, in extraction order: 94.9, 80.98, 84.58, 76.29, 0.724, 63.35, 0.578, 50.98, 40.52, 35.85, 31.1, 25.31, 26.26, 0.16
  • 10. Results – C2: Soccer Events in Hamburg/Madrid
    [Bar chart: Precision, Recall, F-Measure and NMI for five runs: Selected/Total Topics 1/50, 1/100 and 1/100, and two Manual Topic runs]
    Data labels, in extraction order: 93.49, 93.49, 86.67, 91.21, 88.18, 90.76, 88.18, 0.847, 84, 81.78, 0.85, 75.72, 77.67, 0.768
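As a reminder of how the reported measures relate, the F-measure is taken here to be the harmonic mean of precision and recall. Three of the labels on the C2 chart (93.49, 88.18 and 90.76) are consistent with that formula if read as precision, recall and F-measure of one run; that reading is an assumption, since the transcript does not say which bar each label belongs to.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(round(f_measure(93.49, 88.18), 2))  # 90.76
```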
  • 11. Results – C3: Protest Events of Indignados
    [Bar chart: Precision, Recall, F-Measure and NMI for five runs: Selected/Total Topics 5/100, 5/100 and 3/50, and two Manual Topic runs]
    Data labels, in extraction order: 90.78, 90.78, 90.76, 86.59, 85.38, 88.91, 88.91, 88.53, 89.83, 84.29, 86.11, 73.8, 0.33, 0.347
  • 12. Conclusions
    - Effective and generalized methodology
    - The selection of topics is the key
    - Topics created by LDA give results close to those of manually created topics
    - Really good precision
    - Stemming may improve (slightly) the results
    - Problems in “vague” topics
  • 13. Relevant and Future Work
    - Automatically detect all events from a dataset using detected topics
    - Dynamic merging of topics
    - The concept of important event is socially defined -> Personalized detection
  • 14. Thank You!
    Email: kvavliak@issel.ee.auth.gr