Social Media Analytics for
Graph-Based Event Detection
Dr. Yiannis Kompatsiaris, ikom@iti.gr
Multimedia, Knowledge and Soc...
11th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP 2016)
Thank you for your attention!
ikom@iti.gr
http://mklab.iti.gr
Social Media Analytics for Graph-Based Event Detection

The pervasive use of Online Social Networks (OSN) for networking, communication and search, in tandem with the ubiquitous availability of smartphones, which enables real-time multimedia capturing and sharing, has led to massive amounts of user-generated content and activities being amassed online and made publicly available for analysis and mining. Each content item is associated with an abundance of metadata and related information such as location, tags, comments, favorites and mood indicators, access logs, and so on. At the same time, all this information is implicitly or explicitly interconnected based on various properties such as social links among users, groups, communities, and sharing patterns. These properties transform social media into data sources of an extremely dynamic nature that reflect topics of interest, events, and the evolution of community opinion and focus. Social media processing offers a unique opportunity to structure and extract information and to benefit multiple areas ranging from new media experiences to psychology and marketing. The objective of this talk is to provide an overview of the current research in emerging topics related to applications where social media can act as sensors of real-life phenomena and case studies that reveal valuable insights. After discussing challenges and presenting a generic conceptual architecture, there will be a focus on efficient processing and indexing algorithms that can handle massive amounts of content with application to graph-based event detection and summarization in social media streams.

Published in: Science

  1. Social Media Analytics for Graph-Based Event Detection
     Dr. Yiannis Kompatsiaris, ikom@iti.gr
     Multimedia, Knowledge and Social Media Analytics Lab, Head
     CERTH-ITI
     11th International Workshop on Semantic and Social Media Adaptation and Personalization
  2. Overview
     • Introduction
       – Motivation
       – Challenges
     • Real-world events in Social Media Platforms
       – Detection (Discovery)
       – Monitoring (Representation)
       – Tracking (Evolution)
     • Approaches
       – “Same-event” model
       – Visual event summarization
       – Incremental Large-Scale Event Summarization
     • Conclusions
  3. Pope Francis, Pope Benedict
     2007: iPhone release; 2008: Android release; 2010: iPad release
     http://petapixel.com/2013/03/14/a-starry-sea-of-cameras-at-the-unveiling-of-pope-francis/
  4. 2016: US Presidential Elections
  5. http://blog.tyronesystems.com/how-much-data-is-created-every-minute-by-the-social-media
  6. Social Media aspects and context: Caption, Time, User Profile, Favs, Comms, Tags
  7. rise of the networks
  8. Multi-modal graphs
  9. Social Networks as Graphs
  10. Social Networks as Real-Life Sensors
      • Social networks are a data source with an extremely dynamic nature that reflects events and the evolution of community focus (users’ interests)
      • The huge penetration of smartphones and mobile devices provides real-time and location-based user feedback
      • Transform individually rare but collectively frequent media into meaningful topics, events, points of interest, emotional states and social connections
      • Present them in an efficient way for a variety of applications (news, marketing, science, health, entertainment)
  11. Real-life Social Networks
      • Social networks have emergent properties. Emergent properties are new attributes of a whole that arise from the interaction and interconnection of the parts
      • Emotions, health and sexual relationships depend on our connections (e.g. their number) and on our position and structure in the social graph
      • Central (hub)
      • Outlier
      • Transitivity (connections between friends)
  12. Examples - Science
      Xin Jin, Andrew Gallagher, Liangliang Cao, Jiebo Luo, and Jiawei Han. The wisdom of social multimedia: using Flickr for prediction and forecast. International Conference on Multimedia (MM '10). ACM.
      “…if you're more than 100 km away from the epicenter [of an earthquake] you can read about the quake on Twitter before it hits you…”
      Many Twitter examples at: What can Twitter tell us about the real world? Twitter and the Real World, CIKM'13 Tutorial, https://sites.google.com/site/twitterandtherealworld/home
  13. Examples - Science
  14. Example – News (Boston Marathon bombing - 2013)
      “Following the Boston Marathon bombings, one quarter of Americans reportedly looked to Facebook, Twitter and other social networking sites for information, according to The Pew Research Center. When the Boston Police Department posted its final “CAPTURED!!!” tweet of the manhunt, more than 140,000 people retweeted it.”
      “Authorities have recognized that one of the first places people go in events like this is to social media, to see what the crowd is saying about what to do next”
      “I have been following my friend's Facebook [account] who is near the scene and she is updating everyone before it even gets to the news”
  15. Example – Crisis – Humanitarian (Syria)
      Syria Tracker offers a crisis mapping system that uses crowdsourced text, photo and video reports and data mining techniques, forming a live map of the Syrian conflict since March 2011
      …stream of content-filtered media from news, social media (Twitter and Facebook) and official sources
  16. Citybeat: Visualizing the Social Media Pulse of the City
      Citybeat sources, monitors and analyzes hyper-local information from multiple social media platforms – http://thecitybeat.org
  17. Many other examples: smellymaps
      Smell-related words in geo-located social media, http://researchswinger.org/smellymaps/
  18. Be careful of correlation diagrams
  19. Conceptual Architecture
      Diagram labels: API Wrapper, Website Wrapper, Scheduler, CRAWLING, Visual Indexing, Near-duplicates, Text Indexing, INDEXING, Media Fetcher, SNA, Sentiment - Influence, Trends - Topics, MINING, Model Building, Concepts, Relevance, Diversity, Popularity, RANKING, Veracity, Crawling Specs, Sources, Interaction, Responsiveness, Aggregation, VISUALIZATION, Aesthetics
  20. Challenges – Content (Indexing - Mining)
      • Multi-modality: e.g. image + tags, video, audio
      • Rich social context: spatio-temporal, social connections, relations and social graph
      • Specific messages: short, conversations, errors, no context
      • Inconsistent quality: noise, spam, fake, propaganda
      • Huge volume: massively produced and disseminated
      • Multi-source: may be generated by different applications and user communities
      • Dynamic: fast updates, real-time
  21. Policy – Licensing – Legal challenges
      • Fragmented access to data
        – Separate wrappers/APIs for each source (Twitter, Facebook, etc.)
        – Different data collection/crawling policies
      • Limitations imposed by API providers (“Walled Gardens”)
      • Full access to data impossible or extremely expensive (e.g. see data licensing plans for GNIP and DataSift)
      • Non-transparent data access practices (e.g. access is provided to an organization/person if they have a contact in Twitter)
      • Constant change of model and ToS of social APIs
        – No backwards compatibility, additional development costs
      • Ephemeral nature of content
      • Social search results often lead to removed content → inconsistent and unreliable referencing
      • User Privacy & Purpose of use
      • Fuzzy regulatory framework regarding mining user-contributed data
  22. Real-world Events in Social Media Platforms
  23. Large-scale real world events (1)
      • Long-running events → consist of several sub-events, e.g. the 10 days of the Sundance Film Festival include opening and awards ceremonies, screenings, etc.
      • Many of the people involved use social media → huge amount of event-related micro-blogging messages
      • A growing number of these messages carry multimedia content
        – An image in a micro-post can convey a much better impression of a specific moment of the ongoing event
  24. Large-scale real world events (2)
      • #nbafinals → 2.6M tweets in one month
      • #BaltimoreRiots, 29 April-2 May 2015 → 1.3M tweets in 5 days
      • E3 conference 2015, 16-18 June: >5M tweets before the conference, 2M tweets during the conference; new game releases → multimedia content
  25. Social Event Detection (Discovery)
      - Detection of social events within social media collections
      - Usually multimedia content
      SED can be seen as a clustering problem
      Different event types, e.g. news, personal events, entertainment, etc.
      Different characteristics of each type
      Related problems
      • Retrieval of events, e.g. find all music events and associated photos that took place in Canada in 2014
      • Classify events and associated photos to event types
  26. Real world events monitoring (1)
      But…
      • many non-event messages, photos, etc.
      • the huge number of messages makes it very challenging for interested users to monitor the evolution of the event
      • many messages can be considered spam or non-informative
      • in the case of multimedia: internet memes, screenshots, images of low quality…
      • redundancy due to near-duplicate messages and images
  27. Real world events monitoring (2)
      #nbafinals examples: irrelevant; duplicates with no explicit association; non-informative
  28. Visual Event Summarization
      Event-related collection is available
      Visual Event Summarization is the problem of selecting a concise set of images that are highly relevant to the event and visually capture its key aspects.
      List of all event images → Event-based Visual Summarizer → Set of selected representative and diverse images
  29. Incremental Event Detection and Summarization
      • New messages and photos available every single moment
        – 2 million new images per day in Flickr (2015)
        – 500 million new tweets per day (2016)
      • Track evolution of events
        – Events emerge, evolve, disappear
      • Detect events incrementally, e.g. per hour, day
        – Use of a sliding time window → detect events per time window
        – Linkage techniques to associate events from successive time windows
      • Updated summarization
        – Summarize to notify users only about new information for an event
        – Summarize per time window given what the user has already seen in the previous ones
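The windowing and linkage steps on this slide can be sketched as follows. This is a minimal illustration, not the method used in the talk: it assumes consecutive, non-overlapping windows (the simplest case of a sliding window), represents each detected event as a set of terms, and links events across windows by Jaccard overlap. The function names and the `min_overlap` threshold are hypothetical.

```python
from collections import defaultdict

def bucket_by_window(messages, window_seconds=3600):
    """Group (timestamp, text) pairs into consecutive fixed-length windows."""
    windows = defaultdict(list)
    for ts, text in messages:
        windows[int(ts // window_seconds)].append(text)
    return dict(sorted(windows.items()))

def link_events(prev_events, curr_events, min_overlap=0.5):
    """Link events (sets of terms) from successive windows by Jaccard overlap."""
    links = []
    for i, prev in enumerate(prev_events):
        for j, curr in enumerate(curr_events):
            union = prev | curr
            jaccard = len(prev & curr) / len(union) if union else 0.0
            if jaccard >= min_overlap:
                links.append((i, j))
    return links

windows = bucket_by_window([(10, "a"), (3599, "b"), (3600, "c")])
links = link_events([{"nba", "finals"}], [{"nba", "finals", "game"}, {"riots"}])
```

Event detection itself would run once per window; only the linkage output is needed to track how an event evolves.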
  30. Social Event Detection (SED)
      SED challenge @ MediaEval workshop
  31. SED @ MediaEval Workshop
      Year | Challenge | Dataset
      2011 | Find events related to two categories: (a) soccer matches in Barcelona & Rome, (b) concerts in Paradiso & Parc del Forum | 73,645 Flickr photos from five cities, May 2009
      2012 | Find events related to three categories: (a) technical events (e.g. exhibitions) in Germany, (b) soccer events in Hamburg and Madrid, (c) Indignados movement events in Madrid | 167,332 Flickr photos from five cities, 2009-2011
      2013 | (a) Cluster photo collections into events, (b) attach YouTube videos to the discovered events | 437,370 Flickr photos around Upcoming or Last.fm events, 2006-2012, and 1,327 YouTube videos around the events defined by the photos
      2013 | Categorize photos into eight event types or non-event: concerts, conferences, exhibitions, fashion shows, sports, protests, theatrical/dance events, other |
      2014 | (a) Cluster photo collections into events, (b) attach YouTube videos to the discovered events | 367,578 Flickr photos clustered in 17,834 social events, 110,541 unclustered photos
      2014 | Retrieve events according to specific search criteria, e.g. location, event type, involved entities, etc. |
  32. SED using “Same Event” model (1)
      Multimodal “Same-Event” Model
      • Adopt an item-item approach (message to message)
      • Represent messages using k features/modalities
        – Textual content (tf·idf), visual content (VLAD+SURF), temporal information, contributor, location, etc.
      • Calculate a distance vector v(i,j) between messages i, j, based on each modality
        – Different distance functions per modality, e.g. cosine for text, haversine distance for location, etc.
      • Predict (e.g. using SVM) whether two messages m_i and m_j belong to the same event according to the calculated distance vector v(i,j)
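A minimal sketch of the per-modality distance vector v(i,j) described above. The message layout (`text`, `time`, `loc`, `user` keys), the choice of four modalities and the hour-based time distance are assumptions made for illustration; the visual modality is left out, and plain term weights stand in for tf·idf.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (mean Earth radius 6371 km)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(min(1.0, h)))

def distance_vector(m_i, m_j):
    """One distance per modality: text, time (hours), location (km), contributor."""
    return [
        cosine_distance(m_i["text"], m_j["text"]),
        abs(m_i["time"] - m_j["time"]) / 3600.0,
        haversine_km(*m_i["loc"], *m_j["loc"]),
        0.0 if m_i["user"] == m_j["user"] else 1.0,
    ]
```

Vectors like this, labelled as same-event or different-event pairs, would then be the training input for a binary classifier such as the SVM the slide mentions.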
  33. SED using “Same Event” model (2)
  34. Large-scale graph-based clustering
      • Problem: discover structure in large-scale datasets by exploiting their relations
      • Challenges - Approach:
        – Large-scale
        – Fast response times
        – Efficient memory usage
        – Noise resilient
        – Number of clusters not known
      • Structural similarity + local expansion community detection techniques
  35. Large-scale graph-based clustering
      • Structural similarity + local expansion (highly efficient and scalable approach)
      • Not necessary to know the number of clusters
      • Noise resilient (not all nodes need to be part of a community)
      • Generic approach adaptable to many applications (depending on node – edge representation)
      S. Papadopoulos, Y. Kompatsiaris, A. Vakali. “A Graph-based Clustering Scheme for Identifying Related Tags in Folksonomies”. In Proceedings of DaWaK'10, Springer-Verlag, 65-76
  36. Pre-processing / Filtering
      Text-based filtering
      • heuristic rules for spam filtering → discard very short messages and messages with many mentions, URLs or hashtags
      • filtering of unstructured messages using POS tagging
        Accept → (determiner? adjective* noun+ verb)+
      Visual-based filtering of messages with multimedia content
      • discard small images, images of low quality, etc.
      • detect and discard memes, screenshots and images containing heavy text
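The two text filters can be sketched as below. The thresholds (`min_tokens`, `max_entities`), the tag names and the decision to require the whole tag sequence to match the pattern are assumptions for illustration; the slide gives only the rules and the pattern. POS tags are taken as input, so any tagger can be plugged in.

```python
import re

# One letter per tag so the slide's pattern becomes a plain regular expression.
TAG_CODES = {"DET": "D", "ADJ": "A", "NOUN": "N", "VERB": "V"}
POS_PATTERN = re.compile(r"(?:D?A*N+V)+")  # (determiner? adjective* noun+ verb)+

def is_structured(tags):
    """Accept a message whose full POS tag sequence matches the pattern."""
    encoded = "".join(TAG_CODES.get(tag, "X") for tag in tags)
    return POS_PATTERN.fullmatch(encoded) is not None

def passes_heuristics(text, min_tokens=4, max_entities=3):
    """Discard very short messages and those crowded with mentions, hashtags or URLs."""
    tokens = text.split()
    if len(tokens) < min_tokens:
        return False
    entities = sum(1 for t in tokens if t.startswith(("@", "#", "http://", "https://")))
    return entities <= max_entities
```

A looser variant would use `search` instead of `fullmatch` to accept messages that merely contain one well-formed clause.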
  37. Pre-processing / Filtering
      Figure labels: Text-based filtering, Visual-based filtering, Tweet length, POS tagging filtering
  38. “Same Event” Graph Creation (G_SE)
      • For each message (image) get candidate messages (images)
      • Calculate the “Same Event” score only for the candidate sub-list
      • Add edges for pairs with high score (thresholding)
      • Messages from the same event form dense sub-graphs
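The graph creation steps above can be sketched as follows, assuming the candidate lookup and the “Same Event” score are supplied as functions. In this toy usage, candidates are messages sharing at least one tag and the score is a Jaccard overlap standing in for the trained model's output; both are hypothetical stand-ins.

```python
def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_same_event_graph(items, candidates_of, score, threshold=0.5):
    """Undirected adjacency sets; an edge is added when the score reaches the threshold."""
    graph = {i: set() for i in items}
    for i in items:
        for j in candidates_of(i):  # score only the candidate sub-list, not all pairs
            if j != i and score(i, j) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph

tags = {1: {"nba", "finals"}, 2: {"nba", "finals", "game"}, 3: {"riots"}}
graph = build_same_event_graph(
    tags,
    candidates_of=lambda i: [j for j in tags if tags[i] & tags[j]],
    score=lambda i, j: jaccard(tags[i], tags[j]),
)
```

Restricting scoring to candidates keeps the cost well below the quadratic number of message pairs.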
  39. Event Detection
      • Apply Structural Clustering Algorithm for Networks (SCAN) → identify dense sub-graphs of messages in G_SE
      • Sub-graphs represent the events that exist in the stream of messages
      • A substantial number of messages are kept outside of the detected clusters: hubs & outliers
      Figure labels: Events, Hub
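A compact sketch of SCAN-style clustering, to make the roles of ε and μ concrete. It assumes an unweighted graph stored as adjacency sets and uses the structural similarity |Γ(u) ∩ Γ(v)| / sqrt(|Γ(u)| |Γ(v)|) over closed neighbourhoods. The defaults mirror the values reported in the Experimental Setting slide (ε = 0.65, μ = 2); everything else is illustrative rather than the implementation used in the talk.

```python
import math

def closed_neighbours(graph, v):
    return graph[v] | {v}

def structural_similarity(graph, u, v):
    gu, gv = closed_neighbours(graph, u), closed_neighbours(graph, v)
    return len(gu & gv) / math.sqrt(len(gu) * len(gv))

def eps_neighbourhood(graph, v, eps):
    return {w for w in closed_neighbours(graph, v)
            if structural_similarity(graph, v, w) >= eps}

def scan(graph, eps=0.65, mu=2):
    """Return {node: cluster_id}; nodes absent from the result are hubs or outliers."""
    labels = {}
    next_id = 0
    for v in graph:
        if v in labels or len(eps_neighbourhood(graph, v, eps)) < mu:
            continue
        labels[v] = next_id  # v is an unassigned core: grow a new cluster from it
        queue = [v]
        while queue:
            u = queue.pop()
            nbrs = eps_neighbourhood(graph, u, eps)
            if len(nbrs) < mu:
                continue  # border node: it joins the cluster but does not expand it
            for w in nbrs:
                if w not in labels:
                    labels[w] = next_id
                    queue.append(w)
        next_id += 1
    return labels
```

On two triangles joined through a single bridging node, the triangles come out as two clusters and the bridge stays unlabelled, which is the hub case shown on the slide.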
  40. Post Processing
      • Assign un-clustered images to events
        – Hubs: adjacent to multiple communities → assign to the community with the most connections if this number exceeds a threshold T_deg
        – Outliers: isolated messages in the graph → either form single-item events or discard them
      • Use classification techniques to detect event types for each detected event
      • Calculate a representation for each detected event
        – Find representative title, duration, location, etc.
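The hub assignment rule can be sketched as below, assuming the graph is a dict of adjacency sets and `labels` maps clustered nodes to cluster ids. The function name and the `t_deg` default are hypothetical; nodes that cannot be assigned are returned so the caller can turn them into single-item events or discard them.

```python
from collections import Counter

def assign_unclustered(graph, labels, t_deg=1):
    """Attach each unclustered node to its best-connected cluster if that count exceeds t_deg."""
    assigned = dict(labels)
    unassigned = []
    for v in graph:
        if v in labels:
            continue
        counts = Counter(labels[w] for w in graph[v] if w in labels)
        if counts:
            cluster, n_links = counts.most_common(1)[0]
            if n_links > t_deg:
                assigned[v] = cluster
                continue
        unassigned.append(v)
    return assigned, unassigned

graph = {"c": {"h"}, "d": {"h"}, "e": {"h"}, "h": {"c", "d", "e"}, "o": set()}
assigned, unassigned = assign_unclustered(graph, {"c": 0, "d": 1, "e": 1})
```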
  41. Results
      Better results in the 2014 challenge, but with the same approach
      • Fine-tuning of thresholding during graph creation
      • Advanced technique for the selection of negative/positive pairs in SEM training
      • CNN-based visual features for images
  42. Visual Event Summarization on Social Media using Topic Modelling and Graph-based Ranking Algorithms
  43. MGraph: Framework Overview
      1. create message multi-graph using textual, visual and temporal proximity
      2. find underlying topics using SCAN algorithm
      3. calculate prior scores of images based on topics and popularity (relevance)
      4. diversify using DivRank
  44. Multi-graph Generation (1)
      Given a set of (original) messages M = {m_1, m_2, ..., m_n} we construct a multi-graph G_M = {V, E_textual, E_visual, E_social, E_time}
      • vertex v_i ∈ V corresponds to message m_i
      • E_textual → undirected edges expressing the textual similarity (cosine similarity) between nodes (tf·idf vector v_m)
      • E_visual → undirected edges that represent the visual similarity (L2 distance) between nodes with images (VLAD+SURF vectors)
      Thresholding: add an edge in E_textual or E_visual only if the textual or visual similarity between the corresponding nodes is higher than th_textual or th_visual respectively
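A minimal sketch of the thresholded textual and visual edge sets. Several things here are assumptions: the message layout, the conversion of L2 distance into a similarity (1 - d²/2, which equals cosine similarity when vectors are unit length), and the all-pairs loop, which is only practical for small inputs (the Experimental Setting slide mentions k = 500 nearest neighbours instead). The default thresholds follow that slide: 0.5 for visual and 0.6 for textual similarity.

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def l2_similarity(a, b):
    """Similarity derived from squared L2 distance; assumes unit-length vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 - d2 / 2.0

def build_multigraph(messages, th_textual=0.6, th_visual=0.5):
    """Return one edge set per modality over message indices."""
    edges = {"textual": set(), "visual": set()}
    for (i, mi), (j, mj) in combinations(enumerate(messages), 2):
        if cosine_similarity(mi["text"], mj["text"]) > th_textual:
            edges["textual"].add((i, j))
        if mi.get("image") and mj.get("image"):
            if l2_similarity(mi["image"], mj["image"]) > th_visual:
                edges["visual"].add((i, j))
    return edges
```

Social and temporal edge sets would be added in the same way once their similarity functions are fixed.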
  45. Multi-graph Generation (2)
  46. Example multi-modal sub-graph
  47. Visual deduplication
      • Visual duplicates for which there is no explicit connection → apply Clique Percolation Method (CPM) on sub-graph G_visual = {V, E_visual}
      • Represent detected cliques as single messages:
        – VLAD aggregation on SURF descriptors of all images in the clique
        – mean value of publication time
        – aggregated value of reposts of each message
        – merged tf·idf vector
      • Replace clustered messages in G_M with cliques and re-calculate the corresponding edges
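Once cliques of near-duplicate images have been found, collapsing one into a single message might look like the sketch below. It assumes reposts are aggregated by summing and term vectors are merged by adding weights, neither of which the slide specifies; the visual re-aggregation step is omitted and the field names are hypothetical.

```python
from collections import Counter

def merge_clique(messages):
    """Collapse a clique of near-duplicate messages into one representative message."""
    merged_text = Counter()
    for m in messages:
        merged_text.update(m["text"])  # add term weights across the clique
    return {
        "members": [m["id"] for m in messages],
        "time": sum(m["time"] for m in messages) / len(messages),  # mean publication time
        "reposts": sum(m["reposts"] for m in messages),            # assumed: summed
        "text": dict(merged_text),
    }

merged = merge_clique([
    {"id": "t1", "time": 100, "reposts": 3, "text": {"nba": 1, "finals": 1}},
    {"id": "t2", "time": 200, "reposts": 5, "text": {"nba": 1, "game": 2}},
])
```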
  48. Visual deduplication
      Figure labels: G_M, G_visual
  49. Message Selection
      Score; Reposts (retweets); relevance × cluster size × specificity
  50. Specificity
      High specificity: rare across all topics of the event
      Low specificity: common across topics
  51. Image Ranking & Diversification
      variant of PageRank aiming at diversity
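To illustrate how a PageRank variant can favour diversity, here is a simplified pointwise sketch in the spirit of DivRank, where transitions are reinforced by the current score of the target node so that well-visited nodes absorb weight from their neighbours. It is an illustration under stated assumptions, not the implementation used in the talk: the graph is unweighted, `prior` must be positive and sum to 1, and the self-loop weight `alpha` is an assumed value. The damping factor default d = 0.75 follows the Experimental Setting slide.

```python
def divrank(graph, prior=None, d=0.75, alpha=0.25, iters=100):
    """Pointwise reinforced random walk over a dict of adjacency sets."""
    nodes = list(graph)
    prior = prior or {v: 1.0 / len(nodes) for v in nodes}
    # Organic transition probabilities p0(u, v), with a self-loop of weight 1 - alpha.
    p0 = {}
    for u in nodes:
        row = {u: 1.0 - alpha} if graph[u] else {u: 1.0}
        for v in graph[u]:
            row[v] = alpha / len(graph[u])
        p0[u] = row
    score = dict(prior)
    for _ in range(iters):
        new = {v: (1.0 - d) * prior[v] for v in nodes}
        for u in nodes:
            # Reinforcement: weight each transition by the target's current score.
            norm = sum(w * score[v] for v, w in p0[u].items())
            for v, w in p0[u].items():
                new[v] += d * score[u] * w * score[v] / norm
        score = new
    return score
```

Passing topic- and popularity-based relevance as `prior` would bias the ranking toward relevant images while the reinforcement step suppresses their near neighbours.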
  52. Dataset and Event Description • dataset of McMinn et al., containing more than 500 events from different domains • we used the 50 largest events in terms of tweets • sports events (e.g., the Sochi Winter Olympics), political events (Ukraine crisis, Venezuelan protests), disasters, etc. • 364,005 tweets, on average 4,730 tweets/event • 296,160 remaining tweets, due to suspended accounts and deleted messages • about 3.51% of these, i.e. 12,772 tweets, contain an embedded image
  53. Relevance Judgments: Each image is shown to 3 participants (20 img-20 part) without ranking information. Task Description: You are presented with an image and an event title describing a trending topic in Twitter. For each image and event title, you are asked to answer the following question: Is this image relevant to the event? 1. The image is clearly not relevant to the event. 2. The image is probably not relevant to the event, but I am not entirely sure. 3. The image is somewhat relevant to the event, but I have my doubts on whether I would like to see it in a photo coverage of the event. 4. The image is clearly relevant to the event, and I would like to see it in a photo coverage of the event.
  54. Experimental Setting • VLAD+SURF extraction – 64-dimensional SURF descriptors – four codebooks of 128 visual words (512 in total) to quantize each descriptor – aggregate SURF descriptors into a single vector of 64*512 = 32,768 dimensions using the VLAD scheme – PCA to create a 1024-dimensional L2-normalized reduced vector that represents the visual content of the image • Multi-graph generation – k = 500 nearest neighbors – visual and textual similarity thresholds were set to 0.5 and 0.6 – σ² of the temporal kernel was empirically set to 24 hours • SCAN parameters were set to μ=2 and ε=0.65 • DivRank's damping factor was set to d=0.75
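The VLAD aggregation step can be illustrated with a small sketch. It uses a single codebook and tiny 2-dimensional toy descriptors, whereas the slides describe four codebooks of 128 words over 64-dimensional SURF descriptors. The function name and data are hypothetical, and the PCA reduction is omitted.

```python
import math


def vlad(descriptors, codebook):
    """Aggregate local descriptors into one L2-normalised VLAD vector.

    descriptors: list of equal-length float lists (e.g. SURF descriptors)
    codebook:    list of centroids (visual words) of the same length
    """
    dim = len(codebook[0])
    residuals = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        # Assign the descriptor to its nearest visual word (squared L2 distance).
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])),
        )
        # Accumulate the residual between the descriptor and that centroid.
        for i in range(dim):
            residuals[nearest][i] += d[i] - codebook[nearest][i]
    # Concatenate the per-word residual sums and L2-normalise the result.
    flat = [x for row in residuals for x in row]
    norm = math.sqrt(sum(x * x for x in flat))
    return [x / norm for x in flat] if norm else flat


# Toy example: 2 visual words, 2-dimensional descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
vector = vlad(descriptors, codebook)
```

The output length is always (descriptor dimension) × (number of visual words), which is where the 64 × 512 = 32,768 figure on the slide comes from.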
  55. Evaluation metrics (1): Precision-oriented metrics • Precision (P@N): The percentage of images among the top N that are relevant (answers 3 & 4) to the corresponding event, averaged over all events. We calculate precision for N equal to 1, 5, and 10. • Success (S@N): Percentage of events where there exists at least one relevant image among the top N returned, for N=10. • Mean Reciprocal Rank (MRR): Computed as 1/r, where r is the rank of the first relevant image returned, averaged over all events.
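The three precision-oriented metrics are simple enough to sketch directly. The input format is an assumption: each event is a ranked list of booleans, where `True` means the image was judged relevant (answers 3 or 4). How the three participants' answers are combined into one judgment is not stated on the slide and is not modelled here.

```python
def precision_at_n(relevance, n):
    """Fraction of the top-n images judged relevant (relevance is a list of bools)."""
    return sum(relevance[:n]) / n


def success_at_n(relevance, n):
    """1.0 if at least one of the top-n images is relevant, else 0.0."""
    return 1.0 if any(relevance[:n]) else 0.0


def reciprocal_rank(relevance):
    """1/r for the rank r of the first relevant image, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def average_over_events(metric, events, *args):
    """Average a per-event metric over all events."""
    return sum(metric(rel, *args) for rel in events) / len(events)


# Two toy events (hypothetical judgments), each with five ranked images.
events = [
    [True, False, True, True, False],
    [False, True, False, False, False],
]
p_at_5 = average_over_events(precision_at_n, events, 5)
s_at_5 = average_over_events(success_at_n, events, 5)
mrr = average_over_events(reciprocal_rank, events)
```

For this toy data, P@5 averages 3/5 and 1/5, S@5 is 1 for both events, and MRR averages 1 and 1/2.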
  56. Evaluation metrics (2): Diversity-oriented metrics • α-normalized Discounted Cumulative Gain: α-nDCG@N measures the usefulness, or gain, of the returned images based on their position in the summary (N=10). • Average Visual Similarity: AVS@N measures the average visual similarity among all pairs of images in the top N selected images, averaged over all events. Lower AVS values are preferable since they imply higher diversity in terms of visual content.
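AVS@N for a single event can be sketched as the mean pairwise similarity of the top N image vectors. Using cosine similarity here is an assumption, since the slide does not say which visual similarity function enters the metric. The vectors are toy values.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def avs_at_n(image_vectors, n):
    """Average similarity over all pairs among the top-n ranked image vectors."""
    pairs = list(combinations(image_vectors[:n], 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Toy ranking (hypothetical): the first two images are identical, the third differs.
ranked = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
avs_top3 = avs_at_n(ranked, 3)
avs_top2 = avs_at_n(ranked, 2)
```

The top-2 set contains a duplicate pair and gets the worst possible score, while adding a visually different third image lowers the average, which matches the slide's note that lower AVS means higher diversity.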
  57. Baselines • Random: randomly selects N images from the filtered set of images as the summary set • MostPopular: picks the N most popular images in terms of reposts • LexRank: uses the items graph GM, ranks the nodes using LexRank and selects the top N nodes that contain images • TopicBased: selects the N most relevant messages from the most significant topics (S_cov) (relevance, no specificity & diversity) • P-TWR: ranks images in descending order using the weighting scheme described in McParlane et al. (popularity) • S-TWR: groups the tweets of each event into sub-clusters and selects the highest ranked item of each cluster using the previous weighting scheme (specificity)
  58. Results (1) – Precision oriented metrics • MGraph outperforms all of the competing methods • Popularity-based approach performs well for P@1 but drops significantly for N=5,10 • LexRank and TopicBased approaches achieve lower but more steady results • First relevant in positions 1-2
  59. Results: Canada Team in #Sochi (figure panels: Popularity-based, S-TWR, MGraph)
  60. Results (2) – Diversity oriented metrics • MGraph achieves the best score for α-nDCG@10 • Best values of AVS achieved by S-TWR • The worst results in terms of AVS are obtained using LexRank
  61. Results (3): Performance of MGraph across different categories • Best P@10 measure is obtained for events about Science & Technology • The second best P@10 is obtained for events about Arts & Entertainment • Difficult to diversify • The best value of AVS is achieved for events about disasters & accidents, e.g., earthquakes
  62. Results (4): Impact of the damping factor d on P@10, S@5, MRR and α-nDCG@10 • The worst results for all metrics are obtained for d=0 (no re-ranking) • The best results are achieved for 0.7<d<0.8 • slight decrease for d>0.8 • more diverse → less relevant
  63. Incremental Large-Scale Event Summarization
  64. Yahoo-Flickr Event Summarization Task - Yahoo-Flickr Creative Commons Dataset - 99m images, 1m videos - Detect events and summarize each detected event - Open issue: how to evaluate? (figure labels: Graph-based Event Detection, Summarization framework)
  65. Yahoo-Flickr Event Summarization Task • Incremental approach: Use a sliding time window – Update “Same Event” graph with new images and discard the old ones • Detect events on a timeslot basis – Merge events in successive timeslots using structural overlap
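One way to picture the merging of events across successive timeslots is the sketch below. The slide does not define "structural overlap", so the Jaccard overlap of member image ids, the `min_overlap` threshold and the function names are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets of image ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_timeslot(tracked_events, new_events, min_overlap=0.5):
    """Merge events detected in a new timeslot into the events tracked so far.

    A new event is absorbed by the tracked event it overlaps most, provided
    the overlap reaches min_overlap; otherwise it starts a new tracked event.
    """
    tracked = [set(e) for e in tracked_events]
    fresh = []
    for new in new_events:
        new = set(new)
        best = max(tracked, key=lambda e: jaccard(e, new), default=None)
        if best is not None and jaccard(best, new) >= min_overlap:
            best |= new  # extend the existing event in place
        else:
            fresh.append(new)
    return tracked + fresh


# Toy timeslots (hypothetical image ids): the sliding window makes
# successive slots share images, which is what the overlap relies on.
tracked = [{1, 2, 3}, {10, 11}]
detected = [{2, 3, 4}, {20, 21}]
result = merge_timeslot(tracked, detected)
```

Here the first detected event shares two of four images with a tracked event and is merged into it, while the second has no overlap and becomes a new event.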
  66. Results: Use tf·idf retrieval scheme to get events relevant to specific topics, e.g. Olympics
  67. Search in detected events for conferences
  69. Technology Stack • Storage & Indexing – MongoDB → storage of social media items – SOLR → text indexing and retrieval of a) social media items, b) detected events • Visual Indexing – For visual feature extraction and indexing of the yfcc100m dataset → Elastic MapReduce (EMR) service of Amazon Web Services (AWS) – Berkeley DB for index structure (but any other key-value store can be considered, e.g. Redis) • Graph Handling – In-memory storage → graph DBs (neo4j) as future work • Processing – Storm for distributed stream processing (focused crawling, indexing etc.)
  70. References • Same Event Model – Petkos, Georgios, et al. "Graph-based multimodal clustering for social event detection in large collections of images." International Conference on Multimedia Modeling. Springer International Publishing, 2014. • Summarization – Schinas, Manos, et al. "Visual event summarization on social media using topic modelling and graph-based ranking algorithms." Proceedings of the 5th ACM on International Conference on Multimedia Retrieval. ACM, 2015. – Schinas, Manos, et al. "Multimodal graph-based event detection and summarization in social media streams." Proceedings of the 23rd ACM international conference on Multimedia. ACM, 2015. • MediaEval Social Event Detection – Petkos, Georgios, et al. "Social event detection at MediaEval: a three-year retrospect of tasks and results." ICMR 2014 Workshop on Social Events in Web Multimedia (SEWM). 2014. – Riga, Marina, et al. "CERTH@ MediaEval 2014 Social Event Detection Task." MediaEval. 2014.
  71. Contributions • Dr. Symeon Papadopoulos – Social network analysis, community detection, social media content mining and multimedia indexing and retrieval – http://mklab.iti.gr/people/papadop – Twitter: @sympap • Manos Schinas – Event detection in social media – manosetro@iti.gr
  72. Conclusion • Social media data useful in many applications – Challenge is to go from confirming existing and known correlations to prediction and decision-making • Many other challenges exist – Data availability (infrastructure, policies) – Verification – Personal data value (legal, ethical) – Discrimination and bias – Real-time and scalable approaches – Fusion of various modalities (content, social, temporal, location) • Events – Multimodal and graph-based helps – Evaluation is an open issue – Event prediction is an ongoing challenge
  73. Thank you for your attention! ikom@iti.gr http://mklab.iti.gr
