ECIR 2013 Keynote - Time for Events
Transcript

  • 1. time for events: telling the world’s stories from social media. Mor Naaman, Rutgers SC&I & Mahaya, Inc. @informor
  • 2. enter: social media
  • 3. (JCDL 2007)
  • 4. (JCDL 2007)
  • 5. (SIGIR 2007) yes.
  • 6. organize the world’s memories
  • 7. people, together
  • 8. BYOBW
  • 9. outside lands festival
  • 10. organize the world’s memories
  • 11. objectives: detect, identify, organize
  • 12. objectives: detect (ICWSM 2011a, JASIST 2011, WebDB 2009, SIGIR 2007)
  • 13. objectives: identify (WSDM 2012, ICWSM 2011b, WSDM 2010)
  • 14. objectives: organize (ICMR 2012, CHI 2012, CSCW 2012, MTAP 2012, VAST 2010, WWW 2009)
  • 15. today: organize, identify, detect; Vox!, multi-site, Multiplayer
  • 16. overview: E Multi-site content; Vox Civitas; Multiplayer
  • 17. goal: effectively retrieve social media content for known events from multiple services [with Hila Becker, Luis Gravano]
  • 18. E
  • 19. challenges: event descriptor not well-formed; brief textual descriptors; noise; formats/conventions/metadata differ
  • 20. approach: two-step query formulation (precision-based, recall-based); validate queries based on known/extracted event model
  • 21. step 1: term extraction from event descriptors generates “high precision” queries, e.g. “andrew bird, opening gala, celebrate brooklyn, prospect park”
  • 22. step 2: use “high precision” corpus to generate more general queries to improve recall, e.g. “andrew bird concert”, “state farm insurance”
  • 23. recall-oriented queries. Benefits: works cross-site; works with short content. Challenges: introduces noise; potentially large set of queries
  • 24. post-filtering: use known event model (topics, time, location); use queries with a result set that matches known model
  • 25. E Efor example...120"100" 80" 60" 40" 20" 0" 6/7/11" 6/8/11" 6/9/11" 6/10/11" 6/11/11" 6/12/11" 6/13/11" [andrew"bird"concert]" [state"farm"insurance]"
  • 26. Eevaluation 1.1"query generation4" 1" 4" 0.9" 0.8" 5" 5" Precision" 0.7"relevance of36"retrieved documentsNDCG% 0.6" 39" 34" 34" Twi7er8MS" 0.5" 0.4" 0.3" YouTube8MS" 7" 0.2" 9" 8" 8" 0.1" 0" 0" 5" 10" 15" 20" 25" Number%of%Documents%k%
  • 27. takeaways: can aggregate content fragmented across platforms; improve recall, not rely on site-specific features
  • 28. overview: E Multi-site content (WSDM 2012); Vox Civitas; Multiplayer
  • 29. research questions: can Twitter content around broadcast news events inform journalistic inquiry? what insights and analyses can we enable through visual analytic tools? [with postdoctoral fellow Nick Diakopoulos]
  • 30. supporting analysis: direct attention to relevant information; automatic content analysis for filtering (relevance, uniqueness / novelty, sentiment, keyword extraction)
  • 31. how to evaluate? directly evaluate the output of the algorithms (quantitative); deep, extensive evaluation of users’ interaction with the system (qualitative). read more: Olsen (UIST ’07), Naaman (MTAP ’12)
  • 32. Vox evaluation goals: How effective for generating story ideas? What kind of insights/analysis are supported? Shortcomings and how features are used?
  • 33. takeaways: can extract reliable event structure from social media
  • 34. overview: E Multi-site content; Vox Civitas (VAST 2010); Multiplayer
  • 35. what the hell? [with: Lyndon Kennedy, Dan Ellis, Kai Su]
  • 36. supporting analysis: extract the signal from people’s attention: find overlapping moments; compute and rank scenes; extract scene descriptors
  • 37. audio fingerprinting Wang et al. (ISMIR ’03)
  • 38. two clips, aligned [timeline labels: 0:18, 3:32, 0:00, 0:00, 2:32]
  • 39. a story of n clips [axis: time]
  • 40. from clips to scenes [scene labels: Higher Ground, Encore, Happy Birthday, Birthday; axis: time]
  • 41. evaluation: quantitative: evaluated matching, scene extraction…; qualitative: evaluated deployment scenario/task
  • 42. takeaways: can create an event presentation that gets better the more content is added
  • 43. overview: E Multi-site content; Vox Civitas; Multiplayer (NM&S 2012, ICMR 2012, MTAP 2012, WWW 2009)
  • 44. towards better models of large-scale human attention
  • 45. printing press
  • 46. → knowledge archive
  • 47. digital documents
  • 48. → digital archive
  • 49. the web
  • 50. → networked archive
  • 51. social media
  • 52. → experience archive
  • 53. new methods?
  • 54. search by subject code?
  • 55. explore. new information seeking tasks (and models); new applications for social media content
  • 56. explore. beyond real-time; personal and social
  • 57. questions? mor@rutgers.edu @informor http://mornaaman.com
  • 58. thanks: Luis Gravano, Hila Becker, Nick Diakopoulos, Kai Su, Dan Ellis, Munmun de Choudhury, Tarikh Korula…
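
The two-step query formulation and post-filtering outlined in slides 20 to 24 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method from the talk: the function names, the descriptor fields, the bigram heuristic for step 2, the one-day window, and the 0.5 threshold are all hypothetical, and the event date and sample texts are made-up toy data.

```python
from collections import Counter
from datetime import date, timedelta


def precision_queries(descriptor):
    """Step 1 (assumed form): pair the event title with each other text field."""
    title = descriptor["title"].lower()
    others = [v.lower() for k, v in descriptor.items() if k not in ("title", "date")]
    return [f'"{title}" "{other}"' for other in others]


def recall_queries(corpus, top_n=3):
    """Step 2 (assumed heuristic): most frequent word bigrams in the
    documents retrieved by the high-precision queries."""
    counts = Counter()
    for text in corpus:
        words = text.lower().split()
        counts.update(zip(words, words[1:]))
    return [" ".join(bigram) for bigram, _ in counts.most_common(top_n)]


def matches_event_time(result_dates, event_date, window_days=1, min_fraction=0.5):
    """Post-filter (time only): keep a query if enough of its results
    fall within window_days of the known event date."""
    if not result_dates:
        return False
    near = sum(abs((d - event_date).days) <= window_days for d in result_dates)
    return near / len(result_dates) >= min_fraction


# Toy data; the date is hypothetical, chosen inside the 6/7/11 to 6/13/11 range.
event = {
    "title": "Andrew Bird",
    "venue": "Prospect Park",
    "series": "Celebrate Brooklyn",
    "date": date(2011, 6, 10),
}
high_precision_corpus = [
    "andrew bird concert tonight",
    "great andrew bird concert",
    "andrew bird concert rocks",
]
step1 = precision_queries(event)
step2 = recall_queries(high_precision_corpus, top_n=2)

# A query whose results cluster around the event date passes the filter;
# one whose results are spread evenly over the week does not.
clustered = [event["date"], event["date"], event["date"] + timedelta(days=1),
             event["date"] - timedelta(days=3)]
spread = [event["date"] + timedelta(days=k) for k in range(-3, 4)]
keep_clustered = matches_event_time(clustered, event["date"])
keep_spread = matches_event_time(spread, event["date"])
```

The filter here checks only the time dimension of the event model; topic and location checks would follow the same pattern, comparing a query's result set against what is known about the event.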