ECIR 2013 Keynote - Time for Events

  1. time for events: telling the world’s stories from social media. Mor Naaman, Rutgers SC&I & Mahaya, Inc. @informor
  2. enter: social media
  3. (JCDL 2007)
  4. (JCDL 2007)
  5. (SIGIR 2007) yes.
  6. organize the world’s memories
  7. people, together
  8. BYOBW
  9. outside lands festival
  10. organize the world’s memories
  11. objectives: detect, identify, organize
  12. objectives: detect (ICWSM 2011a, JASIST 2011, WebDB 2009, SIGIR 2007)
  13. objectives: identify (WSDM 2012, ICWSM 2011b, WSDM 2010)
  14. objectives: organize (ICMR 2012, CHI 2012, CSCW 2012, MTAP 2012, VAST 2010, WWW 2009)
  15. today: organize, identify, detect. Vox!, multi-site, Multiplayer
  16. overview: Multi-site content, Vox Civitas, Multiplayer
  17. goal: effectively retrieve social media content for known events from multiple services [with Hila Becker, Luis Gravano]
  18. E
  19. challenges: event descriptor not well-formed; brief textual descriptors; noise; formats/conventions/metadata differ
  20. approach: two-step query formulation (precision-based, recall-based); validate queries based on known/extracted event model
  21. step 1: term extraction from event descriptors generates “high precision” queries, e.g. “andrew bird, opening gala, celebrate brooklyn, prospect park”
  22. step 2: use “high precision” corpus to generate more general queries to improve recall, e.g. “andrew bird concert”, “state farm insurance”
  23. recall-oriented queries. Benefits: works cross-site; works with short content. Challenges: introduces noise; potentially large set of queries
  24. post-filtering: use known event model (topics, time, location); use queries with a result set that matches known model
  25. for example... [chart: y-axis 0 to 120, x-axis 6/7/11 to 6/13/11; series [andrew bird concert] and [state farm insurance]]
  26. evaluation: query generation; relevance of retrieved documents [chart: Precision and NDCG against Number of Documents k (0 to 25); series Twitter-MS and YouTube-MS]
  27. takeaways: can aggregate content fragmented across platforms; improve recall, not rely on site-specific features
  28. overview: Multi-site content (WSDM 2012), Vox Civitas, Multiplayer
  29. research questions: can Twitter content around broadcast news events inform journalistic inquiry? what insights and analyses can we enable through visual analytic tools? [with postdoctoral fellow Nick Diakopoulos]
  30. supporting analysis: direct attention to relevant information; automatic content analysis for filtering (relevance, uniqueness / novelty, sentiment, keyword extraction)
  31. how to evaluate? directly evaluate the output of the algorithms (quantitative); deep, extensive evaluation of users’ interaction with the system (qualitative). read more: Olsen (UIST ’07), Naaman (MTAP ’12)
  32. Vox evaluation goals: How effective for generating story ideas? What kind of insights/analysis are supported? Shortcomings and how features are used?
  33. takeaways: can extract reliable event structure from social media
  34. overview: Multi-site content, Vox Civitas (VAST 2010), Multiplayer
  35. what the hell? [with: Lyndon Kennedy, Dan Ellis, Kai Su]
  36. supporting analysis: extract the signal from people’s attention: find overlapping moments; compute and rank scenes; extract scene descriptors
  37. audio fingerprinting: Wang et al. (ISMIR ’03)
  38. two clips, aligned [timeline labels: 0:18, 3:32, 0:00, 0:00, 2:32]
  39. a story of n clips [axis: time]
  40. from clips to scenes [timeline labels: Higher Ground; Encore; Happy Birthday, Birthday; axis: time]
  41. evaluation: quantitative: evaluated matching, scene extraction…; qualitative: evaluated deployment scenario/task
  42. takeaways: can create an event presentation that gets better the more content is added
  43. overview: Multi-site content, Vox Civitas, Multiplayer (NM&S 2012, ICMR 2012, MTAP 2012, WWW 2009)
  44. towards better models of large-scale human attention
  45. printing press
  46. → knowledge archive
  47. digital documents
  48. → digital archive
  49. the web
  50. → networked archive
  51. social media
  52. → experience archive
  53. new methods?
  54. search by subject code?
  55. explore. new information seeking tasks (and models); new applications for social media content
  56. explore. beyond real-time; personal and social
  57. questions? mor@rutgers.edu @informor http://mornaaman.com
  58. thanks: Luis Gravano, Hila Becker, Nick Diakopoulos, Kai Su, Dan Ellis, Munmun de Choudhury, Tarikh Korula…
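
Slides 20 to 24 outline a two-step query formulation followed by post-filtering against a known event model. The slides give no implementation, so the following is only a minimal sketch of how the recall step and a time-based post-filter might look. The function names, the bigram heuristic, the one-day window, the 50% threshold, the event time, and all sample documents and timestamps are assumptions made for illustration; only the two query strings come from the slides.

```python
from collections import Counter
from datetime import datetime, timedelta


def recall_queries(precise_docs, top_n=3):
    """Step 2 (sketch): propose broader queries from the high-precision corpus.

    Assumption: the most frequent word bigrams stand in for whatever
    query-generation strategy the talk actually used.
    """
    counts = Counter()
    for text in precise_docs:
        words = text.lower().split()
        counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return [query for query, _ in counts.most_common(top_n)]


def matches_event_time(timestamps, event_time,
                       window=timedelta(days=1), min_share=0.5):
    """Post-filter (sketch): keep a query only if at least `min_share`
    of its results fall within `window` of the known event time."""
    if not timestamps:
        return False
    near = sum(1 for t in timestamps if abs(t - event_time) <= window)
    return near / len(timestamps) >= min_share


# Hypothetical documents returned by the high-precision queries of step 1.
docs = [
    "andrew bird concert tonight",
    "great andrew bird concert",
    "andrew bird concert at prospect park",
]
candidates = recall_queries(docs, top_n=2)

# Hypothetical event time and result timestamps for two candidate queries.
event_time = datetime(2011, 6, 10, 19)
results = {
    "andrew bird concert": [
        datetime(2011, 6, 10, 20),
        datetime(2011, 6, 10, 22),
        datetime(2011, 6, 11, 1),
    ],
    "state farm insurance": [
        datetime(2011, 6, 7, 9),
        datetime(2011, 6, 8, 9),
        datetime(2011, 6, 9, 9),
        datetime(2011, 6, 12, 9),
        datetime(2011, 6, 13, 9),
    ],
}
kept = [q for q, ts in results.items() if matches_event_time(ts, event_time)]
print(kept)  # ['andrew bird concert']
```

In this toy data the concert query clusters around the event time and is kept, while the sponsor query is spread across the week and is dropped. A fuller filter would presumably also compare topics and location, which slide 24 lists as parts of the event model.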
