ECIR 2013 Keynote - Time for Events

  1. 1. time for events: telling the world’s stories from social media. Mor Naaman, Rutgers SC&I & Mahaya, Inc. @informor
  2. 2. enter: social media
  3. 3. (JCDL 2007)
  4. 4. (JCDL 2007)
  5. 5. (SIGIR 2007) yes.
  6. 6. organize the world’s memories
  7. 7. people, together
  8. 8. BYOBW
  9. 9. outside lands festival
  10. 10. organize the world’s memories
  11. 11. objectives: detect, identify, organize
  12. 12. objectives: detect (ICWSM 2011a, JASIST 2011, WebDB 2009, SIGIR 2007)
  13. 13. objectives: identify (WSDM 2012, ICWSM 2011b, WSDM 2010)
  14. 14. objectives: organize (ICMR 2012, CHI 2012, CSCW 2012, MTAP 2012, VAST 2010, WWW 2009)
  15. 15. today: organize, identify, detect; Vox!, multi-site, Multiplayer
  16. 16. overview: Multi-site content, Vox Civitas, Multiplayer
  17. 17. goal: effectively retrieve social media content for known events from multiple services [with Hila Becker, Luis Gravano]
  18. 18. E
  19. 19. challenges: event descriptor not well-formed; brief textual descriptors; noise; formats/conventions/metadata differ
  20. 20. approach: two-step query formulation (precision-based, recall-based); validate queries based on known/extracted event model
  21. 21. step 1: term extraction from event descriptors generates “high precision” queries, e.g. “andrew bird, opening gala, celebrate brooklyn, prospect park”
  22. 22. step 2: use “high precision” corpus to generate more general queries to improve recall, e.g. “andrew bird concert”, “state farm insurance”
  23. 23. recall-oriented queries. Benefits: works cross-site; works with short content. Challenges: introduces noise; potentially large set of queries
  24. 24. post-filtering: use known event model (topics, time, location); use queries with a result set that matches known model
  25. 25. for example... [figure: time series from 6/7/11 to 6/13/11 for the queries [andrew bird concert] and [state farm insurance]; vertical axis 0 to 120]
  26. 26. evaluation: query generation; relevance of retrieved documents [figure: NDCG and Precision against Number of Documents k; series Twitter-MS and YouTube-MS]
  27. 27. takeaways: can aggregate content fragmented across platforms; improve recall, not rely on site-specific features
  28. 28. overview: Multi-site content (WSDM 2012), Vox Civitas, Multiplayer
  29. 29. research questions: can Twitter content around broadcast news events inform journalistic inquiry? what insights and analyses can we enable through visual analytic tools? [with postdoctoral fellow Nick Diakopoulos]
  30. 30. supporting analysis: direct attention to relevant information; automatic content analysis for filtering: relevance, uniqueness / novelty, sentiment, keyword extraction
  31. 31. how to evaluate? directly evaluate the output of the algorithms (quantitative); deep, extensive evaluation of users’ interaction with the system (qualitative). read more: Olsen (UIST ’07), Naaman (MTAP ’12)
  32. 32. Vox evaluation goals: • How effective for generating story ideas? • What kind of insights/analysis are supported? • Shortcomings and how features are used?
  33. 33. takeaways: can extract reliable event structure from social media
  34. 34. overview: Multi-site content, Vox Civitas (VAST 2010), Multiplayer
  35. 35. what the hell? [with: Lyndon Kennedy, Dan Ellis, Kai Su]
  36. 36. supporting analysis: extract the signal from people’s attention: find overlapping moments; compute and rank scenes; extract scene descriptors
  37. 37. audio fingerprinting Wang et al. (ISMIR ’03)
  38. 38. two clips, aligned [figure: timeline labels 0:00, 0:18, 3:32 and 0:00, 2:32]
  39. 39. a story of n clips [figure: clips along a time axis]
  40. 40. from clips to scenes: Higher Ground, Encore, Happy Birthday, Birthday [figure: scenes along a time axis]
  41. 41. evaluation: quantitative: evaluated matching, scene extraction…; qualitative: evaluated deployment scenario/task
  42. 42. takeaways: can create an event presentation that gets better the more content is added
  43. 43. overview: Multi-site content, Vox Civitas, Multiplayer (NM&S 2012, ICMR 2012, MTAP 2012, WWW 2009)
  44. 44. towards better models of large-scale human attention
  45. 45. printing press
  46. 46. è knowledge archive
  47. 47. digital documents
  48. 48. èdigital archive
  49. 49. the web
  50. 50. ènetworked archive
  51. 51. social media
  52. 52. èexperience archive
  53. 53. new methods?
  54. 54. search by subject code?
  55. 55. explore. new information seeking tasks (and models); new applications for social media content
  56. 56. explore. beyond real-time; personal and social
  57. 57. questions? mor@rutgers.edu @informor http://mornaaman.com
  58. 58. thanks: Luis Gravano, Hila Becker, Nick Diakopoulos, Kai Su, Dan Ellis, Munmun de Choudhury, Tarikh Korula…
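
The two-step query formulation and post-filtering from slides 20 to 24 can be sketched roughly as follows. This is a minimal illustration and not the method from the cited WSDM 2012 work. The toy posts, the event day of 6/10/11, the trigram-based candidate generation, the stopword list, and the time-window threshold are all assumptions made for the example, and `search` is a hypothetical stand-in for a site search API.

```python
from collections import Counter
from datetime import date

STOPWORDS = {"a", "at", "by", "in", "the"}

# Toy corpus of (day, text) posts; every entry is invented for illustration.
POSTS = [
    (date(2011, 6, 10), "andrew bird opening gala at celebrate brooklyn in prospect park"),
    (date(2011, 6, 10), "andrew bird concert at prospect park presented by state farm insurance"),
    (date(2011, 6, 11), "andrew bird concert in prospect park thanks state farm insurance"),
    (date(2011, 6, 10), "amazing andrew bird concert"),
    (date(2011, 6, 11), "still thinking about the andrew bird concert"),
    (date(2011, 6, 7), "state farm insurance raised my rates"),
    (date(2011, 6, 8), "need a quote from state farm insurance"),
    (date(2011, 6, 13), "state farm insurance commercial again"),
]


def search(query, posts):
    """Return posts containing every query term (hypothetical search stand-in)."""
    terms = query.split()
    return [(day, text) for day, text in posts
            if all(term in text.split() for term in terms)]


def recall_candidates(corpus, min_count=2):
    """Step 2: propose broader queries from trigrams frequent in the corpus."""
    counts = Counter()
    for _, text in corpus:
        words = text.split()
        # Count each trigram once per post, skipping any with a stopword.
        trigrams = {" ".join(words[i:i + 3]) for i in range(len(words) - 2)}
        counts.update(t for t in trigrams if not STOPWORDS & set(t.split()))
    return sorted(q for q, n in counts.items() if n >= min_count)


def post_filter(queries, posts, event_day, window_days=1, min_in_window=0.5):
    """Keep queries whose results mostly fall near the known event time."""
    kept = []
    for query in queries:
        results = search(query, posts)
        near = [d for d, _ in results if abs((d - event_day).days) <= window_days]
        if results and len(near) / len(results) >= min_in_window:
            kept.append(query)
    return kept


# Step 1: a high-precision query built from event descriptor terms.
high_precision = search("andrew bird prospect park", POSTS)
candidates = recall_candidates(high_precision)
kept = post_filter(candidates, POSTS, event_day=date(2011, 6, 10))
print(candidates)
print(kept)
```

In this toy data both "andrew bird concert" and "state farm insurance" come out of step 2, but only the first survives post-filtering, because most "state farm insurance" results fall outside the assumed event window. Only the time component of the event model is checked here; topics and location are left out to keep the sketch short.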