Using Web Archives to Enrich the Live Web Experience Through Storytelling


This is my research, presented at the JCDL 2013 Doctoral Consortium.

Published in: Education, Technology

Using Web Archives to Enrich the Live Web Experience Through Storytelling

  1. 1. Using Web Archives to Enrich the Live Web Experience Through Storytelling Yasmin AlNoamany Computer Science Department Old Dominion University, Norfolk, VA yasmin@cs.odu.edu
  2. 2. Once upon a time: Every story is made up of a sequence of events. Events are exemplified through corresponding web pages from the live web and web archives, (semi-)automatically discovered, arranged in a narrative structure ordered by time, and replayed through an appropriate visualization interface.
  3. 3. The story of the Egyptian Revolution: Do you recall the relevant dates, people, and web pages to replay the story of the Egyptian Revolution?
  4. 4. MOTIVATION
      • There are three information needs:
        – Overview
        – Recency
        – Replaying the story
  9. 9. Replaying the story
  10. 10. Replaying the story: Not yet addressed
  11. 11. Here's my hand-crafted story about the Egyptian Revolution...
  12. 12. 02/19/2010
  13. 13. 06/10/2010
  14. 14. 06/25/2010
  15. 15. 07/23/2010
  16. 16. 01/22/2011: http://web.archive.org/web/20110707060758/http://www.thedailybeast.com/articles/2011/01/22/we-are-all-khaled-said-will-the-revolution-come-to-egypt.html
  17. 17. 01/25/2011
  18. 18. 01/28/2011
  19. 19. 01/28/2011
  20. 20. 01/28/2011
  21. 21. 01/29/2011
  22. 22. 01/31/2011
  23. 23. 01/31/2011
  24. 24. 02/01/2011
  25. 25. 02/02/2011
  26. 26. 02/10/2011
  27. 27. 02/10/2011
  28. 28. 02/11/2011
  29. 29. Replaying the past can be more compelling than a summary
  30. 30. What about other stories? Hurricane Katrina, Boston Marathon bombings
  31. 31. Current Capabilities
  32. 32. Google provides summary and recency of information
  33. 33. Adding date does not solve the problem
  34. 34. How do we know to include this page?
  35. 35. The current status is not relevant to the purpose of the page
  36. 36. News websites are hard to search
  37. 37. News websites are hard to search
  38. 38. News websites are hard to search
  39. 39. News websites are hard to search
  40. 40. Egyptian Revolution on Storify
  41. 41. Bookmarking, not preserving!
  42. 42. Archive-It collections
  43. 43. Seed URIs are gathered by people
  44. 44. Seed URIs are gathered by people
  45. 45. People gather ad hoc seed URIs
  50. 50. Research Question: Based on the terminology introduced earlier, how do we define the time frame of a story and the individual events that make it up; identify, evaluate, and select candidate (archived) web pages to support the events; and visualize the result?
  51. 51. Our Preliminary Work
  52. 52. JCDL 2012: Visualizations for Archive-It Collections
      • Archive-It collections are presented as a list view
        – Hard to explore
  53. 53. JCDL 2012: Visualizations for Archive-It Collections
      • Archive-It collections are presented as a list view
        – Hard to explore
      • We introduced a variety of visualizations for understanding the collections
  54. 54. JCDL 2013: How do people access web archives?
      • We identified four major web archive access patterns: Dip, Slide, Dive, Skim
      • The median length of the patterns is 3, which is very short!
        – We cannot rely on web archive logs to create seed URIs for stories
      • People do not know about web archives
        – Robots outnumber humans 10:1 in terms of sessions
  55. 55. TPDL 2013: Who and What Links to the Internet Archive
      • Reaching web archives is not easy
        – 82% of human sessions connect to the Wayback Machine via referrals
      • External referrers link to the web archives because they can't find the web pages on the live web
        – 83% of all the pages that have links from outside the archive do not currently exist on the live web
      • People come to web archives because they can’t find the web pages on the live web
        – 65% of the requested archived pages no longer exist on the live web
  56. 56. Dissertation Main Steps
  57. 57. How do we automatically construct this?
      var mementos = [
        {
          // …
          "Title": "Egypt braces for anti-government protests",
          "Image": "im001.png",
          "Snippet": "Egypt's authoritarian government is bracing itself …",
          "Memento-Datetime": "25 January 2011 2:47 a.m. EST",
          "URI": "http://web.archive.org/web/20110125024787/http://www.cnn.com/ ..."
        },
        {
          "Title": "Will Egypt follow Tunisia's lead?",
          "Image": "im002.png",
          "Snippet": "Demonstrators protest in central Cairo, Egypt, on Tuesday …",
          "Memento-Datetime": "25 January 2011 3:00 p.m. EST",
          "URI": "http://web.archive.org/web/20110125030042/http://www.cnn.com/ ..."
        },
        {
          "Title": "Obama to Mubarak: Deliver on vows",
          "Image": "im002.png",
          "Snippet": "President Obama spoke with Egypt's president moments after …",
          "Memento-Datetime": "28 January 2011 8:56 p.m. EST",
          "URI": "http://web.archive.org/web/20112801085658/http://www.cnn.com/..."
        },
        {
          "Title": "Without police, Cairo 'like Wild West",
          "Image": "im002.png",
          "Snippet": "With local police effectively no longer on the ground in …",
          "Memento-Datetime": "29 January 2011 3:00 p.m. EST",
          "URI": "http://web.archive.org/web/20110129030000/http://www.cnn.com/ ..."
        }
        // …
      ];
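
One possible way to build a single record of the `mementos` array is sketched below. It assumes the archiving datetime and the original URI are already known. The helpers `waybackTimestamp` and `makeMemento` and their parameter names are hypothetical, the input values are placeholders, and `Memento-Datetime` is written here in the HTTP date format rather than the free-form style shown on the slide.

```javascript
// Format a Date as the 14-digit UTC timestamp used in Wayback Machine URIs.
function waybackTimestamp(date) {
  return date.toISOString().replace(/[-:T]/g, "").slice(0, 14);
}

// Build one record with the same field names as the slide's `mementos` array.
// `makeMemento` is a hypothetical helper, not part of any existing tool.
function makeMemento({ title, image, snippet, archivedAt, originalUri }) {
  return {
    "Title": title,
    "Image": image,
    "Snippet": snippet,
    "Memento-Datetime": archivedAt.toUTCString(),
    "URI": `http://web.archive.org/web/${waybackTimestamp(archivedAt)}/${originalUri}`,
  };
}

// Illustrative input only; example.com stands in for a real news page.
const record = makeMemento({
  title: "Example headline",
  image: "im001.png",
  snippet: "Example snippet …",
  archivedAt: new Date(Date.UTC(2011, 0, 25, 7, 47, 0)),
  originalUri: "http://example.com/news",
});
console.log(record);
```

Deriving both the timestamp in the URI and the `Memento-Datetime` value from one `Date` object keeps the two fields from drifting apart.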
  58. 58. Usage Scenario: I want to replay the story of the Egyptian Revolution, but I don’t remember its date
  59. 59. Lori has her search box for replaying the story
  60. 60. Step 1: Calculating the datetime of the story dynamically
      • Problem
        – People remember the name of the story, but not the date
        – Some stories have no specific time. History is not static!
      • Methods
        – Use the time in Wikipedia infoboxes (Hoffart et al. 2012); this may not be applicable to all stories
        – Extract temporal expressions from unstructured text using time and event recognition algorithms (Kanhabua et al. 2012)
        – Look up the story in news sites, such as Wikinews
        – Investigate different techniques for finding the time range of the story
      • Evaluation
        – Use a gold standard dataset of stories that have specified datetimes
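
As a rough illustration of extracting temporal expressions from unstructured text, the sketch below pulls explicit calendar dates out of a passage with a regular expression and takes the earliest and latest as the story's time range. This is a deliberately naive stand-in, not the recognition algorithms cited on the slide; the function names and the sample sentence are assumptions made for the example.

```javascript
const MONTHS = ["january", "february", "march", "april", "may", "june",
  "july", "august", "september", "october", "november", "december"];

// Find dates written as "25 January 2011" or "January 25, 2011".
function extractDates(text) {
  const re = /\b(?:(\d{1,2})\s+([A-Za-z]+)|([A-Za-z]+)\s+(\d{1,2}),)\s+(\d{4})\b/g;
  const dates = [];
  for (const m of text.matchAll(re)) {
    const day = Number(m[1] || m[4]);
    const month = MONTHS.indexOf((m[2] || m[3]).toLowerCase());
    if (month === -1) continue; // the word before/after the day was not a month
    dates.push(new Date(Date.UTC(Number(m[5]), month, day)));
  }
  return dates;
}

// Earliest and latest extracted dates as YYYY-MM-DD, or null if none found.
function storyTimeRange(text) {
  const times = extractDates(text).map((d) => d.getTime());
  if (times.length === 0) return null;
  const iso = (t) => new Date(t).toISOString().slice(0, 10);
  return { start: iso(Math.min(...times)), end: iso(Math.max(...times)) };
}

// Illustrative text, not taken from any particular source page.
const sample = "Protests began on 25 January 2011. Mubarak stepped down on February 11, 2011.";
console.log(storyTimeRange(sample));
```

A real pipeline would also need to handle relative expressions ("last Tuesday") and stories with no single time range, which this sketch ignores.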
  61. 61. Step 2: Get the URI seeds for the story
      • Problem
        – Specifying the top K related pages automatically
      • Methods
        – Use the list of references on the Wikipedia page and check older versions of the page for URIs
        – Use Google’s API to search for the term of the story
        – Query the archives' full-text search (such as Archive-It and the UK Web Archive)
        – Use social media web sites (such as Storify, Twitter, and Topsy)
        – Determine information retrieval techniques for finding relevant links from the seeds
      • Evaluation
        – Test the aboutness of the web pages
          • Check the anchor text and its relation to the content. Furthermore, examine the existence of the URI on the live web and its aboutness by looking at the page content in the web archives
        – Check the coverage of Google search against the coverage of the links from social media and from the web archives
        – Contrast the cost with the quality (Google may have bias, but it may be faster)
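
The coverage comparison between seed sources could be approximated as follows. This sketch merges seed URIs from several sources after a light normalization and reports what fraction of the merged set each source contributes. The function names, the normalization rules, and the `example.*` URIs are all assumptions for illustration; no real search or archive API is called.

```javascript
// Normalize a URI so trivially different spellings compare equal.
function normalizeUri(uri) {
  const u = new URL(uri); // the URL parser lowercases the host
  u.hash = "";            // drop the fragment
  const s = u.toString();
  return s.endsWith("/") ? s.slice(0, -1) : s;
}

// Map each normalized seed URI to the set of sources that returned it.
function mergeSeeds(seedsBySource) {
  const merged = new Map();
  for (const [source, uris] of Object.entries(seedsBySource)) {
    for (const uri of uris) {
      const key = normalizeUri(uri);
      if (!merged.has(key)) merged.set(key, new Set());
      merged.get(key).add(source);
    }
  }
  return merged;
}

// Fraction of the merged seed set that each source covers.
function coverageBySource(seedsBySource) {
  const merged = mergeSeeds(seedsBySource);
  const coverage = {};
  for (const source of Object.keys(seedsBySource)) {
    let hits = 0;
    for (const sources of merged.values()) if (sources.has(source)) hits++;
    coverage[source] = merged.size === 0 ? 0 : hits / merged.size;
  }
  return coverage;
}

// Hypothetical seed lists; real ones would come from the sources named above.
const seedsBySource = {
  wikipedia: ["http://example.com/a", "http://example.org/b/"],
  search: ["http://EXAMPLE.com/a#top", "http://example.net/c"],
};
console.log(coverageBySource(seedsBySource));
```

URIs returned by more than one source (here `http://example.com/a`) may be reasonable first candidates for the top K, though that is a design choice rather than something the slides specify.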
  62. 62. Step 2: Different sources give us different URI seeds
  63. 63. Step 3: Determine datetimes of web pages
      • Problem
        – There are many notions of time for a web page (creation date, modification date, archiving date, etc.)
        – Hard to know the creation date
      • Methods
        – Compute and assign those notions of time to each web page
          • Estimating the date of a web page by looking at the pages that link to it
          • Carbon dating (SalahEldeen et al. 2013)
      • Evaluation
        – Use a gold standard dataset of articles that have clear timestamps to be extracted
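
Of the notions of time listed above, the archiving date is the easiest to recover, because a Wayback Machine URI carries it as a 14-digit timestamp. The sketch below splits such a URI into archiving datetime and original URI, then measures the gap to a publication date. It uses the Daily Beast memento URI from these slides; treating the date in that URI's path as the publication date is an assumption, and the function names are hypothetical.

```javascript
// Split a Wayback Machine URI into its archiving datetime and original URI.
function parseWaybackUri(uri) {
  const m = /^https?:\/\/web\.archive\.org\/web\/(\d{14})\/(.+)$/.exec(uri);
  if (!m) return null;
  const ts = m[1];
  const archived = new Date(Date.UTC(
    Number(ts.slice(0, 4)), Number(ts.slice(4, 6)) - 1, Number(ts.slice(6, 8)),
    Number(ts.slice(8, 10)), Number(ts.slice(10, 12)), Number(ts.slice(12, 14))
  ));
  return { archived, original: m[2] };
}

// Whole days from `earlier` to `later`.
function daysBetween(earlier, later) {
  return Math.floor((later.getTime() - earlier.getTime()) / 86400000);
}

const memento = parseWaybackUri(
  "http://web.archive.org/web/20110707060758/http://www.thedailybeast.com/articles/2011/01/22/we-are-all-khaled-said-will-the-revolution-come-to-egypt.html"
);
// Assumption: the date in the URI path (2011/01/22) is the publication date.
const published = new Date(Date.UTC(2011, 0, 22));
console.log(memento.archived.toISOString(), daysBetween(published, memento.archived));
```

The gap between the two dates shows why the archiving date alone is a poor proxy for when a page was created.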
  64. 64. Step 3: This page was archived two months after the start of the Revolution
  65. 65. Step 4: Suppose we got 10,000 related pages for each event of the story. I’ll add here many copies from bbc, nytimes, foxnews
  66. 66. Step 4: Choose the best candidates for each event of the story
      • Problem
        – Select the highest-quality pages that express the story
      • Methods
        – Eliminate the duplicates
        – There are multiple dimensions of quality:
          • Web-based structural quality, such as PageRank
          • Quality of replaying the archived page
      • Evaluation
        – Choose a small number of candidates for each event
        – Show all candidates and ask Mechanical Turk workers if the candidates are good or not
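
A minimal sketch of the selection step, under simplifying assumptions: duplicates are taken to be captures of the same original URI within the same event (near-duplicate content across sites is not detected), and each candidate already carries a single numeric `quality` score. The field names, the scores, and `selectCandidates` itself are hypothetical.

```javascript
// Strip the Wayback Machine prefix, if present, to recover the original URI.
function originalUri(uri) {
  const m = /^https?:\/\/web\.archive\.org\/web\/\d{14}\/(.+)$/.exec(uri);
  return m ? m[1] : uri;
}

// Keep the best capture per original URI, then the top k per event.
function selectCandidates(candidates, k) {
  const best = new Map();
  for (const c of candidates) {
    const key = `${c.event}|${originalUri(c.uri)}`;
    const seen = best.get(key);
    if (!seen || c.quality > seen.quality) best.set(key, c);
  }
  const byEvent = new Map();
  for (const c of best.values()) {
    if (!byEvent.has(c.event)) byEvent.set(c.event, []);
    byEvent.get(c.event).push(c);
  }
  for (const [event, list] of byEvent) {
    list.sort((a, b) => b.quality - a.quality); // highest quality first
    byEvent.set(event, list.slice(0, k));
  }
  return byEvent;
}

// Hypothetical candidates for one event; quality scores are made up.
const candidates = [
  { event: "2011-01-25", uri: "http://web.archive.org/web/20110125120000/http://example.com/a", quality: 0.4 },
  { event: "2011-01-25", uri: "http://web.archive.org/web/20110126120000/http://example.com/a", quality: 0.9 },
  { event: "2011-01-25", uri: "http://example.org/b", quality: 0.7 },
  { event: "2011-01-25", uri: "http://example.net/c", quality: 0.2 },
];
console.log(selectCandidates(candidates, 2).get("2011-01-25"));
```

In practice the single score would have to combine the quality dimensions listed above, and how to weight them is left open here.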
  67. 67. Step 4: Example of duplication
  68. 68. Step 4: The same event from two different news sites
  69. 69. Step 4: Quality of archived pages
  70. 70. Step 5: Visualize the results: Interactive Timeline. Replaying the story of the Egyptian Revolution
  71. 71. Step 5: Visualize the results: Slideshow • Different view
  72. 72. Step 5: Visualize the results using Storify
  73. 73. Step 5: Visualize the results
      • Problem
        – From the chosen pages, how to create a human-readable story and visualize it?
      • Methods
        – Provide different interactive visualizations that enable exploring the story easily, such as
          • Timeline Visualization
          • Slideshow Visualization
        – Provide the user with the ability to modify the story and specify the start and end dates
      • Evaluation
        – Solicit feedback from humanities researchers
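
Before any rendering, a timeline view needs the chosen pages ordered by time and grouped into entries. The sketch below prepares that data and honors optional user-specified start and end dates. It assumes each page has an ISO 8601 UTC `datetime` string and a `title`; those field names, `buildTimeline`, and the sample records are hypothetical, and no particular visualization library is implied.

```javascript
// Group pages by calendar day (UTC), in chronological order,
// keeping only days within the optional [start, end] range (YYYY-MM-DD, inclusive).
function buildTimeline(pages, { start, end } = {}) {
  const sorted = pages
    .filter((p) => {
      const day = p.datetime.slice(0, 10);
      return (!start || day >= start) && (!end || day <= end);
    })
    .sort((a, b) => (a.datetime < b.datetime ? -1 : a.datetime > b.datetime ? 1 : 0));

  const days = new Map();
  for (const p of sorted) {
    const day = p.datetime.slice(0, 10);
    if (!days.has(day)) days.set(day, []);
    days.get(day).push(p.title);
  }
  return [...days].map(([day, titles]) => ({ day, titles }));
}

// Placeholder pages; titles and times are illustrative only.
const pages = [
  { title: "C", datetime: "2011-01-28T20:56:00Z" },
  { title: "A", datetime: "2011-01-25T07:47:00Z" },
  { title: "B", datetime: "2011-01-25T20:00:00Z" },
  { title: "D", datetime: "2011-02-11T16:00:00Z" },
];
console.log(buildTimeline(pages, { start: "2011-01-25", end: "2011-01-31" }));
```

The same grouped structure could feed either a timeline or a slideshow, since both only need the entries in order.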
  74. 74. Step 6: Make the created story accessible for others
      • Problem
        – People do not know about the created story
        – Discovering links between stories
      • Method
        – Allow sharing of the story with others
        – Allow feedback and updating of the story by others
  75. 75. Dissertation Timeline (Year 1 to Year 5; dates shown: Oct. 2012, Nov. 2013, Dec. 2015; legend: Completed, To be completed, Not started)
      • Background Research
      • Course Work
      • Qualifying Exam
      • Proposal writing
      • Candidacy
      • Framework Implementation
      • Calculate the datetime of the story dynamically
      • Get the seeds of URIs
      • Determine the datetimes of web pages
      • Choose the best candidates for each event
      • Visualize the results
      • Evaluate the framework
      • Make the created story accessible for others
      • Dissertation Writing
      • Ph.D. Defense
  76. 76. Extra Slides
  77. 77. User Access Patterns in Web Archives
  78. 78. Everybody Dips, Humans Dive, Robots Skim: Robots (34,203 sessions), Humans (3,431 sessions)
  79. 79. Prior Work: User Access Patterns in Web Archives
  80. 80. 86% of the links to mementos
  81. 81. Using Wikipedia infobox for detecting the time of the event
  82. 82. The last-modified date is months away from the archiving date: http://web.archive.org/web/20110707060758/http://www.thedailybeast.com/articles/2011/01/22/we-are-all-khaled-said-will-the-revolution-come-to-egypt.html
