Tools for Managing the Past Web 
Michele C. Weigle 
Web Science and Digital Libraries (WS-DL) Group
Department of Computer Science 
Old Dominion University 
Norfolk, VA 
Includes joint work with Michael L. Nelson and our PhD students, Yasmin AlNoamany, Ahmed 
AlSum (PhD 2014), Justin Brunelle, Mat Kelly, Hany SalahEldeen 
Archive-It Partners Meeting 
November 18, 2014
Outline 
Start-Up and Implementation Grants
– WARCreate
– WAIL
– Mink
– Assessing Memento Damage
Web Archiving Incentive
– Thumbnail Summarization
– Detecting Off-Topic Mementos
https://ws-dl.cs.odu.edu/Software
Archive What I See Now 
• Standard web archiving tools are difficult for non-IT experts.
• "Save Page As" is not suitable for archiving purposes.
• Pages are behind authentication.
• Pages change quickly, but their current state needs archiving.
NEH Digital Humanities Implementation Grant, 2014-2017, http://bit.ly/odu-dhig-2014 
How we're addressing the problem
WARCreate
• Google Chrome extension
• Archive the current state of the page in standard Web Archive (WARC) format
• Compatible with Wayback
Kelly and Weigle, "WARCreate - Create Wayback-Consumable WARC Files from Any Webpage", JCDL 2012 
Kelly, Weigle, and Nelson. "WARCreate - Create Wayback-Consumable WARC Files from Any Webpage," Digital Preservation 2012, Tools Demo Session 
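To make "standard Web Archive (WARC) format" concrete, here is a minimal sketch of what a single WARC/1.0 response record looks like when written by hand. It is not WARCreate's implementation; the function name and the sample HTTP response are hypothetical, and a complete WARC file would normally also carry warcinfo and request records.

```python
import uuid
from datetime import datetime, timezone


def build_warc_response_record(target_uri: str, http_response: bytes) -> bytes:
    """Wrap raw HTTP response bytes in one WARC/1.0 'response' record."""
    warc_date = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: response",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Date: {warc_date}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        "Content-Type: application/http; msgtype=response",
        f"Content-Length: {len(http_response)}",
    ]
    # Header lines end with CRLF; a blank line separates headers from the block.
    header_block = ("\r\n".join(headers) + "\r\n\r\n").encode("utf-8")
    # Each record is terminated by two CRLFs after the content block.
    return header_block + http_response + b"\r\n\r\n"


# Hypothetical captured response; a real capture would come from the browser.
sample_http = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><body>Hello, archive</body></html>"
)
record = build_warc_response_record("http://example.com/", sample_http)
```

Because the record stores the HTTP response as received, a replay tool such as Wayback can serve the page back with its original headers.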
WARCreate - Work in Progress 
• New modes of operation 
– record mode 
• while activated, add a capture of each page visited to the WARC
– countdown mode
• at every interval, refresh and add a new capture of the page
– event mode
• add a new capture of the page every time it dynamically reloads or refreshes
WARCreate - Work in Progress 
• Uploading created WARCs to Archive-It or other archives
– consideration of data integrity
– merging local WARCs with crawled WARCs
• how do we account for your www.facebook.com vs. my www.facebook.com?
– privacy 
What to do with created WARCs? 
WAIL 
• Load created WARCs into a Wayback instance on your local computer
• Single-click install of Wayback (and other archiving tools)
• Includes IIPC's OpenWayback 2.0 and Heritrix 3.2
• Available for Windows, OS X (Linux coming soon!)
Kelly, Weigle, and Nelson. "Making Enterprise-Level Archive Tools Accessible for Personal Web Archiving," Personal Digital Archiving 2013, Poster Session 
Kelly, Nelson, and Weigle. "WARCreate and WAIL: WARC, Wayback and Heritrix Made Easy," Digital Preservation 2013 
WAIL - Work in Progress 
• More tools 
– integration with Ilya Kreymer's pywb 
• User interface enhancements 
– ease of installation 
– intuitive GUI 
– configuration of Wayback display and Heritrix crawls
Bridging the gap between the past web and the live web
Mink
• Google Chrome extension
• For each page you visit, displays the number of archived versions available
• Provides access by date
• Allows for submission to public archiving services
Kelly, Nelson and Weigle, "Mink: Integrating the Live and Archived Web Viewing Experience Using Web Browsers and Memento," poster, ACM/IEEE Digital 
Libraries (DL), September 2014. 
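As a rough illustration of how the number of archived versions of a page can be obtained, the sketch below counts the mementos listed in a Memento TimeMap (link format) and sorts them by date. This is not Mink's code: the archive host and the sample TimeMap are made up, and the parser assumes `rel` appears before `datetime` in each entry.

```python
import re
from email.utils import parsedate_to_datetime

# Hypothetical TimeMap, shaped like what a Memento aggregator might return.
SAMPLE_TIMEMAP = """\
<http://example.com/>; rel="original",
<http://archive.example.org/timemap/link/http://example.com/>; rel="self"; type="application/link-format",
<http://archive.example.org/web/20120521000000/http://example.com/>; rel="first memento"; datetime="Mon, 21 May 2012 00:00:00 GMT",
<http://archive.example.org/web/20130516000000/http://example.com/>; rel="memento"; datetime="Thu, 16 May 2013 00:00:00 GMT",
<http://archive.example.org/web/20141118000000/http://example.com/>; rel="last memento"; datetime="Tue, 18 Nov 2014 00:00:00 GMT"
"""

# Captures the URI, the rel value, and (if present) the datetime of one entry.
LINK_PATTERN = re.compile(
    r'<([^>]+)>\s*;\s*rel="([^"]*)"(?:[^<]*?datetime="([^"]*)")?'
)


def parse_mementos(timemap_text):
    """Return (datetime, URI-M) pairs for every memento entry, oldest first."""
    mementos = []
    for uri, rel, when in LINK_PATTERN.findall(timemap_text):
        # rel may hold several values, e.g. "first memento".
        if "memento" in rel.split() and when:
            mementos.append((parsedate_to_datetime(when), uri))
    return sorted(mementos)


mementos = parse_mementos(SAMPLE_TIMEMAP)
memento_count = len(mementos)
```

The sorted list is enough to support both features named on the slide: the count for the badge and date-ordered access to individual captures.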
Mink - Work in Progress 
• Pick public archives (Memento Aggregator) or private archive (local computer)
Tools
WARCreate
Mink
WAIL
https://ws-dl.cs.odu.edu/Software
How damaged are these mementos?
M = percentage missing
D = our damage metric
(Figure: three example mementos with their scores)
M = 0.17, D = 0.09 (live web)
M = 0.24, D = 0.41 (missing main)
M = 0.29, D = 0.36 (missing logo + navigation)
Brunelle, Kelly, SalahEldeen, Weigle, and Nelson, "Not All Mementos Are Created Equal: Measuring the Impact of Missing Resources", 
IEEE/ACM Digital Libraries (DL) 2014, Best Student Paper 
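The gap between M and D can be sketched with a toy calculation: M treats every missing resource equally, while a weighted measure lets one important resource dominate. The weights and resource names below are invented for illustration and are not the weighting defined in the paper.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    uri: str
    weight: float  # hypothetical importance weight, not the paper's
    missing: bool


def percentage_missing(resources):
    """M: share of embedded resources that failed to load."""
    return sum(r.missing for r in resources) / len(resources)


def weighted_damage(resources):
    """Illustrative D: weight of missing resources over total weight."""
    total = sum(r.weight for r in resources)
    lost = sum(r.weight for r in resources if r.missing)
    return lost / total


# Two hypothetical mementos, each missing exactly one of four resources.
missing_main_image = [
    Resource("main.jpg", 6.0, True),
    Resource("style.css", 2.0, False),
    Resource("logo.png", 1.0, False),
    Resource("tracker.gif", 1.0, False),
]
missing_tracker = [
    Resource("main.jpg", 6.0, False),
    Resource("style.css", 2.0, False),
    Resource("logo.png", 1.0, False),
    Resource("tracker.gif", 1.0, True),
]

m_a, d_a = percentage_missing(missing_main_image), weighted_damage(missing_main_image)
m_b, d_b = percentage_missing(missing_tracker), weighted_damage(missing_tracker)
```

Both toy mementos have the same M, yet the one missing its main image scores far higher on the weighted measure, which is the kind of difference the examples on this slide point at.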
Good News: 
Although M is steady/increasing, D is decreasing 
M = percentage missing, D = our damage metric
Sampled 45,000 URI-Ms (memento URIs)
- one URI-M per year for each of ~1,850 URI-Rs (original resource URIs)
- URI-Rs from Bitly URIs shared over Twitter and Archive-It collections
Browsing TimeMaps 
How were these 4 thumbnails chosen?
Which tells you more about the past of www.apple.com?
700 thumbnails (not even all of them!)
32 sampled thumbnails
AlSum and Nelson, "Thumbnail Summarization Techniques for Web Archives", ECIR 2014
Thumbnail Summarization 
• Process 
– compare HTML of consecutive mementos 
• more efficient than image diff 
– when diff threshold passed, generate thumbnail 
– return data + thumbnail as JSON 
• Considerations 
– diff threshold too low -> near duplicate images 
– diff threshold too high -> miss important changes 
• Work in Progress 
– wayback plugin 
– embeddable version 
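The process above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses Python's `difflib` as a stand-in text-difference measure, the threshold value and sample pages are hypothetical, and the step that actually renders a thumbnail is left out (the JSON only records which mementos would get one).

```python
import difflib
import json


def summarize(mementos, threshold=0.3):
    """mementos: list of (datetime string, html) pairs, oldest first.

    Returns JSON naming the mementos that would receive a thumbnail.
    """
    selected = []
    previous_html = None
    for when, html in mementos:
        if previous_html is None:
            difference = 1.0  # always keep the first memento
        else:
            similarity = difflib.SequenceMatcher(None, previous_html, html).ratio()
            difference = 1.0 - similarity
        if difference >= threshold:
            selected.append({"datetime": when, "difference": round(difference, 3)})
        previous_html = html  # compare consecutive mementos, as on the slide
    return json.dumps(selected)


# Hypothetical captures: a one-character edit, then a full redesign.
page_2012 = "<html><body><h1>Apple</h1><p>Think different.</p></body></html>"
page_2013 = "<html><body><h1>Apple</h1><p>Think different!</p></body></html>"
page_2014 = (
    "<html><body><ul>"
    + "".join(f"<li>Product {i}</li>" for i in range(12))
    + "</ul></body></html>"
)
summary_json = summarize(
    [("2012-05-21", page_2012), ("2013-05-16", page_2013), ("2014-11-18", page_2014)]
)
```

With this data the 2013 capture is skipped as a near duplicate. Comparing only consecutive mementos can miss slow drift, so a variant could compare against the last selected memento instead.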
Thumbnail Summary Screencast 
Have you ever had this problem? 
May 21, 2012 
May 16, 2013 
nothing but spam 
Detecting Off-Topic Mementos 
• Goal: Build a tool to alert curators of potential off-topic mementos in a collection
• Compare text of mementos 
– Intersection of top terms (TF) 
– Cosine similarity 
– Jaccard similarity coefficient 
– Clustering with topic modeling 
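Three of the text comparisons listed above fit in a few lines of standard-library Python (clustering with topic modeling is not shown). The sketch compares a later memento with the first memento of the same page. The tokenizer, the 0.15 threshold, and the sample texts are assumptions for illustration, not values from the project.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def top_term_overlap(a, b, k=10):
    """Size of the intersection of the k most frequent terms (TF) of each text."""
    top_a = {t for t, _ in Counter(tokenize(a)).most_common(k)}
    top_b = {t for t, _ in Counter(tokenize(b)).most_common(k)}
    return len(top_a & top_b)


def cosine_similarity(a, b):
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    sa, sb = set(tokenize(a)), set(tokenize(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def is_off_topic(first_text, later_text, threshold=0.15):
    """Flag a memento whose text has drifted away from the first capture."""
    return cosine_similarity(first_text, later_text) < threshold


# Hypothetical extracted texts of three mementos of one page.
first = "Protesters gathered in Tahrir Square in Cairo demanding political reform"
later = "Thousands of protesters returned to Tahrir Square in Cairo"
spam = "Cheap watches buy now discount pills online casino bonus"
```

Here `is_off_topic(first, later)` is false while `is_off_topic(first, spam)` is true. Word-level measures like these depend on language and on how well boilerplate is stripped from each page.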
Test Collections 
Turns out to be rather difficult 
• Egyptian Revolution 
– lots of non-English pages 
• Occupy Movement 
– lots of Facebook and social media pages 
– template extractors have trouble with these 
• Boston Marathon Bombing 
But we're making progress (stay tuned!)
Storytelling For Archives 
(Figure: storytelling services and archived collections combine into archived enriched stories.)
AlNoamany, "Using Web Archives to Enrich the Live Web Experience Through Storytelling", TCDL Bulletin, December 2013. 
Story Types 
Fixed Page – Fixed Time: differences in GeoIP, mobile, etc.
Fixed Page – Sliding Time: evolution of a single page (or domain) through time
Sliding Page – Fixed Time: different perspectives on a point in time
Sliding Page – Sliding Time: broadest possible coverage of a collection
(Figure: 2×2 grid with axes URI (same, different) and Time (same, different).)
Issues: topic modeling, eliminating duplicates, maximizing 
novelty, structural & content quality 
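The four story types reduce to two yes/no questions about a set of mementos: do they share one URI, and do they share one point in time? A minimal sketch, assuming each memento is a (URI, capture date) pair and that "fixed time" means the same calendar day (both are simplifications made for this example):

```python
from datetime import date


def story_type(mementos):
    """Classify a list of (uri, capture_date) pairs into one of the four types."""
    uris = {uri for uri, _ in mementos}
    days = {day for _, day in mementos}
    page = "Fixed Page" if len(uris) == 1 else "Sliding Page"
    time = "Fixed Time" if len(days) == 1 else "Sliding Time"
    return page, time


# Hypothetical selections of mementos.
page_history = [
    ("http://example.com/", date(2012, 5, 21)),
    ("http://example.com/", date(2013, 5, 16)),
]
event_snapshot = [
    ("http://example.com/news", date(2013, 4, 15)),
    ("http://example.org/blog", date(2013, 4, 15)),
]
history_type = story_type(page_history)
snapshot_type = story_type(event_snapshot)
```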
Tools for Storytelling 
• Tools for Curators 
– create stories from your collections 
• candidate mementos automatically selected 
– use existing stories to augment your collections
• Tools for Users
– use existing tools like Storify to view the stories of a collection
Tools for Managing the Past Web 
Start-Up and Implementation Grants
– WARCreate 
– WAIL 
– Mink 
– Assessing Memento Damage 
Web Archiving Incentive 
– Thumbnail Summarization 
– Detecting Off-Topic Mementos
Web Science and Digital Libraries (WS-DL) Group
@WebSciDL 
http://ws-dl.cs.odu.edu/ 
http://ws-dl.blogspot.com/ 
Michele C. Weigle 
mweigle@cs.odu.edu 
@weiglemc 
http://www.cs.odu.edu/~mweigle/ 
https://ws-dl.cs.odu.edu/Software 


Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 

Tools for Managing the Past Web

  • 1. Tools for Managing the Past Web Michele C. Weigle Web Sciences and Digital Libraries (WS-DL) Group Department of Computer Science Old Dominion University Norfolk, VA Includes joint work with Michael L. Nelson and our PhD students, Yasmin AlNoamany, Ahmed AlSum (PhD 2014), Justin Brunelle, Mat Kelly, Hany SalahEldeen Archive-It Partners Meeting November 18, 2014
  • 2. Outline: Start-Up and Implementation Grants – WARCreate – WAIL – Mink – Assessing Memento Damage; Web Archiving Incentive – Thumbnail Summarization – Detecting Off-Topic Mementos. Software (WARCreate, WAIL, Mink): https://ws-dl.cs.odu.edu/Software
  • 3. Archive What I See Now • Standard web archiving tools are difficult for non-IT experts. • "Save Page As" is not suitable for archiving purposes. • Pages are behind authentication. • Pages change quickly, but current state needs archiving. NEH Digital Humanities Implementation Grant, 2014-2017, http://bit.ly/odu-dhig-2014
  • 4. How we're addressing the problem: WARCreate • Google Chrome extension • Archive the current state of the page in standard Web Archive (WARC) format • Compatible with Wayback. Kelly and Weigle, "WARCreate - Create Wayback-Consumable WARC Files from Any Webpage", JCDL 2012. Kelly, Weigle, and Nelson, "WARCreate - Create Wayback-Consumable WARC Files from Any Webpage," Digital Preservation 2012, Tools Demo Session
  • 5. WARCreate - Work in Progress • New modes of operation – record mode • while activated, add capture of each page visited to the WARC – countdown mode • every interval, refresh and add new capture of page – event mode • add new capture of page every time it dynamically reloads or refreshes
  • 6. WARCreate - Work in Progress • Uploading created WARCs to Archive-It or other archives – consideration of data integrity – merging local WARCs with crawled WARCs • how do we account for your www.facebook.com vs. my www.facebook.com? – privacy
  • 7. What to do with created WARCs? WAIL • Load created WARCs into a Wayback instance on your local computer • Single-click install of Wayback (and other archiving tools) • Includes IIPC's OpenWayback 2.0 and Heritrix 3.2 • Available for Windows, OS X (Linux coming soon!) Kelly, Weigle, and Nelson, "Making Enterprise-Level Archive Tools Accessible for Personal Web Archiving," Personal Digital Archiving 2013, Poster Session. Kelly, Nelson, and Weigle, "WARCreate and WAIL: WARC, Wayback and Heritrix Made Easy," Digital Preservation 2013
  • 8. WAIL - Work in Progress • More tools – integration with Ilya Kreymer's pywb • User interface enhancements – ease of installation – intuitive GUI – configuration of Wayback display and Heritrix crawls
  • 9. Bridging the gap between the past web and the live web: Mink • Google Chrome extension • For each page you visit, displays the number of archived versions available • Provides access by date • Allows for submission to public archiving services. Kelly, Nelson and Weigle, "Mink: Integrating the Live and Archived Web Viewing Experience Using Web Browsers and Memento," poster, ACM/IEEE Digital Libraries (DL), September 2014.
  • 10. Mink - Work in Progress • Pick public archives (Memento Aggregator) or private archive (local computer)
  • 11. Tools: WARCreate, Mink, WAIL. https://ws-dl.cs.odu.edu/Software
  • 12. Outline: Start-Up and Implementation Grants – WARCreate – WAIL – Mink – Assessing Memento Damage; Web Archiving Incentive – Thumbnail Summarization – Detecting Off-Topic Mementos. Software (WARCreate, WAIL, Mink): https://ws-dl.cs.odu.edu/Software
  • 13. How damaged are these mementos? M = percentage missing, D = our damage metric • M = 0.17, D = 0.09 (live web) • M = 0.24, D = 0.41 (missing main) • M = 0.29, D = 0.36 (missing logo + navigation). Brunelle, Kelly, SalahEldeen, Weigle, and Nelson, "Not All Mementos Are Created Equal: Measuring the Impact of Missing Resources", IEEE/ACM Digital Libraries (DL) 2014, Best Student Paper
  • 14. Good News: Although M is steady/increasing, D is decreasing. M = percentage missing, D = our damage metric. Sampled 45,000 URI-Ms - one URI-M each year of ~1850 URI-Rs - URI-Rs from Bitly URIs shared over Twitter and Archive-It collections
  • 15. Outline: Start-Up and Implementation Grants – WARCreate – WAIL – Mink – Assessing Memento Damage; Web Archiving Incentive – Thumbnail Summarization – Detecting Off-Topic Mementos. Software (WARCreate, WAIL, Mink): https://ws-dl.cs.odu.edu/Software
  • 16. Browsing TimeMaps: How were these 4 thumbnails chosen?
  • 17. Which tells you more about the past of www.apple.com? • 700 thumbnails (not even all of them!) • 32 sampled thumbnails. AlSum and Nelson, "Thumbnail Summarization Techniques for Web Archives", ECIR 2014
  • 18. Thumbnail Summarization • Process – compare HTML of consecutive mementos • more efficient than image diff – when diff threshold passed, generate thumbnail – return data + thumbnail as JSON • Considerations – diff threshold too low -> near duplicate images – diff threshold too high -> miss important changes • Work in Progress – wayback plugin – embeddable version
  • 19. Thumbnail Summary Screencast
  • 20. Outline: Start-Up and Implementation Grants – WARCreate – WAIL – Mink – Assessing Memento Damage; Web Archiving Incentive – Thumbnail Summarization – Detecting Off-Topic Mementos. Software (WARCreate, WAIL, Mink): https://ws-dl.cs.odu.edu/Software
  • 21. Have you ever had this problem? • May 21, 2012 • May 16, 2013 • nothing but spam
  • 22. Detecting Off-Topic Mementos • Goal: Build a tool to alert curators of potential off-topic mementos in a collection • Compare text of mementos – Intersection of top terms (TF) – Cosine similarity – Jaccard similarity coefficient – Clustering with topic modeling
  • 23. Test Collections
  • 24. Turns out to be rather difficult • Egyptian Revolution – lots of non-English pages • Occupy Movement – lots of Facebook and social media pages – template extractors have trouble with these • Boston Marathon Bombing. But we're making progress (stay tuned!)
  • 25. Storytelling For Archives • Storytelling services • Archived collections • Archived enriched stories. AlNoamany, "Using Web Archives to Enrich the Live Web Experience Through Storytelling", TCDL Bulletin, December 2013.
  • 26. Story Types (axes: URI same or different, time same or different) • Fixed Page – Fixed Time: differences in GeoIP, mobile, etc. • Fixed Page – Sliding Time: evolution of a single page (or domain) through time • Sliding Page – Fixed Time: different perspectives on a point in time • Sliding Page – Sliding Time: broadest possible coverage of a collection. Issues: topic modeling, eliminating duplicates, maximizing novelty, structural & content quality
  • 27. Tools for Storytelling • Tools for Curators – create stories from your collections • candidate mementos automatically selected – use existing stories to augment your collections • Tools for Users – use existing tools like Storify to view the stories of a collection
  • 28. Tools for Managing the Past Web: Start-Up and Implementation Grants – WARCreate – WAIL – Mink – Assessing Memento Damage; Web Archiving Incentive – Thumbnail Summarization – Detecting Off-Topic Mementos. Web Science and Digital Libraries (WS-DL) Group, @WebSciDL, http://ws-dl.cs.odu.edu/, http://ws-dl.blogspot.com/. Michele C. Weigle, mweigle@cs.odu.edu, @weiglemc, http://www.cs.odu.edu/~mweigle/. Software (WARCreate, WAIL, Mink): https://ws-dl.cs.odu.edu/Software
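
The thumbnail summarization process on slide 18 (compare the HTML of consecutive mementos, generate a thumbnail only when a diff threshold is passed) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the difference measure (Python's `difflib.SequenceMatcher`), the function names, the input layout, and the default threshold of 0.3 are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def html_difference(previous_html: str, current_html: str) -> float:
    """Return a difference score in [0, 1]; 0.0 means the strings are identical.

    SequenceMatcher is a stand-in measure chosen for this sketch (assumption).
    """
    return 1.0 - SequenceMatcher(None, previous_html, current_html).ratio()


def select_for_thumbnails(mementos, threshold=0.3):
    """Pick the mementos that would get a thumbnail.

    mementos: list of (timestamp, html) pairs sorted by capture time
              (hypothetical input layout).
    The first memento is always kept; each later one is kept only when its
    HTML differs from the immediately preceding memento by more than
    `threshold`.
    """
    selected = []
    previous_html = None
    for timestamp, html in mementos:
        if previous_html is None or html_difference(previous_html, html) > threshold:
            selected.append(timestamp)
        previous_html = html
    return selected


if __name__ == "__main__":
    captures = [
        ("2010", "<html><body>Hello world</body></html>"),
        ("2011", "<html><body>Hello world</body></html>"),
        ("2012", "<html><body><h1>Completely new design</h1>"
                 "<p>lots of different text here</p></body></html>"),
    ]
    print(select_for_thumbnails(captures))
```

The threshold trade-off from the slide shows up directly here: lowering `threshold` keeps more near-duplicate captures, while raising it risks skipping captures that changed in important ways.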
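
Two of the text comparisons listed on slide 22, cosine similarity and the Jaccard similarity coefficient, can be illustrated with a short standard-library sketch. The tokenizer, the choice of the first memento as the on-topic baseline, the function names, and the 0.2 threshold are assumptions for illustration only; the slides do not specify them.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (simplistic, assumed)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between raw term-frequency vectors of two texts."""
    tf_a, tf_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(text_a, text_b):
    """Jaccard coefficient: shared distinct terms divided by all distinct terms."""
    terms_a, terms_b = set(tokenize(text_a)), set(tokenize(text_b))
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


def flag_off_topic(memento_texts, threshold=0.2):
    """Return indices of mementos whose cosine similarity to the first
    memento (assumed to be on topic) falls below `threshold`."""
    baseline = memento_texts[0]
    return [
        index
        for index, text in enumerate(memento_texts[1:], start=1)
        if cosine_similarity(baseline, text) < threshold
    ]


if __name__ == "__main__":
    texts = [
        "occupy movement protest march in new york",
        "occupy protest march continues in new york city",
        "cheap pills buy now best price online",
    ]
    print(flag_off_topic(texts))
```

A sketch like this only covers extracted English text; as slide 24 notes, non-English pages and template-heavy social media pages make the real problem considerably harder.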