Alexander Nwala
Computer Science Ph.D. student, Old Dominion University
Summer Fellow, Harvard Law School Library Innovation Lab
Dr. Michael Nelson
Department of Computer Science, Old Dominion University
Archives Unleashed 2.0: Web Archive Datathon, Washington DC
June 14, 2016
Generating collections for stories and events
Previous Projects
Carbon Date
• Carbon Date is a tool for estimating the creation date of a website
• Returns a machine-readable structure
Website: http://cd.cs.odu.edu/
#WhatDidItLookLike
• A Tumblr blog that shows what a website looked like across multiple years
• Nominate websites to What Did It Look Like? by tweeting: “#whatdiditlooklike URL”
Website: http://whatdiditlooklike.mementoweb.org/
Previous Projects (continued)
#ICanHazMemento
• Returns an archived version of the page closest to the time of the tweet, or
• Returns a newly archived version of the page, if the page was not archived within the past 24 hours
Website: https://twitter.com/icanhazmemento/
Web Query Classifier
• Classifies a web query as scholar or non-scholar
• Routes the query to a local digital library not crawlable by search engines
Tech report: https://arxiv.org/abs/1605.00184
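The #ICanHazMemento behavior described above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the function names, the (capture time, URI) tuple layout, and the archive.example URIs are hypothetical, and the 24-hour rule is read here as "no capture exists within 24 hours of the tweet", which is an assumption.

```python
from datetime import datetime, timedelta, timezone


def closest_memento(mementos, tweet_time):
    """Return the (capture_time, uri) pair nearest to tweet_time, or None."""
    return min(mementos, key=lambda m: abs(m[0] - tweet_time), default=None)


def choose_response(mementos, tweet_time, window=timedelta(hours=24)):
    """Decide between an existing capture and requesting a new one.

    Assumption: a new capture is requested when the nearest existing
    capture is more than `window` away from the tweet time.
    """
    best = closest_memento(mementos, tweet_time)
    if best is None or abs(best[0] - tweet_time) > window:
        return ("archive_now", None)
    return ("existing", best[1])


# Hypothetical captures of one page (URIs are placeholders).
captures = [
    (datetime(2016, 6, 1, 0, 0, tzinfo=timezone.utc),
     "https://archive.example/20160601000000/http://example.com/"),
    (datetime(2016, 6, 13, 20, 0, tzinfo=timezone.utc),
     "https://archive.example/20160613200000/http://example.com/"),
]
tweet_time = datetime(2016, 6, 14, 12, 0, tzinfo=timezone.utc)
print(choose_response(captures, tweet_time))
```

Here the nearest capture is 16 hours from the tweet, so the existing capture is returned; with no capture inside the window, the sketch would instead signal that a new capture is needed.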
Generating collections for stories and events
• It’s not difficult to collect resources (links) for a story, but can we build a good
collection?
• What does good mean?
• What are some properties of good stories?
Is this story from Storify good?
Where to begin
• Understanding potential sources: social media (Storify, Twitter, etc.) and news.
• Storify's native search does not find stories.
• There is no neat way of collecting tweets in a conversation, since the subset of the graph seen depends on which tweet is selected.
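To illustrate why the visible part of a conversation depends on the selected tweet, here is a minimal sketch. It assumes, purely for illustration, that selecting a tweet exposes only that tweet, the chain of tweets it replies to, and the replies beneath it; the slides do not state Twitter's actual rule. The reply map and tweet ids are hypothetical.

```python
from collections import defaultdict


def visible_subset(reply_to, selected):
    """Return the ids visible from `selected` under the assumed rule.

    reply_to maps each tweet id to the id it replies to (None for a root).
    """
    # Invert the reply map so descendants can be walked.
    children = defaultdict(list)
    for tweet, parent in reply_to.items():
        if parent is not None:
            children[parent].append(tweet)

    seen = {selected}

    # Walk up: the chain of tweets the selected tweet replies to.
    node = reply_to.get(selected)
    while node is not None and node not in seen:
        seen.add(node)
        node = reply_to.get(node)

    # Walk down: every reply beneath the selected tweet.
    stack = [selected]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Hypothetical conversation: B and C reply to A, D replies to B, E replies to C.
replies = {"A": None, "B": "A", "C": "A", "D": "B", "E": "C"}
print(sorted(visible_subset(replies, "D")))  # ['A', 'B', 'D']
print(sorted(visible_subset(replies, "A")))  # ['A', 'B', 'C', 'D', 'E']
```

Under this assumption, starting from D never reveals C or E, so a collector that begins at an arbitrary tweet would miss whole branches unless it first climbs to the root.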
