Problems and Issues in Selecting, Harvesting, and Cataloging Web Resources
Joanne Archer and John Schalow, University of Maryland Libraries
Jargon: crawler, web harvesting, seed, harvest, crawl
Wayback Machine
Options for Web Harvesting
In-house program, e.g. Pandora, Web Curator Tool. Pro: flexibility. Con: $$$
Off-the-shelf software, e.g. HTTrack, Adobe Web Capture. Pro: inexpensive. Con: not scalable
Third-party subscription, e.g. Web Archiving Service, Archive-It. Pro: ease of use. Con: $
Key Questions for Harvesting Projects: uniqueness, ephemerality, research value, harvest frequency, scope
Maryland’s Pilot Harvests (2008-2010): Historic Preservation; Maryland State Documents
Why harvest these areas?
Key Questions for Harvesting Projects: uniqueness, ephemerality, research value, harvest frequency, scope
Harvesting
Harvesting Challenges:
Single host = www.preservemd.org
Multiple hosts = www.umd.edu, www.lib.umd.edu
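The single-host versus multiple-host distinction can be illustrated with a minimal sketch of a scope check. This is not how any particular harvesting service implements scoping; the function name `in_scope` and the two scope sets are hypothetical, and only the host names come from the slide.

```python
from urllib.parse import urlsplit

# Hypothetical scope definitions built from the hosts named on the slide.
SINGLE_HOST_SCOPE = {"www.preservemd.org"}
MULTI_HOST_SCOPE = {"www.umd.edu", "www.lib.umd.edu"}


def in_scope(url, allowed_hosts):
    """Return True if the URL's host is one of the allowed hosts."""
    # urlsplit lowercases the hostname; it is None when the URL has no host.
    host = urlsplit(url).hostname
    return host is not None and host in allowed_hosts


if __name__ == "__main__":
    for url in ("http://www.preservemd.org/about", "http://www.lib.umd.edu/"):
        print(url, in_scope(url, SINGLE_HOST_SCOPE), in_scope(url, MULTI_HOST_SCOPE))
```

Under this sketch, a link from a single-host seed to any other host falls out of scope, which is one reason a site spread across several hosts needs each host listed explicitly.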
End-User Access
End-User Access: collection note, subject heading, general material designation, URLs, uniform title
Conclusions
BUT we are well prepared to meet the challenges
Questions?
