Lsr vpresntation


  1. Problems and Issues in Selecting, Harvesting, and Cataloging Web Resources. Joanne Archer and John Schalow, University of Maryland Libraries
  2. Jargon: crawler, web harvesting, seed, harvest, crawl
  3. Wayback Machine
  4. Options for Web Harvesting
       • In-house program (e.g., Pandora, Web Curator Tool). Pro: flexibility. Con: $$$
       • Off-the-shelf software (e.g., HTTrack, Adobe Web Capture). Pro: inexpensive. Con: not scalable
       • Third-party subscription (e.g., Web Archiving Service, Archive-It). Pro: ease of use. Con: $
  5. Key Questions for Harvesting Projects: uniqueness, ephemerality, research value, harvest frequency, scope
  6. Maryland’s Pilot Harvests (2008-2010): Historic Preservation, Maryland State Documents
  7. Why harvest these areas?
       • Collections are unique
       • Builds on existing strengths in print collections
       • Large amount of material migrating to the web
  8. Key Questions for Harvesting Projects: uniqueness, ephemerality, research value, harvest frequency, scope
  9. Harvesting
  10. Harvesting Challenges:
       • JavaScript
       • Streaming media
       • Form- and database-driven content
       • Password-protected sites
       • robots.txt files
       • Multiple hosts/subdomains
  11. Single host = www.preservemd.org. Multiple hosts = www.umd.edu, www.lib.umd.edu
  12. End-User Access
  13. End-User Access: collection note, subject heading, general material designation, URLs, uniform title
  14. Conclusions
       • Challenges
       • Start-up costs
       • What to collect
       • Metadata creation
     But we are well prepared to meet the challenges.
  15. Questions?
       • Joanne Archer: jarcher@umd.edu
       • John Schalow: schalow@umd.edu
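
Slides 10 and 11 list robots.txt files and multiple hosts among the harvesting challenges. As a rough illustration only, and not the tooling used in the Maryland pilot, the Python sketch below shows how a crawler might decide whether a URL belongs to a harvest's scope and whether a site's robots.txt permits fetching it. The host names come from slide 11; the function names, the `example-harvester` user agent, the URL paths, and the sample robots.txt rules are all hypothetical.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def in_scope(url, allowed_hosts):
    """Return True if the URL's host is one of the hosts the harvest covers."""
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if missing
    return host is not None and host in allowed_hosts


def allowed_by_robots(url, robots_txt, user_agent="example-harvester"):
    """Return True if the given robots.txt text lets user_agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Scopes taken from slide 11.
single_host = {"www.preservemd.org"}
multiple_hosts = {"www.umd.edu", "www.lib.umd.edu"}

# Hypothetical robots.txt content, for illustration only.
sample_robots = """User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    url = "http://www.lib.umd.edu/special"  # hypothetical path
    print(in_scope(url, single_host))
    print(in_scope(url, multiple_hosts))
    print(allowed_by_robots("http://www.preservemd.org/private/x", sample_robots))
```

A seed on a single host needs only one entry in the scope set, while a site spread across subdomains needs each host listed explicitly; otherwise the crawl silently skips that content.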
