  • 1. Library and Archives Canada’s Web Archiving Program
    • Presentation to the International Internet Preservation Consortium General Assembly, Open Session, May 5, 2009
    • Gillian Cantello, A/Director General, Published Heritage Branch
  • 2. Purpose of LAC’s Web Archiving Program
    • To acquire, preserve and make accessible knowledge and information from the Canadian Internet for current and future generations of Canadians
  • 3. Collection Development Policy for Websites
    • LAC’s website selection guidelines form part of its Digital Collection Development Policy http://www.collectionscanada.gc.ca/collection/003-200-e.html
    • Two-pronged approach:
      • Individual capture of websites
      • Domain or thematic harvests
  • 4. Websites/Domains in LAC’s Collection
    • Acquired using Heritrix software:
      • Government of Canada web domain: 2005-2006, 2007, 2008
      • Provincial/Territorial web domains: 2006, 2008
      • Federal Elections: 2006, 2008
      • Provincial Elections: Alberta, Quebec 2008; Newfoundland, Northwest Territories, Ontario, Saskatchewan 2007
      • Olympic & Paralympic Games: 2006, 2008
    • Acquired using MetaPro software:
      • Selected individual sites – Added to LAC’s E-Collection
  • 5. Artists Online Website Accessible in the Electronic Collection and AMICUS
  • 6. Government of Canada Web Archive Search Interface
  • 7. WebCan
    • Developed by LAC to allow acquisitions staff to manage all aspects of web harvesting: seed lists, crawls, QA & indexing
    • To be released as open source
  • 8. Vancouver 2010 Olympic and Paralympic Winter Games
    • Will provide an archive of significant websites associated with the Vancouver 2010 Olympic and Paralympic Games
    • Based on NLA model for Sydney Olympics
    • Partnership with Department of Canadian Heritage
    • Two test crawls completed so far
  • 9. National Research Council Project
    • Data mining research partnership with NRC
    • Content from the 3rd crawl of the Government of Canada Web Archive will be used for NRC research on bilingual machine translation
    • NRC researchers will advise LAC on automated means of enhancing preservation of and access to content in the archive
  • 10. Plans for 2009-2010
    • Acquisitions
      • Fourth crawl of the .gc.ca domain
      • Third crawl of provincial/territorial government web domains
      • Thematic/topical websites still to be selected
      • Vancouver 2010 Olympics
    • Non-GC thematic crawls to be made accessible
