Approaches to Archiving Professional Blogs Hosted in the Cloud

'Approaches to Archiving Professional Blogs Hosted in the Cloud' presentation given by Marieke Guy, UKOLN, on September 21, 2010 at the 7th International Conference on Preservation of Digital Objects (iPRES2010), Vienna, Austria. Available at



Presentation Transcript

  • Approaches to Archiving Professional Blogs Hosted in the Cloud
    • iPRES 2010, Vienna, Austria, Tuesday, September 21st 2010
    • Marieke Guy, Research Officer, UKOLN
    • UKOLN is supported by:
    • This work is licensed under an Attribution-NonCommercial-ShareAlike 2.0 licence
  • Introduction to UKOLN
    • UKOLN is a centre of excellence in digital information management, providing advice and services to the library, information and cultural heritage communities
    • Library and cataloguing background
    • Located at the University of Bath, UK
    • Funded by JISC to advise UK HE and FE communities
    • Also project funding, including EU funding
    • Many areas of work including metadata, repositories, dissemination activities, eScience, etc.
    • Digital preservation projects: DRIVER, CEDARS, eBank, JISC Preservation of Web Resources, Beginners Guide, etc.
    • Digital Curation Centre
  • Why blogs? Why in the Cloud?
    • Ease of creation, ease of use, ease of sharing
    • Increasingly used for reflecting, analyzing, questioning, critiquing, recording, discussing, learning, etc.
    • Very important for information professionals
    • Many dissemination benefits
    • Lack of institutional blogging infrastructure
    • UKOLN supports innovation
    • Cloud is an agile, cost-effective, highly useable way to deliver a service
    • UKOLN now has its own institutional service and over 15 blogs
  • The Professional’s blog
    • Established 2006
    • 750+ posts
    • 240 users per day
    • Personal style
    • Institution vs individual?
  • The Project blog
    • 2008 - 2010
    • 118 posts
    • 6 contributors
    • Professional style
  • The Event blog
    • June – August 2009
    • 68 posts
    • 3 contributors + guests
    • Video, interviews, photos, discussion
    • Informal/professional style
  • Why Preserve blogs?
    • Contain useful information
    • Information not available elsewhere
    • Look and feel relevant
    • Cultural significance
    • Reliance on 3rd-party services
    • Blogs disappear (UK HE funding cuts…)
    • ‘Archiving’ - ways in which blog content can be migrated to alternative environments in order to satisfy a number of business functions
    • Focus on short-term continuity and management
    • Could comprise part of a preservation strategy
  • Different Approaches:
    • New Static Master Copy
    • Backup Copy
    • Migration to Another Platform
    • Physical Manifestation
    • Other technical approaches
    • What are the issues with each of these?
  • New Static Master Copy
    • Migrate blog to static HTML
    • Point to new static resource
    • IWMW – WinHTTrack static copy
    • Issues:
    • No interactivity
    • Loss of technical architecture e.g. plugins
    • Loss of other elements e.g. comments
    • Look and feel
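One step in producing a static master copy is rewriting links so that they point at the new static files rather than the live blog. The following is a minimal sketch of that step only, not the method used for the IWMW copy (which the slide attributes to WinHTTrack). The blog address `blog.example.org`, the function names and the one-folder-per-page layout are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical blog address, used only for illustration.
BLOG_ROOT = "http://blog.example.org"


def to_static_path(url):
    """Map a blog URL to a relative file path in the static copy."""
    path = urlparse(url).path.strip("/")
    # The front page becomes index.html; every other page gets its own folder.
    return "index.html" if not path else path + "/index.html"


def rewrite_links(html, blog_root=BLOG_ROOT):
    """Point href attributes that target the live blog at the static files."""
    pattern = re.compile(r'href="(' + re.escape(blog_root) + r'[^"]*)"')
    return pattern.sub(
        lambda m: 'href="/' + to_static_path(m.group(1)) + '"', html
    )
```

Query strings are dropped by this mapping, so search pages and comment forms have no static equivalent, which reflects the loss of interactivity listed in the issues above.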
  • Backup Copy
    • Using XML, using HTML?
    • Where?
    • On the server? On a disc? On an external hard drive?
    • On the same blog platform?
    • ArchivePress
    • On alternate blog platform?
    • JP XML version on Intranet
    • IWMW static version on Intranet
    • Issues:
    • Access
  • Migration to Another Platform
    • Live blog to alternate platform
    • Could just be for data mining purposes – can’t do on current environment
    • UKWF  VOX platform, RSS feeds used, Yahoo pipes
    • Export feature
    • Issues:
    • Access
    • Loss of technical architecture e.g. plugins
    • Loss of other elements e.g. comments
    • Look and feel
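A feed-based migration of the kind mentioned above can be sketched as reading posts out of an RSS feed into simple records that another platform's import tool could consume. This is an illustrative sketch that assumes a plain RSS 2.0 feed; the function name and record keys are hypothetical, and it does not reproduce the Yahoo Pipes workflow used for UKWF.

```python
import xml.etree.ElementTree as ET


def posts_from_rss(rss_text):
    """Extract one record per <item> from an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    posts = []
    for item in root.iter("item"):
        posts.append({
            # Missing elements are recorded as empty strings.
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
            "body": item.findtext("description", default=""),
        })
    return posts
```

The records hold only what the feed carries (title, link, date and body text). Comments, plugins and theme are not captured, which matches the losses listed in the issues above.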
  • Physical Manifestation
    • Create a hard copy print out e.g. self-publishing
    • Create PDF of site, RSS2PDF
    • UKWF – Lulu self-published book available
    • Purpose specific
    • Issues:
    • Obviously not interactive but record unlikely to degrade like other options
  • Technical Approaches
    • HTML Scraping
      • HTTrack – static Web site created
    • Third-party Web archiving
      • UK Web Archive
      • Internet Archive
      • Not always complete capture but useful for look and feel
      • URL submitted for case study blogs
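After submitting a URL to a third-party archive, it can be useful to check whether a capture exists. The sketch below builds a query for the Internet Archive's Wayback Machine availability endpoint and reads the closest snapshot out of a JSON reply. The response fields used (`archived_snapshots`, `closest`, `available`, `url`) are an assumption about that service's output and should be checked against its current documentation. The function names are hypothetical, and the code makes no network request itself.

```python
import json
from urllib.parse import urlencode

# Endpoint of the Wayback Machine availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(blog_url):
    """Build the query URL asking whether blog_url has been captured."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": blog_url})


def closest_snapshot(response_text):
    """Return the closest snapshot URL from a JSON reply, or None."""
    data = json.loads(response_text)
    # Assumed reply shape: {"archived_snapshots": {"closest": {...}}}
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

A result of `None` means only that no capture was reported. Even where a snapshot exists, the capture may be incomplete, as noted above.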
  • Freezing a blog
    • Assessment of status of blog
    • Audit - Get your house in order: links to embeds, comments, spam, etc.
    • Preliminary posts
    • Statistics: dates, posts, comments, spam, contributors, theme, plugins, software, licence etc.
    • Archive page/sidebar widget
    • Final post
    • Indication that blog is archived
    • Close comments
    • Archive blog
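The statistics gathered when freezing a blog (dates, posts, comments, contributors) can be computed from a list of post records. This is a minimal sketch under assumptions: the record keys `date`, `author` and `comments` and the function name are hypothetical, and how the records are obtained (export file, database, feed) is left open.

```python
from datetime import date


def blog_statistics(posts):
    """Summarise posts for an archive page.

    Each post is a dict with 'date' (datetime.date), 'author' (str)
    and 'comments' (int, number of comments on that post).
    """
    if not posts:
        return {"posts": 0, "comments": 0, "contributors": [],
                "first_post": None, "last_post": None}
    dates = [p["date"] for p in posts]
    return {
        "posts": len(posts),
        "comments": sum(p["comments"] for p in posts),
        # Unique author names in alphabetical order.
        "contributors": sorted({p["author"] for p in posts}),
        "first_post": min(dates).isoformat(),
        "last_post": max(dates).isoformat(),
    }


# Example with made-up records:
sample = [
    {"date": date(2009, 6, 1), "author": "A. Editor", "comments": 3},
    {"date": date(2009, 8, 14), "author": "B. Guest", "comments": 0},
]
summary = blog_statistics(sample)
```

A summary like this could be published on the archive page or in the final post, so that readers can see the extent of the blog at the point it was frozen.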
  • The Archive Page
  • General Issues
    • What constitutes a blog? – content, layout, plugins, comments, tags, images, multimedia, etc.
    • Who owns a blog?
    • Identity, copyright, ownership and licences
    • Privacy
    • Permissions to access blogs belonging to individuals
    • Understandability of pages if out of context
    • Blog policies
    • Availability
  • Best Practice Checklist
    • Planning
    • Clarification of rights
    • Monitoring of technologies used
    • Auditing
    • Understanding of costs and benefits
    • Identification and implementation of archiving strategy
    • Dissemination
    • Learning
    • Organisational Audit
  • Lessons Learnt
    • Need for a risk assessment framework if using third party services
    • Importance of planning and writing of blog policy at start of blog lifecycle
    • Useful to consider a combination of approaches rather than just one
    • Value of sharing best practice of blog archiving
  • Questions?
    • Twitter Id: mariekeguy
    • Email:
    • Slides:
    • All resource URLs tagged with ipres2010-blogs: