Approaches to Archiving Professional Blogs Hosted in the Cloud


'Approaches to Archiving Professional Blogs Hosted in the Cloud', a presentation given by Marieke Guy, UKOLN, on September 21, 2010 at the 7th International Conference on Preservation of Digital Objects (iPRES2010), Vienna, Austria. Available at http://www.ukoln.ac.uk/web-focus/papers/pres-2010/paper25/


  1. Approaches to Archiving Professional Blogs Hosted in the Cloud
     • UKOLN is supported by:
     • iPRES 2010, Vienna, Austria, Tuesday, September 21st 2010
     • Marieke Guy, Research Officer, UKOLN
     • www.bath.ac.uk
     • This work is licensed under an Attribution-NonCommercial-ShareAlike 2.0 licence
     • http://www.ukoln.ac.uk/web-focus/papers/pres-2010/paper25/
  2. Introduction to UKOLN
     • UKOLN is a centre of excellence in digital information management, providing advice and services to the library, information and cultural heritage communities
     • Library and cataloguing background
     • Located at the University of Bath, UK
     • Funded by JISC to advise UK HE and FE communities
     • Also project funding, including EU funding
     • Many areas of work including metadata, repositories, dissemination activities, eScience, etc.
     • Digital preservation projects: DRIVER, CEDARS, eBank, JISC Preservation of Web Resources, Beginners Guide, etc.
     • Digital Curation Centre
  3. Why blogs? Why in the Cloud?
     • Ease of creation, ease of use, ease of sharing
     • Increasingly used for reflecting, analyzing, questioning, critiquing, recording, discussing, learning, etc.
     • Very important for information professionals
     • Many dissemination benefits
     • Lack of institutional blogging infrastructure
     • UKOLN supports innovation
     • Cloud is an agile, cost-effective, highly usable way to deliver a service
     • Now have our own institutional service and over 15 blogs
  4. The Professional’s blog
     • Established 2006
     • 750+ posts
     • 240 users per day
     • Personal style
     • Institution vs individual?
  5. The Project blog
     • 2008 - 2010
     • 118 posts
     • 141 comments
     • 6 contributors
     • Professional style
  6. The Event blog
     • June – August 2009
     • 68 posts
     • 3 contributors + guests
     • Video, interviews, photos, discussion
     • Informal/professional style
  7. Why Preserve blogs?
     • Contain useful information
     • Information not available elsewhere
     • Look and feel relevant
     • Cultural significance
     • Reliance on third-party services
     • Blogs disappear (UK HE funding cuts…)
     • ‘Archiving’: ways in which blog content can be migrated to alternative environments in order to satisfy a number of business functions
     • Focus on short-term continuity and management
     • Could comprise part of a preservation strategy
  8. Different Approaches:
     • New Static Master Copy
     • Backup Copy
     • Migration to Another Platform
     • Physical Manifestation
     • Other technical approaches
     • What are the issues with each of these?
     • Image: http://www.flickr.com/photos/mnsc/433436548/
  9. New Static Master Copy
     • Migrate blog to static HTML
     • Point to new static resource
     • IWMW: WinHTTrack static copy
     • Issues:
         • No interactivity
         • Loss of technical architecture, e.g. plugins
         • Loss of other elements, e.g. comments
         • Look and feel
  10. Backup Copy
     • Using XML, using HTML?
     • Where?
     • On the server? On a disc? On an external hard drive?
     • On the same blog platform?
     • ArchivePress
     • On alternate blog platform?
     • JP XML version on Intranet
     • IWMW static version on Intranet
     • Issues:
         • Access
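The "Using XML, using HTML?" question on the Backup Copy slide can be illustrated with a minimal sketch that reads an XML feed and writes a plain HTML rendering of each post. This is not the method used for the case study blogs; it assumes an RSS 2.0 style document, and the sample feed, `parse_posts` and `render_static_page` are hypothetical names introduced only for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real backup would start from the XML
# export or RSS feed offered by the blog platform.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Project Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.org/first-post</link>
      <pubDate>Mon, 02 Jun 2008 09:00:00 +0000</pubDate>
      <description>Hello &amp; welcome.</description>
    </item>
    <item>
      <title>Final post</title>
      <link>http://example.org/final-post</link>
      <pubDate>Fri, 30 Jul 2010 17:00:00 +0000</pubDate>
      <description>This blog is now archived.</description>
    </item>
  </channel>
</rss>"""


def parse_posts(feed_xml):
    """Return one dict per <item> element found in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    posts = []
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "date": item.findtext("pubDate", default=""),
            "body": item.findtext("description", default=""),
        })
    return posts


def render_static_page(post):
    """Render a post as a small standalone HTML page.

    The body is escaped, so any markup inside the feed is shown as text;
    this keeps the sketch safe but loses the original formatting.
    """
    title = html.escape(post["title"])
    return (
        "<!DOCTYPE html>\n"
        "<html><head><meta charset=\"utf-8\">"
        f"<title>{title}</title></head>\n"
        f"<body><h1>{title}</h1>\n"
        f"<p>Originally published: {html.escape(post['date'])} at "
        f"{html.escape(post['link'])}</p>\n"
        f"<p>{html.escape(post['body'])}</p>\n"
        "</body></html>\n"
    )


if __name__ == "__main__":
    for post in parse_posts(SAMPLE_FEED):
        print(render_static_page(post))
```

A copy made this way illustrates the issues listed on the slides: comments, plugins and the look and feel are not carried over unless they are exported separately.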
  11. Migration to Another Platform
     • Live blog to alternate platform
     • Could just be for data mining purposes, if this can’t be done in the current environment
     • UKWF: VOX platform, RSS feeds used, Yahoo Pipes
     • Export feature
     • Issues:
         • Access
         • Loss of technical architecture, e.g. plugins
         • Loss of other elements, e.g. comments
         • Look and feel
  12. Physical Manifestation
     • Create a hard copy print out, e.g. self-publishing
     • Create PDF of site, RSS2PDF
     • UKWF: Lulu self-published book available
     • Purpose specific
     • Issues:
         • Obviously not interactive, but the record is unlikely to degrade like other options
  13. Technical Approaches
     • HTML Scraping
         • HTTrack: static Web site created
     • Third-party Web archiving
         • UK Web Archive
         • Internet Archive
         • Not always complete capture but useful for look and feel
         • URL submitted for case study blogs
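Once a URL has been submitted to a third-party Web archive, it can be useful to check whether a capture exists. The sketch below is one possible way to do this with the Internet Archive's Wayback Machine availability endpoint; it is not part of the original presentation. The helper names (`availability_url`, `closest_snapshot`, `check_blog`) are hypothetical, and the assumed response layout (`archived_snapshots` then `closest`) should be confirmed against the Internet Archive's current documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(blog_url, timestamp=None):
    """Build the query URL for a blog address.

    timestamp is an optional YYYYMMDD string asking for the capture
    closest to that date.
    """
    params = {"url": blog_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the URL of the closest available capture, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def check_blog(blog_url, timestamp=None):
    """Query the service over the network (not exercised in this sketch)."""
    with urlopen(availability_url(blog_url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling `check_blog` with a blog address returns the address of a capture or `None`. As the slide notes, a capture is not always complete, so a result here is a starting point for a manual check of look and feel rather than proof of a full archive.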
  14. Freezing a blog
     • Assessment of status of blog
     • Audit - Get your house in order: links to embeds, comments, spam, etc.
     • Preliminary posts
     • Statistics: dates, posts, comments, spam, contributors, theme, plugins, software, licence etc.
     • Archive page/sidebar widget
     • Final post
     • Indication that blog is archived
     • Close comments
     • Archive blog
     • Image: http://www.flickr.com/photos/plousia/93646438/
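The "Statistics" step in the freezing checklist can be sketched as a small summary computed over post records before the blog is closed. The `Post` record and `blog_statistics` function are hypothetical, and the figures in the usage example are made up; they are not the statistics of the case study blogs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    """Hypothetical minimal record of one blog post."""
    published: date
    author: str
    comments: int


def blog_statistics(posts):
    """Summarise dates, post count, comment count and contributors."""
    if not posts:
        raise ValueError("no posts to summarise")
    dates = [p.published for p in posts]
    return {
        "first_post": min(dates).isoformat(),
        "last_post": max(dates).isoformat(),
        "posts": len(posts),
        "comments": sum(p.comments for p in posts),
        "contributors": len({p.author for p in posts}),
    }


if __name__ == "__main__":
    # Made-up example data.
    example = [
        Post(date(2009, 6, 1), "alice", 3),
        Post(date(2009, 7, 15), "bob", 0),
        Post(date(2009, 8, 20), "alice", 5),
    ]
    print(blog_statistics(example))
```

A summary like this could be placed on the archive page alongside details that cannot be derived from the posts themselves, such as theme, plugins, software and licence.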
  15. The Archive Page
  16. General Issues
     • What constitutes a blog? Content, layout, plugins, comments, tags, images, multimedia, etc.
     • Who owns a blog?
     • Identity, copyright, ownership and licences
     • Privacy
     • Permissions to access blogs belonging to individuals
     • Understandability of pages if out of context
     • Blog policies
     • Availability
  17. Best Practice Checklist
     • Planning
     • Clarification of rights
     • Monitoring of technologies used
     • Auditing
     • Understanding of costs and benefits
     • Identification and implementation of archiving strategy
     • Dissemination
     • Learning
     • Organisational Audit
  18. Lessons Learnt
     • Need for a risk assessment framework if using third party services
     • Importance of planning and writing of blog policy at start of blog lifecycle
     • Useful to consider a combination of approaches rather than just one
     • Value of sharing best practice of blog archiving
  19. Questions?
     • Twitter Id: mariekeguy
     • Email: m.guy@ukoln.ac.uk
     • Slides: http://www.slideshare.net/MariekeGuy
     • All resource URLs tagged with ipres2010-blogs: http://delicious.com/mariekeguy/ipres2010-blogs