UK LOCKSS Alliance: Today’s scholarly content, secured for tomorrow

Published in Education, Technology

Transcript

  • 1. UK LOCKSS Alliance: Today’s scholarly content, secured for tomorrow
      Adam Rusbridge, UK LOCKSS Alliance Coordinator, EDINA, University of Edinburgh
      8th March 2012, Trust and eJournals Workshop, London
  • 2. Summary
      • LOCKSS: Digital equivalent of the physical shelf
      • Sufficient rights to access content as needed
      • Financial control and governance over systems
      • Automate preservation functions where possible
      • LOCKSS provides generic preservation capacity
      • Customise the distributed architecture according to community needs
      • Modeling the total cost of long-term storage
      www.flickr.com/photos/guitarlogy/5387073471/
  • 3. Community Action for Assured Access
      A co-operative organization to ensure continuing sustainable access to scholarly work over the long term.
      UK libraries are collaborating to build national ‘network level’ infrastructure and to coordinate the preservation of electronic material of local and UK interest. (since 2008)
      • 17 member institutions
          • De Montfort University
          • King’s College London
          • London School of Economics
          • Natural History Museum
          • Open University
          • Royal Holloway, University of London
          • University of Birmingham
          • University of Edinburgh
          • University of Glasgow
          • University of Hertfordshire
          • University of Huddersfield
          • University of Newcastle Upon Tyne
          • University of Oxford
          • University of Salford
          • University of St. Andrews
          • University of Warwick
          • University of York
      • Steering Committee directs activity (next meeting May 2012)
          • Phil Adams (De Montfort University)
          • Lisa Cardy (London School of Economics)
          • Geoff Gilbert (University of Birmingham)
          • Tony Kidd (University of Glasgow)
          • Liz Stevenson (University of Edinburgh)
          • Lorraine Estelle (JISC Collections)
          • Peter Burnhill (EDINA)
          • Adam Rusbridge (EDINA)
      Support Service at EDINA provides underlying coordination, support and development
      JISC Collections organises membership subscriptions and gives guidance and support
      JISC prompted the initial project led by the Digital Curation Centre (2006-08)
  • 4. Technical Infrastructure
      - Preserves content as published
          - Preserve the record: web archiving
          - Fetches content from a server
      - Preserves integrity
          - Audit protocol to prevent damage
          - Tamper resistant
      - Avoids single point of failure
          - Distributed network to avoid points of failure
          - Model on success of print collections (and operation of the library)
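The audit idea on the Technical Infrastructure slide (peers comparing their copies so that damage is detected and repaired) can be illustrated with a minimal Python sketch. This is not the LOCKSS polling protocol itself, which is considerably more elaborate; the function names `digest` and `audit` and the peer names are assumptions made for illustration. Each peer hashes its copy, the majority digest is treated as correct, and any peer that disagrees is flagged so it can be repaired from a peer in the majority.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one peer's copy."""
    return hashlib.sha256(content).hexdigest()


def audit(copies: dict[str, bytes]) -> tuple[str, list[str]]:
    """Compare the copies held by several peers.

    Returns the majority digest and the sorted names of peers whose copy
    disagrees with it. Raises ValueError when no digest has a strict majority.
    """
    digests = {peer: digest(data) for peer, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        raise ValueError("no majority: cannot tell which copy is intact")
    damaged = sorted(peer for peer, d in digests.items() if d != winner)
    return winner, damaged


# Hypothetical peers; one copy has suffered a corrupted byte.
copies = {
    "library_a": b"article text",
    "library_b": b"article text",
    "library_c": b"article tex\x00",
}
majority, damaged = audit(copies)
print(damaged)  # ['library_c']
```

With more peers holding copies, more of them would have to be damaged in the same way before a bad copy could win the vote, which is the intuition behind avoiding a single point of failure.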
  • 7. MetaArchive
      • A distributed digital preservation solution depends on a collaborating set of institutions agreeing to preserve each other’s content.
          • Requires central coordination; shared enthusiasm, resources and benefit
          • Successful models initiated where community / shared need already in place.
      • MetaArchive is a cooperative, not a vendor (conceived 2004)
          • Goal is not to make profits, but to improve each member’s situation.
          • Distribute across geography: diversify funding, politics, economy
          • Replicate content, lower barriers to entry
      • Educopia Institute: non-profit administrative organisation
          • Coordination role; arrange legal agreements and commitments to preserve member content
          • Sustained by affordable cooperative membership fees set by members
          • Supplemented by grants and contracts
  • 8. Costs
      • Equipment
          • Each institution required to contribute a server to the network
          • As of June 2011: $4,600 for a 16TB machine
      • Staffing
          • 2% of a systems administrator’s time
          • Administrator/point of contact
          • Software engineer who preps content for ingest
          • (latter two roles needed for outsourced solution).
      • Storage
          • $1.00/GB/year for content stored in the network.
          • ‘Conspectus’ to organise where content is stored
  • 9. Tiered Membership
      • Sustaining Members: $5,500/year
          • Leadership, development, governance
      • Preservation Members: $3,000/year
          • Benefit from shared preservation model
      • Collaborative Members: (varies, but e.g. $4,000/year for 20)
          • Consortia that share a server, and so look like one organisation.
          • Allows existing consortia to preserve co-hosted content for a fraction of what it would cost to do so as individual members.
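As a rough illustration of how the MetaArchive figures on the Costs and Tiered Membership slides might combine, the sketch below adds the server purchase, the annual fee and the per-GB storage charge for a single member. It assumes the storage charge is billed on top of the membership fee, and it ignores staffing because the slides give no salary figure. The function name `member_cost` and the 500 GB example are hypothetical.

```python
SERVER_COST = 4_600          # $ for one 16TB machine (June 2011 figure)
STORAGE_PER_GB_YEAR = 1.00   # $/GB/year for content stored in the network
FEES = {"sustaining": 5_500, "preservation": 3_000}  # $/year


def member_cost(tier: str, stored_gb: float, years: int) -> float:
    """Estimated spend over `years`: one server plus yearly fee and storage.

    Assumption: storage is charged in addition to the membership fee.
    """
    yearly = FEES[tier] + stored_gb * STORAGE_PER_GB_YEAR
    return SERVER_COST + years * yearly


# Hypothetical Preservation Member storing 500 GB for three years.
print(member_cost("preservation", stored_gb=500, years=3))  # 15100.0
```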
  • 10. PLNs in the UK: Member Survey
      • Share resources & responsibility, build community, keep costs low
          • Preservation policy
          • Content and Collections
          • Organisational architecture
          • Costs and Resources
  • 11. Initial Conclusions
      • Survey response rate: 50% of members
      • Institutions seeking affordable solutions to digital preservation.
          • e-preservation strategies have yet to be developed.
      • Extent of digital assets requiring preservation unknown
          • Systematic audits have not yet been carried out.
      • Prefer architecture where content is stored at more than one location
          • However a fully distributed approach was not favoured.
      • Mixed enthusiasm for a PLN
          • Need to demonstrate PLN is low-cost and sustainable
          • Need clear and demonstrable financial benefits
          • Need a shared interest in preserving a particular body or type of content.
          • Difficult to gain acceptance and commitment without these benefits
      • Moving forward: establish a UK PLN, or join the MetaArchive as a Collaborative Member?
  • 12. Pricing of Long-term Cloud Storage
      • David Rosenthal has been looking at cost models for long-term storage: http://blog.dshr.org
      • Does it make economic sense to store data in the cloud over the long term?
      • Kryder’s Law: a 30-year history of exponential increase in disk capacity at roughly constant cost.
      • The cost of storing bits for the long term depends on current price and how fast it is dropping.
      • How long can we expect Kryder’s Law to continue?
      • Indications that Kryder’s Law is slowing down
          • 4TB disks now available, but more slowly than expected.
          • Driver for 3.5" disks has been desktop PCs. Volume market is now 2.5" disks: same curve but higher price/byte.
          • By 2020, ought to have 14TB 2.5" drive @ $40
          • Consumers may prefer a 2TB 1" drive for $15 and less power draw

      Year    Cost per GB
      2002    $3.98
      2004    $1.25
      2006    $0.64
      2008    $0.29
      2010    $0.08?
      2012    $0.06?
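The point that long-term cost depends on both the current price and how fast it is dropping can be made concrete with a short sketch. It assumes a price that is paid afresh each year (as with a rental, or a yearly media refresh) and that falls by a constant fraction every year, so the total is a geometric series. This constant-rate model and the function name `cumulative_cost` are illustrative assumptions, not Rosenthal's own model. The 35% rate is roughly what the 2002 and 2008 figures in the table imply; the 10% rate is a hypothetical slowdown.

```python
def cumulative_cost(price_now: float, decline: float, years: int) -> float:
    """Total paid per GB over `years` when the yearly price falls by `decline`.

    Year t costs price_now * (1 - decline) ** t, for t = 0 .. years - 1.
    """
    return sum(price_now * (1 - decline) ** t for t in range(years))


START_PRICE = 0.29  # $/GB, the 2008 figure from the table

# Roughly the historical rate versus a hypothetical slower decline.
historical = cumulative_cost(START_PRICE, 0.35, 20)
slowed = cumulative_cost(START_PRICE, 0.10, 20)

print(f"20 years at 35%/yr decline: ${historical:.2f} per GB")
print(f"20 years at 10%/yr decline: ${slowed:.2f} per GB")
```

Under these assumptions the same starting price leads to a total several times higher when the decline slows, which is why a slowdown in Kryder's Law matters for preservation budgets.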
  • 13. Cloud Storage Price History

      Provider         Price (Year of Launch)    Current Price    % decrease
      Amazon S3        $0.15/GB/mo (2006)        $0.125/GB/mo     3%/yr
      RackSpace        $0.15/GB/mo (2008)        $0.15/GB/mo      0%/yr
      Windows Azure    $0.15/GB/mo (2009)        $0.14/GB/mo      3%/yr

      http://blog.dshr.org/2012/02/cloud-storage-pricing-history.html
      • Price of cloud storage is dropping around an order of magnitude more slowly than raw disk prices.
      • There is a recurrent cost for storage in the cloud. As collections grow, will the cost of cloud storage grow more than if performed locally?
      • Research to model total costs over time: local hardware, maintenance, location, power, bandwidth, staffing.
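The % decrease column can be approximated from the launch and current prices. The sketch below assumes whole calendar years between launch and 2012, which is coarser than whatever dates lie behind the slide's figures, so the Azure result in particular comes out somewhat lower than the quoted 3%/yr. The names `PRICES`, `CURRENT_YEAR` and `annual_decrease` are illustrative.

```python
# (provider, launch year, launch price, current price), prices in $/GB/month.
PRICES = [
    ("Amazon S3", 2006, 0.15, 0.125),
    ("RackSpace", 2008, 0.15, 0.15),
    ("Windows Azure", 2009, 0.15, 0.14),
]
CURRENT_YEAR = 2012  # year of the presentation


def annual_decrease(launch_price: float, current_price: float, years: int) -> float:
    """Constant yearly rate of decrease that takes launch_price to current_price."""
    return 1 - (current_price / launch_price) ** (1 / years)


rates = {
    name: annual_decrease(launch, current, CURRENT_YEAR - year)
    for name, year, launch, current in PRICES
}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%} per year")
```

Even the largest of these rates is far below the yearly decline implied by the raw disk prices on the previous slide.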
  • 14. Cost to preserve 8TB
      • Starting with S3’s current pricing and assuming that it continues to drop at 3%/yr, the total cost over 4 years would be $41,065.
      • DIY: 3 geographically separate complete copies, each protected against double disk failures.
      • Three Drobo FS network file servers ($600 each at Amazon) populated with five 3TB Hitachi 5400RPM drives ($210 each at Newegg). Add one spare for each Drobo to cover while failed drives are returned under warranty. Capital cost of $5580.
      • Each Drobo consumes ~70W with all drives active. So we’d consume 1840 kWh over 4 years. Palo Alto Green rates, cost of $250.
      • Stanford experience with Drobos is that almost no attention is needed, but assume staff costs at $50/hr for 1hr/mo/box = $7200.
      • The total cost over 4 years would be $13,030: around a third of the total cost of S3.
      http://blog.dshr.org/2012/02/cloud-storage-pricing-history.html
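The DIY total on this slide can be checked with a few lines of arithmetic. The sketch below recomputes the capital and staff costs from the quoted unit prices and takes the $250 power cost and the $41,065 S3 estimate from the slide as given rather than recomputing them. Note that three 70 W units running continuously would draw about 1,840 kWh per year, so the power figure may understate the four-year draw. All constant names are illustrative.

```python
UNITS = 3                     # geographically separate copies
DROBO_PRICE = 600             # $ per Drobo FS
DRIVE_PRICE = 210             # $ per 3TB drive
DRIVES_PER_UNIT = 5 + 1       # five active drives plus one spare
POWER_COST = 250              # $, taken from the slide as given
STAFF_RATE = 50               # $ per hour
STAFF_HOURS_PER_UNIT_MONTH = 1
YEARS = 4
S3_TOTAL = 41_065             # slide's 4-year S3 estimate, not recomputed here

capital = UNITS * (DROBO_PRICE + DRIVES_PER_UNIT * DRIVE_PRICE)
staff = STAFF_RATE * STAFF_HOURS_PER_UNIT_MONTH * 12 * YEARS * UNITS
total = capital + POWER_COST + staff

print(capital, staff, total)  # 5580 7200 13030
print(f"DIY is {total / S3_TOTAL:.0%} of the S3 estimate")
```

Staffing is the largest single item in this breakdown, so the comparison is most sensitive to the assumed one hour per box per month.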
  • 15. Principles of LOCKSS: Building Trusted Archives
      • LOCKSS software can be used to provide general, shared preservation capacity
      • Responsibility spread across the community
      • Shepherded by strong universities with strong collection policies
      • Further assessment of UK Private LOCKSS Networks
      • Model selected depends on scale of content & community enthusiasm
      • Further assessment to understand the total cost of storage
      www.flickr.com/photos/guitarlogy/5387073471/
  • 16. Find out more…
      http://www.lockssalliance.ac.uk
      a.rusbridge@ed.ac.uk
      @EDINA_eJournals