
Multi-organizational frameworks for digital information sustainability


  1. Cherie Edmonds, Catherine Hrbal, Chisheng Li
  2. Main Themes
     • York: challenges in developing a large-scale digitization effort, looking at HathiTrust
     • HathiTrust Website: current governance and cost of digital preservation efforts
     • Walter & Skinner: creates incentive by looking at the costs of NOT preserving; MetaArchive Cooperative as a community-operated model
     • Blue Ribbon Task Force Report, Chapters 4 & 5: challenges behind a digital preservation project; discusses incentives and gives simple recommendations and pragmatic small steps for getting a project running
  3. Important Terms
     • Barriers to Entry - tangible costs and challenges to start a project
     • Benign Neglect - choosing not to focus on preservation today
     • Contributing Partner - contributes content and pays infrastructure costs for deposited content
     • Committee on Institutional Cooperation (CIC) - includes the Big Ten schools and the University of Chicago
     • Economy of Scale - cost advantage that lowers both the average and marginal costs of preservation in a large repository
     • MetaArchive Cooperative - community-owned and operated distributed digital preservation network
     • Misaligned Incentives - each participant in a transaction has their own incentives to act; the parties' incentives are not the same, and sometimes they conflict
  4. Important Terms cont.
     • Negative Benefits - looking at the cost of NOT preserving as an incentive to do digital preservation
     • Non-Exclusive License - rights of authors to deposit publications into third-party repositories
     • Sustaining Partner - participates in curation and management, but does not necessarily contribute content
     • Trusted Digital Repository - a repository certified under TRAC or DRAMBORA, whose criteria are based on metadata and formatting standards and best practices
     • Uncertain Future Value - intangible long-term benefits and costs; future benefits cannot be gauged today
     • Zero-Sum Activity - time and money invested in preservation are taken directly from other activities
  5. Blue Ribbon Task Force
     • There is a disruption of roles and responsibilities among players, resulting from the non-rivalrous nature of digital information
     • Chapter 4 outlines four types of digital information, with the challenges and proposed recommendations for each
  6. Scholarly Discourse
     Misaligned incentives
     • Publishers
       o have high incentives to preserve these materials
       o have shown little resistance to participating in dark archive models
     • Authors/Creators
       o should stipulate a perpetual, non-exclusive license to their works
       o collective bargaining to secure these rights
       o individual use of these licenses could lower barriers to preservation of emerging literature
     • Libraries
       o mediation
  7. Question: Do you think collective bargaining would be effective to help lower barriers to entry for emerging authors? Do you think this tactic is feasible?
  8. Free rider problem
     • large startup costs create barriers to entry
     • no one wants to be the first mover
     Uncertain future value
     • secondary and tertiary uses of the information
     Funding: the current models exclude a significant portion of the scholarly community
     • smaller publishers
     • under-resourced fields
     • independent scholars and the commercial sector
  9. Action Agenda for Scholarly Discourse
     1. Libraries, scholars, and professional societies should develop selection criteria for emerging genres in scholarly discourse, and prototype preservation and access strategies to support them.
     2. Publishers reserving the right to preserve should partner with third-party archives or libraries to ensure long-term preservation.
     3. Scholars should consider granting nonexclusive rights to publish and preserve, to enable decentralized and distributed preservation of emerging scholarly discourse.
     4. Libraries should create a mechanism to organize and clarify their governance issues and responsibilities to preserve monographs and emerging scholarly discourse along lines similar to those for e-journals.
     5. All open-access strategies that assume the persistence of information over time must consider provisions for the funding of preservation.
  10. Research Data
      • Research data vary enormously
      • Often a need to preserve ancillary materials, such as lab notebooks
      • Secondary uses of public research data suggest new users willing to support long-term access to the data
      • Preservation societies and other proxy organizations can play crucial roles in selection for preservation
  11. • In grant-funded research, preservation is framed as a zero-sum game
      • Imposition of mandates will strengthen incentives
        o clear allocation of funds (via a portion of the grant)
        o clear selection criteria
      • Funders should be seeding capacity
      • Subscription models help mitigate the free-rider problem
      • There should be agreements in place between the data community and third-party archives
  12. Action Agenda for Research Data
      1. Each domain, through professional societies or other consensus-making bodies, should set priorities for data selection, level of curation, and length of retention.
      2. Funders should impose preservation mandates, when appropriate. When mandates are imposed, funders should also specify selection criteria, funds to be used, and responsible organizations to provide archiving.
      3. Funding agencies should explicitly recognize “data under stewardship” as a core indicator of scientific effort and include this information in standard reporting mechanisms.
      4. Preservation services should reduce curation and archiving costs by leveraging economies of scale when possible.
      5. Agreements with third-party archives should stipulate processes, outcomes, retention periods, and handoff triggers.
  13. Commercially Owned Cultural Content
      • Who owns it? Misalignment between owners and controllers of digital content arises in almost every case
      • This creates widespread disruption of the business models that provide the primary incentives for commercial owners to preserve
  14. Must strengthen the rights of preserving institutions by revising copyright law
      o mandate deposit of copyrighted electronic content into authorized public institutions to secure its long-term preservation
      o provide incentives directly to private owners of cultural assets to preserve on the public's behalf
      o commercial sponsorship of preservation activities and public-private partnerships
      o stewardship organizations should begin selecting privately held materials of significant cultural value
  15. Question: Could commercial sponsorship of preservation activities be feasible? What might some of the tradeoffs be?
  16. Action Agenda for Commercially Owned Cultural Content
      1. Leading cultural organizations should convene expert communities to address the selection and preservation needs of commercially owned cultural content and digital orphans.
      2. Regulatory authorities should bring current requirements for mandatory copyright deposit into harmony with the demands of digital preservation and access.
      3. Regulatory authorities should provide financial and other incentives to preserve privately held cultural content in the public interest.
      4. Leading stewardship organizations should model and test mechanisms to ensure flexible long-term public-private partnerships that foster cooperative preservation of privately held materials in the public interest.
  17. Collectively Produced Web Content
      • No clarity about what specific content should be collected
      • Institutions that are already crawling the web should provide leadership to others
      • Collective content may be a composite of linked products with compound rights within them
        o Bloggers may use some sort of license to clarify whether they want their material archived
        o Provide incentives for the hosting sites to preserve
        o Develop partnerships between hosting sites and stewardship institutions
        o Grant stewardship institutions the legal authority to crawl the web for preservation purposes
  18. • Create public policies and/or partnerships to enable grassroots efforts at preservation
      • Collective action will be needed to secure these assets
        o public funding
        o public mandates
  19. Action Agenda for Collectively Produced Web Content
      1. Leading stewardship organizations should convene stakeholders and experts to address the selection and preservation needs of collectively produced Web content.
      2. Creators, contributors, and host sites could lower barriers to third-party archiving by using a default license to grant nonexclusive rights for archiving.
      3. Regulatory authorities should create incentives, such as preservation subsidies, for host sites to preserve their own content or seek third-party archives as preservation partners.
      4. Regulatory authorities should take expeditious action to reform legislation to grant authority to stewardship institutions to preserve at-risk Web content.
      5. Leading stewardship organizations should develop partnerships with one or more major content providers to explore the technical, legal, and financial dimensions of long-term preservation.
  20. Blue Ribbon Chapter 5
      • Which digital content to preserve, for how long, and for what use?
      • Who should be in charge?
      • How to secure funding and resources?
      • How to determine the return on investment?
      Necessary conditions for sustainable digital preservation:
      1. recognition of the benefits of preservation
      2. choosing the materials that have long-term value
      3. incentives to act in the public interest
      4. appropriate governance to oversee the activities
      5. ongoing effort to preserve
      6. timely actions to ensure access
  21. Principles of action
      1. Create a contingency plan for preservation actions in advance
         • prevent the risk of losing digital assets, and entrust the materials to a responsible party
         • set up mechanisms (e.g. MOUs) to prompt regular review of preservation priorities
      2. Argue for the need to invest in preservation
         • emphasize the gains from possible use of digital assets, especially short-term gains
         • also argue from the cost of not preserving the assets, e.g. losing clinical trial data
         • argue for potential benefits that will trickle to multiple stakeholders
      3. Strengthen weak incentives, align incentives when facing a diverse stakeholder community, and generate incentives when none exist
  22. Principles of action (continued)
      4. Prioritize the digital collections based on projected future use
         • careful selection of which digital assets to save, especially materials of greatest use to present & future stakeholders
         • the decision to preserve now need not be a permanent or open-ended commitment of resources over time
      5. Stakeholders' roles & responsibilities should be transparent & accountable
         • organizations should have clear policies that specify their roles, responsibilities, and procedures
         • collective interest must be aggregated, and the effort & the cost must be appropriately apportioned
      6. Funding models must fit the community norms
         • digital assets need not always be a public good
         • funding models should be flexible enough to adjust to disruptions over time; create an economy of scale whenever possible (especially for scientific data & cultural assets)
  23. Near-term priorities
      Organizational action
      • form public-private partnerships
      • ensure organizations have the required expertise
      • achieve economies of scale & scope
      • address the free-rider problem
      Technical action
      • build capacity to support stewardship
      • reduce preservation cost
      • operationalize an option strategy for all types of digital material
      Public Policy action
      • ease copyright laws to facilitate digital preservation
      • generate incentives for private entities to preserve on behalf of the public
      • sponsor public-private partnerships
      • empower stewardship organizations to avert loss of digital orphans
      Public Outreach action
      • provide training in curatorial skills
      • educate the public on the urgency of preserving digital assets
  24. York and HathiTrust Website
      • Looks at the development of the HathiTrust
      • Challenges of a large-scale digital preservation initiative
      Establishment and Purpose
      • Google
      • Members
      • Preservation Goals
  25. Question: Looking at these areas, what are some of the major challenges you think might come about in developing a digital preservation initiative?
      The Challenges:
      • Governance
      • Finance
      • Repository
      • Services
  26. Challenges in Governance
      Types of BAD collaboration
      o "Goal Drift"
      o No buy-in from administrative bodies
      Tension
      o Perception that collaboration will limit independence of participants
      o Fear of slow decision-making process
      Solution: HathiTrust Governance
      o Executive Committee
      o Strategic Advisory Board
      o Constitutional Convention
      o Voluntary Membership
  27. Challenges in Finance
      Funding Downfalls
      o Voluntary Membership - potential dissolution of the partnership
      o Minimal funding sources
      o No long-term plan beyond 5 years
      Solution:
      o Formal evaluation at the 3-year mark
      o Will develop a succession and multi-year funding plan
      o Different levels of partnership
  28. Challenges with the Repository
      Trusted Environment
      o Trusted Digital Repository Certification did not exist
      o Time and cost of certification
      Collaborative Development
      o Discovering redundancy
      o What version is the right version?
      Solution:
      o Certification, Standards, and Best Practice
      o Implication of having a unified digital repository
  29. Challenges with Services
      Basic Access
      o Print-disabled users
      o Compliance with accessibility standards
      Search
      o No interface for searching
      o No comparable models for searching across institutions of this magnitude
      Extended Capabilities
      o Integration with software/primary source collections
      o Print-on-demand
      o Inter-institutional authentication and security
  30. Critical Observations
      • Governance: duties of new governing bodies not explained in much detail
      • Finance: failed to look at funding sources outside the partnership
      • Repository: cost of long-term preservation sustainability
  31. Walter - Cost of Not Preserving
      Cost of digital preservation = 'benign neglect'; cultural institutions either choose to preserve today, or defer the preservation to tomorrow
      • benign neglect misses the fact that digital assets are vulnerable & storage media are unstable
      1. Cultural cost:
         • intangible cost of a narrow understanding of our cultures & histories by current & future generations
      2. Political cost:
         • loss of resources & documentation essential for understanding local, state, national, & international developments
      3. Scientific cost:
         • loss of data for all areas of research needed for academic advancement
      Libraries that start digitization and content creation efforts early will benefit from better acquisitions, more users, higher-quality users & financial resources
      • increasing their prestige & bottom line
  32. The MetaArchive Cooperative Model
      • Founded in 2003 as a community-owned & operated digital preservation network
      • Cooperative model: all members contribute money, staff, technology & space
        o reduces cost for all cooperating parties & increases the sense of joint ownership
        o expanding membership fees and cooperative-oriented staffing replace initial public funding from the Library of Congress
      • Adopts LOCKSS software: all members host servers within their institution, but are connected in a peer-to-peer network
        o avoids a central cache
  33. MetaArchive Members
      • >50 institutions in 13 states & 4 countries
  34. MetaArchive Cost
      1. Establish the 1st private LOCKSS network (with NDIIPP funding)
      2. Transform into a sustainable 501(c)(3) charitable organization (with NHPRC & NDIIPP funding)
      3. Provide ongoing preservation training & services to the cultural community (with membership & consulting fees)
      Cost components are mainly expert personnel
      1. collaborative relationship-building
      2. planning & policy making
      3. staff training
      4. selecting & implementing network systems
      5. developing & maintaining software
      6. selecting digital assets for preservation
      7. documenting the digital assets
      8. preparing the assets for the preservation network
      9. assessing and monitoring the assets in the network
      10. infrastructure
  35. Current cost
      Basic costs:
      • Equipment = $4,600 for a server
      • Staffing = 2% of a systems administrator’s time, software engineer
      • Storage = $1/GB/year for network storage
      Membership fees
      • Sustaining members = $5,500/year, typically lead institutions in the field
      • Preservation members = $3,000/year, mainly participants & beneficiaries
      Sample costs for an institution that wants to preserve 2 TB:
      • Sustaining Member: [$5,500 (membership) + $2,000 (space)] x 3 years + $4,600 (server) = $27,100 over 3 years, or $9,033/year
      • Preservation Member: [$3,000 (membership) + $2,000 (space)] x 3 years + $4,600 (server) = $19,600 over 3 years, or $6,533/year
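
The sample costs on the last slide can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name, parameter names, and the choice of 1 TB = 1,000 GB are assumptions (the decimal terabyte is what makes 2 TB at $1/GB/year equal the slide's $2,000 space figure), and the server is treated as a one-time purchase spread over the three-year period, as the slide's totals imply.

```python
# Assumption: decimal terabytes, so 2 TB at $1/GB/year is $2,000/year,
# matching the "space" figure on the slide.
GB_PER_TB = 1000


def membership_cost(membership_per_year, storage_tb,
                    storage_per_gb_year=1.0, server_cost=4600, years=3):
    """Return (total cost over the period, average cost per year).

    Hypothetical helper: recurring costs (membership + storage) are paid
    every year, and the server is bought once.
    """
    space_per_year = storage_tb * GB_PER_TB * storage_per_gb_year
    total = (membership_per_year + space_per_year) * years + server_cost
    return total, total / years


sustaining_total, sustaining_per_year = membership_cost(5500, 2)
preservation_total, preservation_per_year = membership_cost(3000, 2)

print(f"Sustaining member:   ${sustaining_total:,.0f} over 3 years, "
      f"${sustaining_per_year:,.0f}/year")
print(f"Preservation member: ${preservation_total:,.0f} over 3 years, "
      f"${preservation_per_year:,.0f}/year")
```

Changing `storage_tb` or `years` shows how the one-time server purchase weighs less on the yearly average as the commitment lengthens, while membership and storage scale linearly.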
