iPRES 2011: The Costs and Economics of Preservation
1. The Costs and Economics of Preservation
Objectives –
• To introduce and describe some of the work that has been done to
help institutions and research groups understand both the costs and
the economics of preservation
• To describe ongoing phases of JISC-funded work that are attempting
to further advance understanding and implement approaches in this
area
• To give some indication of where collective international effort may be
of universal benefit.
Neil Grindley – JISC Programme Manager (Digital Preservation)
2. The LIFE Project
University College London (UCL) and British Library (BL)
To develop a methodology to model the digital lifecycle and calculate the costs of
preserving digital assets over a period of years. http://www.life.ac.uk/
3 phases of work
LIFE 1 (2005-2006) – A review of existing models to produce a 6 stage digital object
lifecycle model; incorporating a generic preservation model; and 3 test case studies.
Web Archiving (BL); e-Journals (UCL); Voluntary deposited electronic publications (BL)
LIFE 2 (2007-2008) – 3 further case studies: Digitised newspapers (BL); SHERPA-LEAP
repositories; SHERPA DP digital preservation services. Model refinements. An
independent economic review. Analysis of paper vs. digital costs.
LIFE 3 (2009-2010) – Further refinements. Another case study. Storage costs survey.
Development of a web-based tool based on the LIFE model spreadsheet (with HATII –
University of Glasgow/DCC)
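As a rough illustration of the kind of calculation a lifecycle costing methodology supports, the sketch below sums one-off and recurring costs per stage over a number of years. It is not the LIFE model itself: the stage names, the `StageCost` class, the `lifecycle_cost` function and all figures are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class StageCost:
    name: str        # hypothetical stage label, not taken from the LIFE model
    one_off: float   # cost incurred once for this stage
    per_year: float  # recurring cost for every year the asset is kept


def lifecycle_cost(stages, years):
    """Total cost across all stages for the given number of years."""
    return sum(s.one_off + s.per_year * years for s in stages)


# Made-up figures, for illustration only
stages = [
    StageCost("acquisition", one_off=500.0, per_year=0.0),
    StageCost("ingest", one_off=300.0, per_year=0.0),
    StageCost("storage", one_off=0.0, per_year=120.0),
    StageCost("access", one_off=0.0, per_year=80.0),
]
print(lifecycle_cost(stages, years=5))
```

Keeping one-off and recurring costs separate per stage makes it easy to see which stages dominate as the retention period grows.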
4. Piloting the LIFE Costs Tool in UK HEIs - 2011
HATII (DCC) – University of Glasgow
http://www.dcc.ac.uk/projects/life
UK Higher Ed Institutional repositories were invited to take part in two strands of
activity:
• Review the LIFE web tool and provide feedback via a survey
Survey Questions
• Keep an activity journal for a month to assist with evaluation of the LIFE model
An Activity Journal
5. Piloting the LIFE Costs Tool in UK HEIs - 2011 (HATII/DCC)
Pilot Participant feedback …
The Tool and the Interface
• Split opinions about usability (3 yes, 4 no)
• Slow and uninformative interface
• User interface layout and procedure issues
• Apparent figure rounding errors
• Lack of information about fields & units
• ‘Refine’ pages difficult to use
• Alternative ways of grouping values req’d
The LIFE Model
• Inability to deal with mixed content
• ‘Video’ content type missing
• Difficult to assess accuracy
• ‘Basic input’ page too basic
• ‘Refine’ pages too detailed
• Apparent figure rounding errors
• Lack of information about fields & units
• Alternative ways of grouping values req’d
6. Piloting the LIFE Costs Tool in UK HEIs - 2011 (HATII/DCC)
Pilot Participant feedback …
Two key aspects were particularly identified as parameters that users needed to be able
to modify easily:
• Staffing
• Infrastructure and policy (specifically – storage and backup)
Also good to have …
• Costs over time as well as across the lifecycle on the output page (costs vary over the
course of a project)
• More visible inflationary factors
• Some form of graphical representation on the output page
• Reporting functionality
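The request for costs over time with visible inflationary factors can be sketched as a year-by-year breakdown of the two parameters users most wanted to modify. This is a minimal sketch under stated assumptions, not the LIFE tool's calculation: the function name `costs_over_time`, the single flat inflation rate applied to both cost lines, and all figures are hypothetical.

```python
def costs_over_time(staff_per_year, storage_per_year, years, inflation=0.0):
    """Return one row per year with staffing, storage and total costs.

    Assumes (for illustration) that one flat inflation rate applies to
    both cost lines, compounding annually from year 1.
    """
    rows = []
    for year in range(years):
        factor = (1 + inflation) ** year  # year 1 is uninflated
        staff = staff_per_year * factor
        storage = storage_per_year * factor
        rows.append({
            "year": year + 1,
            "staffing": staff,
            "storage": storage,
            "total": staff + storage,
        })
    return rows


# Made-up figures, for illustration only
for row in costs_over_time(1000.0, 200.0, years=3, inflation=0.05):
    print(row["year"], round(row["total"], 2))
```

Exposing the inflation rate as an explicit parameter is one way of making that factor visible to the user instead of burying it in the model.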
7. Piloting the LIFE Costs Tool in UK HEIs - 2011 (HATII/DCC)
Pilot Participant feedback …
Activity Journals
CONSTRAINTS
• Very small number of users – short timescale (1 month) … inadequate assessment of a
highly complicated series of activities.
• Journals only covered staff effort and not capital costs
• Extensive use was made of the ‘other activity’ fields
• Participants generally didn’t add any notes about their logged activities
• Difficult to map the specified lifecycle phases onto the LIFE model
• Institutions dealing with a variety of content (rather than a homogeneous collection)
• Participants not necessarily dealing with ALL repository activity
• Staff costs not entered coherently across all participants
8. Piloting the LIFE Costs Tool in UK HEIs - 2011 (HATII/DCC)
Pilot Participant feedback …
Activity Journals
Journals were returned by 12 repository staff from 3 different repositories. They recorded
activity data over the course of a month. Activity was mapped against an adapted version
of the UKRDS Responsibilities Spreadsheet
Research Life Cycle Phase: Idea/Study Concept/Design (ANDS Verb: Conceptualise)
Responsibilities:
• Write the data plan / responsibility for meeting good standards practice
• Aid in experimental design and planning (and execution, contributing own insights)
• Conceptualisation of data
• Other Idea/Study Concept/Design activity
Research Life Cycle Phase: Funding
Responsibilities:
• Advice on funder requirements
• Other Funding activity
Research Life Cycle Phase: Research Activity: Data Gathering/Collection (ANDS Verb: Create/receive)
Responsibilities:
• Metadata creation, its format, documentation etc.
• Set internal data management policy
• IPR, legal issues
• Gathering data
• Other Research Activity: Data Gathering/Collection activity
8 categories in total …
44 activities in total
11. Piloting the LIFE Costs Tool in UK HEIs - 2011 (HATII/DCC)
46% of activity was categorised as ‘other’ work … i.e. not specifically categorisable by
the chosen schema … and subsequently not easily mapped onto any particular part
of the LIFE Model.
The principal activity that was significant and could be mapped was …
• Metadata creation, its format, documentation, etc.
- Included in the ‘Ingest’ section of the LIFE Model
The second most significant specific activity was …
• Aid in experimental design and planning (and execution, contributing own
insights)
- But this couldn’t be mapped … this is more to do with the ongoing work
required of a repository officer: LIFE is optimised to focus on a single
bounded project
Other items on the list were also of this type
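A figure such as the share of activity logged as ‘other’ comes from totalling journal entries per category and dividing by the overall total. The sketch below shows that aggregation under stated assumptions: the `(category, hours)` entry format, the function name `activity_shares` and the sample entries are hypothetical and are not the pilot's data.

```python
from collections import defaultdict


def activity_shares(entries):
    """Return each category's share of total logged hours (0.0 to 1.0).

    `entries` is an iterable of (category, hours) pairs; this format is
    an assumption for illustration, not the pilot's journal layout.
    """
    totals = defaultdict(float)
    for category, hours in entries:
        totals[category] += hours
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: hours / grand_total for category, hours in totals.items()}


# Made-up journal entries, for illustration only
journal = [
    ("Other", 4.0),
    ("Metadata creation", 3.0),
    ("Other", 1.0),
    ("Experimental design", 2.0),
]
for category, share in activity_shares(journal).items():
    print(f"{category}: {share:.0%}")
```

A high share under a catch-all category is a sign that the schema, rather than the logged work, needs revisiting before the results can be mapped onto a cost model.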
12. Piloting the LIFE Costs Tool in UK HEIs - 2011 (HATII/DCC)
Recommendations & Conclusions
The architecture of the Web Tool and its database dependency need careful consideration
The user interface of the Web Tool needs to be de-coupled from the Spreadsheet Tool
Alternative ways of displaying outputs should be considered
More consideration needed for how users might want to modify details of the model
Make the various economic factors influencing estimates more visible
Include provision for expressing the maturity of an organisation & its existing resources
Currently, the LIFE Model is only really applicable to a certain type of project
The LIFE Model will require ongoing maintenance, data input and refinement
And lastly …
This type of work is VERY difficult and still represents a huge challenge for
institutions and for funders
13. Keeping Research Data Safe (KRDS)
Charles Beagrie Ltd. & various partners …
To extend previous work on digital preservation costs, but focus on research data. To
identify long-lived datasets for the purpose of costs analysis.
http://www.jisc.ac.uk/publications/reports/2008/keepingresearchdatasafe.aspx
Various phases of work
KRDS1 (2008) – List of key cost variables and units of record. Activity model. Major cost
categories. Case Studies
KRDS2 (2009) – Survey of cost information. Refined model. Benefits framework.
KRDS Dissemination (2010) – Fact sheet, user guides, summary activity model
I2S2/KRDS (2011) – Integration of the KRDS Benefits framework with the I2S2 Value
Chain Analysis Model
http://www.beagrie.com/krds.php
14. MAIN PHASES AND ACTIVITIES OF KRDS2 ACTIVITY MODEL (“LITE”)
Pre-Archive Phase
• Outreach
• Initiation
• Creation
Archive Phase
• Acquisition
• Disposal
• Ingest
• Archive Storage
• Preservation Planning
• First Mover Innovation
• Data Management
• Access
Support Services
• Administration
• Common Services
• Estates
The detailed KRDS2 Activity Model is twelve pages long.
But perhaps the most significant conclusion from KRDS, which also aligns with the LIFE Project
conclusion, is that examining, gathering and analysing relevant cost information needs to keep
on happening into the future …
15. The Costs Observatory Study (2011)
Scoping and Feasibility Study for an Information Management Costs Observatory
Key Perspectives Ltd. (UK)
Value to JISC
To judge whether it is a worthwhile investment of time and money to try and create a
‘Costs Observatory’
The Costs Observatory Concept
• To provide verifiable and evidence-based guidance to UK HEIs about the likely cost
over time (the whole life-cycle cost) of managing, preserving and providing access
to their digital assets
• To influence the strategic planning and policy formation within institutions and
enable them to make wiser, more realistic and cost effective decisions about
managing information
Final Report available at:
http://repository.jisc.ac.uk/4921/
16. The Costs Observatory Study (2011)
What a Costs Observatory might do …
• Collect cost information
• Be a trusted broker
• Analyse the data and produce reports and recommendations
• Support the UK HE sector
• Monitor and identify relevant economic, legislative and environmental issues
• Liaise and co-ordinate with relevant service and information providers
17. The Costs Observatory Study (2011)
The ‘scope’ problem …
• What size and shape should this ‘Observatory’ assume …?
• What types of information should it address?
• To what extent is it being done already?
Agency and purpose:
• HESA: Higher Education Statistics Agency (submission to HESA is mandatory for UK HEIs)
• heidi: Higher Education Information Database for institutions (subscription web-based service from HESA)
• TRIBAL: Benchmarking service that collects data on costs across the institution, including all financial data across a range of categories
• Educause (US): Gathers cost data over a range of IT-related operations in HEIs and makes them available through its Core Data Service
• Gartner Inc. (Global): Provides a wide range of services across the business world, including gathering cost data on IT operations that HEIs use for benchmarking
• UCISA: Provides IT-related information in the form of periodic reports on particular issues
• SCONUL Statistics: Collects data over a range of library activities on an annual basis; virtually all UK HEI libraries submit data to this service. It is light on activity-based costs
18. The Costs Observatory Study (2011)
Why it might NOT be a good idea …
• Lack of demand (an idea ‘ahead of its time’?)
• Data collection will be a sizeable and rather challenging task, requiring
considerable resourcing both by the Observatory and the participating
institutions
• Whether the required data can be adequately defined and whether sufficiently
accurate data can be arrived at by participants
• In the specific case of research data, accurate and representative cost data may
be extremely difficult to collect within universities
Perhaps the overarching issue for the Observatory is how to handle comparability …
It would need to:
• Clearly define the cost data elements needed
• Ensure that these costs are pieces of information that all types of institution could
come up with
• Ensure that the collection and submission of information was not too onerous for
institutions
19. The Costs Observatory Study (2011)
Why it MIGHT BE a good idea …
• Stakeholders think there is a gap in existing provision
• A Costs Observatory covering research information management and
preservation costs would complement existing services
• Research data management is becoming more important to institutions
• Other benchmarking services are well-used and considered useful. There is
growing emphasis on evidence-based decision-making, and evidence from
authoritative and trusted sources is valued
• Libraries generally enjoy a culture of information-sharing
• Research offices are generally positive about participation, as long as there is
clear value in it
• The REF (in the UK) may act as a strong driver for the service, as it has for the
development of repositories and CRIS in UK HEIs
• The market need not be confined to the UK and indeed there are good business
reasons for considering this as a potential international service
20. The Costs Observatory Study (2011)
The study concluded that the scope of information that any proposed ‘Costs Observatory’
should focus on is:
• The institutional research repository and associated operations
• The institutional research data repository (where present)
• The institutional research information system (RIS) and associated operations
• Any additional archiving operations and systems for research outputs or information
Some assertions
… “most institutions in the UK now appear to be settling on a formula that can be simply
described as ‘research repository + data archives + CRIS’” …
… “the repository is, in some universities at least, regarded as the third most important
management information tool after the finance and student records systems” …
… “future REFs* will continue to influence record-keeping […]. A Costs Observatory thus
would be a natural part of this ecology” …
*Research Excellence Framework – a periodic assessment of the quality of UK HE research that also helps to determine levels
of funding for research in UK universities
21. The Blue Ribbon Task Force for Sustainable Digital
Preservation and Access (2008 - 2010)
Objective
To develop a set of economically viable recommendations to catalyze the
development of reliable strategies for the preservation of digital information.
Final report (February 2010)
http://brtf.sdsc.edu/
Big detailed report focusing on economics … some synthesis required?
22. A Draft Economic Sustainability Reference Model (2011)
The challenges to effective sustainability (preservation) are:
• Long time horizons
• Diffused stakeholders
• Misaligned or weak incentives
• Lack of clarity about roles and responsibilities among stakeholders
• Difficulty in valuing or monetizing the costs and benefits of digital preservation
Three principal actions are required for sustainability:
• Articulate a compelling value proposition
• Provide clear incentives to preserve in the public interest
• Define roles and responsibilities among stakeholders to ensure an ongoing and
efficient flow of resources to preservation throughout the digital lifecycle
Brian Lavoie (OCLC) and Chris Rusbridge (Consultant) came up with the idea of
trying to turn the Blue Ribbon Task Force conclusions into some form of reference
model.
25. Preservation Process
• Property 4: … is a stream of decisions over time
  Sustainability Condition 4: Finite Planning Horizons
• Property 5: … is path-dependent
  Sustainability Condition 5: Evaluate Opportunity Cost of Inaction
• Property 6: … has finite resources
  Sustainability Condition 6: Selection
27. Next opportunity to think about and develop this model ...
7th International Digital Curation Conference
Bristol, UK, 5 - 7 December 2011
Thursday 8th December 2011
http://www.dcc.ac.uk/events/idcc11/workshops
28. Opportunities for international collaboration and join-up
The LIFE Model is currently the most sustained attempt in this field to work out the
long-term cost of preservation.
• who is using it and how?
• how can it be improved?
• if it needs ongoing input and maintenance, how should it be sustained?
The KRDS Framework has been influential and the management of research data is
not a problem that is going away any time soon – for anyone!
• what is the best way of using this knowledge?
Should both of these initiatives inform a service that would specialise in the
financial aspects of the long-term management of digital information in the
research/teaching domain?
• might this usefully be underpinned by an economic reference model?