EuroSakai CLIF project presentation

A presentation given at the EuroSakai 2011 conference in Amsterdam on 27th September 2011. It covers the work of the CLIF project to investigate the management of the digital lifecycle across systems, using the integration of the Sakai collaboration and learning environment with the Fedora digital repository system as an exemplar.




Usage Rights

© All Rights Reserved

  • The lifecycle diagram that came out of REMAP, entirely contained within the repository.
  • This lifecycle is given as an example, to emphasise that there are multiple stages, but that CLIF was not in the business of creating its own version. A focus on the individual stages was found to be more beneficial as this allowed us to be generic in our approach to the integrations carried out (as the stages were common between them).
  • These characteristics also came from the literature review
  • The Sakai part of this architecture should have a box indicating use of the Hydranet code, as for SharePoint. This highlights that a common means of communicating with Fedora was found based on the Fedora web services and the construction of Hydra-compliant objects. In Sakai this interacted with the CHH, whilst in SharePoint it interacted with a couple of components – workflow, deposit and browse
  • Based on interviews and demonstrations with local academics and a records manager. Evaluation was related to how the digital content lifecycle could be managed, bearing in mind the integrated functionality demonstrated.
  • The reference to the structure of the content is a reference to the Hydra work and the benefit this brought the project. The last point refers to the situation of using content in situ in other systems, without moving it.
  • Please note that if there is nothing on the GitHub site yet, it will be soon!

Presentation Transcript

  • Enabling the digital content lifecycle: content flow between Sakai and Fedora
    Chris Awre
    Library and Learning Innovation
    Amsterdam, 27th September 2011
  • CLIF Project
    CLIF - Content Lifecycle Integration Framework
    Funded by JISC
    01 July 2009 – 31 March 2011
    Project partners
    University of Hull
    King’s College London
    Centre for e-Research (CeRch)
  • Background
    • CLIF is building on work within the JISC-funded RepoMMan and REMAP projects
    • In particular, REMAP explored how a repository could support records management and digital preservation as part of a lifecycle management approach for digital content
    • Previous work had sought to push the repository upstream in the workflow
    • Dilemma was that the repository risked becoming another content silo alongside other content management systems on campus (in our case, Sakai and SharePoint)
    • How can the repository become more integrated in the institutional environment?
  • Fedora
    • Powerful digital repository framework
    • Adopted at University of Hull in 2005
    • Live institutional repository since 2008
    • Developed and managed through DuraSpace
    • Strong community model, akin to Sakai
    • Features we like (the advert!)
    • Powerful digital object model
    • Extensible metadata management
    • Expressive inter-object relationships
    • Version management
    • Configurable security architecture
  • Local repository need
    • Scalable solution (not one that has upper limit)
    • Digital content is only going to grow
    • Standards-based (open standards where possible)
    • To provide a future-proof exit strategy
    • Content agnosticism
    • We don’t know what types of content may come along
    • Content semantics
    • Recording the relationships between different pieces of content supports future use and preservation
  • Other repository systems?
    • The focus of the work was based around systems that were in place at Hull
    • Other repository options were not actively considered
    • Following on from work looking at integration of DSpace and Sakai through CTREP project
    • Aimed to achieve the same end goal of seamless integration for Fedora
    • Regardless of the system, it is important to understand what you are trying to achieve in the management of content through integration
    • Repository choice driven by external factors of how repository management is carried out
    • CTREP project was a JISC-funded project, 2007-9
    • Aimed to increase repository usage through integration within the LMS, using Sakai as the platform
    • Cambridge examined integration with DSpace
    • University of Highlands & Islands (UHI) examined integration with Fedora
    • Work focused on use of Sakai ContentHostingHandler
    • DSpace work successful, albeit that information being sent between the two was limited
    • Fedora work halted as it became clear that the version of Sakai CHH at the time was not able to deal with rich Fedora objects
    • Re-visiting this has been possible through Sakai developments
    • We are grateful to CTREP for pioneering this approach
  • Lifecycle
    within a
    Can this be
    enabled across
  • Lifecycle integration
    Content flows between systems according to need in lifecycle
  • Sakai and content management
    • Content management for teaching & learning makes heavy use of the Resources tool
    • Some imaginative ways used for how content from here is used by other tools within the system
    • Content is also shared between sites, and staff are encouraged to make their content shareable
    • Focus of content management is to support use within Sakai
    • Focus is on Sakai, not the content
    • A content silo?
    • How could integration with a content store – a repository – enhance how Sakai manages and uses content?
  • CLIF project objectives
    • Understand how digital content can be managed across systems as part of the digital content lifecycle
    • Recognising that individual systems cannot always support the whole lifecycle from creation to preservation or deletion
    • Specifically investigate the role of repositories in the digital content lifecycle
    • Where is the repository best positioned within the lifecycle?
    • What roles can digital repositories play?
    • Understand how content will flow in and out of a repository as part of the lifecycle
    • CLIF has been agnostic about this
  • CLIF use cases I
    • Use cases cover research, teaching and administration
    • Based on interviews with staff at partner institutions
    • Academic staff (Head of Department / Senior Lecturer)
    • Records Manager
    • Research active staff
    • Interviews highlighted that staff were managing as best they could within single systems they were familiar with
    • Potential to exploit additional functionality in other systems welcomed
  • CLIF use cases II
    • Research
    • Capturing data produced through experimental equipment and archiving this for use in future work in the repository
    • Preparation of research outputs and archiving of these for dissemination
    • Teaching
    • Teaching materials accessed from within a repository to inform current courses
    • Exam papers created in one system and archived for future reference in the repository (marks could be archived for private access as well)
    • Administration
    • Committee papers circulated to committee members before a meeting are moved to the repository for wider access post-meeting
  • CLIF outputs
    • Literature review on managing the digital content lifecycle across systems
    • Technology integrations as exemplars of how a repository can support lifecycle management across systems
    • Fedora – Sakai integration
    • Fedora – SharePoint integration
    • Software available on GitHub
    • Technical appendix to final report describing architecture and implementation
  • A digital content lifecycle
    There are many variations and versions of lifecycle models - another is not required
    Each has a number of stages
    CLIF sought to capture use cases that encompassed a number of these stages and tested how they could be managed across systems
    © Digital Curation Centre
  • Literature review
    • There was little literature directly addressing the system aspects of managing the digital content lifecycle
    • Work was focused within a system or was more architecture-based without addressing specific systems
    • Possibly due to flux in technology development
    • Terminology is key to addressing lifecycle management
    • There are many different lifecycles (knowledge, digitisation, metadata, etc.) that may overlap
    • Can be easier to break down the lifecycle into stages, many of which are common
  • Lifecycle characteristics
    • The use of standards can greatly ease movement between systems
    • cf. the use of the Hydra digital object approach
    • Policy is as important as technology in determining how different systems are used to manage a lifecycle
    • Digital preservation can be greatly supported if considered at the beginning of the lifecycle (as REMAP found)
    • There is a need to identify how people and roles fit into an overall lifecycle
    • It may be valuable to record information about the lifecycle itself as content moves, but this has resource implications
    • cf. the use of PREMIS events metadata recording what happens to an object
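
The idea of recording what happens to an object as it moves between systems can be sketched as a small event log. This is only an illustration, not the project's implementation: the field names loosely follow PREMIS Event semantic units, while the class names, the event type strings, and the `example:1` identifier are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List
import uuid


@dataclass
class LifecycleEvent:
    # Field names loosely mirror PREMIS Event semantic units.
    event_identifier: str
    event_type: str                 # free-text label in this sketch
    event_date_time: str            # ISO 8601 timestamp (UTC)
    event_detail: str
    linking_object_identifier: str  # the object the event happened to


@dataclass
class EventLog:
    events: List[LifecycleEvent] = field(default_factory=list)

    def record(self, object_id: str, event_type: str, detail: str) -> LifecycleEvent:
        """Append one event describing something that happened to an object."""
        event = LifecycleEvent(
            event_identifier=str(uuid.uuid4()),
            event_type=event_type,
            event_date_time=datetime.now(timezone.utc).isoformat(),
            event_detail=detail,
            linking_object_identifier=object_id,
        )
        self.events.append(event)
        return event

    def history(self, object_id: str) -> List[LifecycleEvent]:
        """Return the events for one object, in the order they were recorded."""
        return [e for e in self.events if e.linking_object_identifier == object_id]


log = EventLog()
log.record("example:1", "creation", "File created in the learning environment")
log.record("example:1", "ingestion", "Copied into the repository")
print([e.event_type for e in log.history("example:1")])
```

Every recorded move adds a record that has to be stored and managed, which is the resource implication the slide mentions.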
  • System overview
  • Sakai – Fedora integration
    • Sakai 2.6.1
    • Fedora v3.4
    • Extends and enhances the JISC CTREP Fedora ContentHostingHandler plugin
    • CHH is a pluggable provider model for hosting content
    • Content displayed in standard Sakai Resources Tool
    • Enabled and configured by uploading a text file
    • Resources Tree view shows a ‘live view’ of a specific Fedora collection
    • ‘Show other sites’ allows files and/or nested folders to be copied/moved between MyWorkspace site and Fedora mounted site
  • .properties configuration file
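
The slide image with the actual file is not preserved in this transcript. As a rough illustration of the general key=value shape of a .properties file, the sketch below parses a made-up example. The keys, the URL, and the values are hypothetical and are not the plugin's real setting names; the parser is also simplified (no `:` separators or line continuations).

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # lines without '=' are ignored in this simplified sketch
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical keys for illustration only; not the plugin's real settings.
EXAMPLE = """\
# where the repository lives (placeholder URL)
repository.base.url = http://fedora.example.org/fedora
repository.collection.pid = example:collection1
repository.cache.seconds = 300
"""

config = parse_properties(EXAMPLE)
print(config["repository.collection.pid"])
```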
  • Sakai to Fedora
  • Or…
    Resources Tool
  • Linking Sakai and Fedora
    • Content is held very differently in Sakai and Fedora
    • Sakai holds files
    • Fedora holds objects made up of a collection of datastreams, one of which is the file (others will contain metadata)
    • In linking Sakai and Fedora, three considerations need to be addressed
    • Displaying Fedora objects in a tree structure and Fedora collections as folders
    Issue for security around the objects
    • Depositing a file in Fedora from Sakai requires a Fedora object with associated metadata to be created
    • Retrieving a file from Fedora for use in Sakai requires use of the search capability within Fedora
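
The structural mismatch described above (plain files on one side, objects made of datastreams on the other) can be sketched with a few data classes. This is a minimal model under stated assumptions, not the CHH code: all class and function names, the datastream IDs, and the `title=` metadata format are invented for illustration, and retrieval through Fedora search is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SakaiFile:
    name: str
    content: bytes
    mime_type: str = "application/octet-stream"


@dataclass
class Datastream:
    ds_id: str
    mime_type: str
    content: bytes


@dataclass
class RepositoryObject:
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    members: List["RepositoryObject"] = field(default_factory=list)
    is_collection: bool = False


def deposit(file: SakaiFile, pid: str) -> RepositoryObject:
    """Wrap a file in an object: one datastream for the file, one for metadata."""
    obj = RepositoryObject(pid=pid)
    obj.datastreams["content"] = Datastream("content", file.mime_type, file.content)
    metadata = f"title={file.name}".encode("utf-8")
    obj.datastreams["descMetadata"] = Datastream("descMetadata", "text/plain", metadata)
    return obj


def retrieve(obj: RepositoryObject) -> SakaiFile:
    """Unwrap an object back into a plain file for use on the file-based side."""
    content = obj.datastreams["content"]
    title = obj.datastreams["descMetadata"].content.decode("utf-8").split("=", 1)[1]
    return SakaiFile(name=title, content=content.content, mime_type=content.mime_type)


def as_tree(obj: RepositoryObject, depth: int = 0) -> List[str]:
    """List collections as folders (trailing '/') and other objects as entries."""
    lines = ["  " * depth + obj.pid + ("/" if obj.is_collection else "")]
    for member in obj.members:
        lines.extend(as_tree(member, depth + 1))
    return lines


notes = SakaiFile("notes.txt", b"hello", "text/plain")
collection = RepositoryObject(pid="example:collection", is_collection=True)
collection.members.append(deposit(notes, "example:1"))
print("\n".join(as_tree(collection)))
```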
  • Lessons learned
    • SOAP messaging between the two systems made the link very slow
    • Due to use of HTTPS
    • Switching to HTTP improved performance and allowed easier debugging
    • Other performance improvements enabled included:
    • Caching of resources and folder objects
    • Minimising web service calls by using one call to retrieve multiple properties
    • No pre-fetching of datastreams
    • The CHH code is over-complicated at times
    • Impact of changes at high level can be extensive lower down
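
Two of the performance measures above, caching and fetching several properties in one call, can be sketched against a stand-in service that simply counts round trips. Nothing here is the Fedora web service API or the CHH code; the class names, method names, and property names are assumptions for illustration.

```python
class FakeRepositoryService:
    """Stand-in for a remote web service; counts simulated round trips."""

    def __init__(self, objects):
        self._objects = objects
        self.calls = 0

    def get_property(self, pid, name):
        self.calls += 1  # one round trip per property
        return self._objects[pid][name]

    def get_properties(self, pid, names):
        self.calls += 1  # one round trip for all requested properties
        return {name: self._objects[pid][name] for name in names}


class CachingClient:
    """Fetches all needed properties in one call and remembers the result."""

    def __init__(self, service, names):
        self._service = service
        self._names = tuple(names)
        self._cache = {}

    def properties(self, pid):
        if pid not in self._cache:
            self._cache[pid] = self._service.get_properties(pid, self._names)
        return self._cache[pid]


NAMES = ("label", "size", "modified")
service = FakeRepositoryService(
    {"example:1": {"label": "Notes", "size": 5, "modified": "2011-09-27"}}
)

# Naive approach: one call per property.
naive = {name: service.get_property("example:1", name) for name in NAMES}
calls_naive = service.calls

# Batched and cached: repeated lookups cost a single call in total.
client = CachingClient(service, NAMES)
client.properties("example:1")
client.properties("example:1")
calls_batched = service.calls - calls_naive
print(calls_naive, calls_batched)
```

A real cache would also need an invalidation policy, since the repository content can change underneath it; that is left out of this sketch.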
  • Sakai – Fedora features
    • The repository is embedded as a set of resources that appear like any other set of resources
    • The majority of menu functions work in the same manner as with standard resources, e.g., upload, copy, paste, move, delete, create
    • This applies to folders as well as individual objects
    • Folders represent collection objects in the repository
    • Metadata can be captured in Sakai for use in Fedora (though Sakai is not able to re-use this when retrieving an object from Fedora)
    • User can browse Fedora collection (though not yet search)
    • User does not need to know they are working with the repository
  • Fedora 2
    • Very flexible – this has made exchanging objects between Fedora instances and between Fedora and other systems difficult
    • Common approach to structuring digital objects is required
    • Systems interacting with Fedora can build objects using this common approach
    • CLIF adopted the approach developed through the Hydra project
  • Fedora 2 contd.
    • Common structuring/modelling approach allows for object metadata to be edited in the repository as part of their lifecycle management
    • Each object has:
    • …and could have…
    descmetadata (using MODS)
    • If Sakai can provide this
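
As an illustration of descriptive metadata expressed in MODS, the sketch below builds a very small MODS record from a title and an optional creator, such as might be captured at upload. The function name and the choice of fields are assumptions for this sketch, and the project's actual descMetadata content is not shown in the transcript.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MODS_NS = "http://www.loc.gov/mods/v3"


def build_desc_metadata(title: str, creator: Optional[str] = None) -> str:
    """Return a minimal MODS record (title, optional creator name) as a string."""
    ET.register_namespace("mods", MODS_NS)
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    if creator:
        name = ET.SubElement(mods, f"{{{MODS_NS}}}name")
        ET.SubElement(name, f"{{{MODS_NS}}}namePart").text = creator
    return ET.tostring(mods, encoding="unicode")


print(build_desc_metadata("Lecture notes", "A. Lecturer"))
```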
  • Copy/move to/from Repository
    Copying and moving folders/files between Fedora and MyWorkspace is easy! Copy…
  • Copy/move to/from Repository
  • It looks easy, but…
    © 2008 Richard Green
    … you don’t see what is going on underneath!
  • Outstanding work
    • Managing versions from within Sakai, or accessing them, isn’t currently possible
    • Some of the commands under the Edit functionality have no current effect on the object in Fedora
    • The metadata captured is minimal, and Sakai cannot make use of metadata added within Fedora
    • Folders with large numbers of resources have a noticeable impact on performance when browsing or carrying out actions upon them
  • Evaluation
    • There needs to be a clear understanding and view about where the boundaries are between the different systems being used, to avoid confusion
    • There needs to be clarity over why different systems are being used, to overcome concerns about having to work with multiple systems
    • There is a need for better preservation and a recognition that integrating the repository could support this, but also a need to be clear about what needs preserving
    • There is benefit in being able to access other content stores from within your current working environment in order to see what is available more broadly
  • Sakai-repository evaluation
    • The seamless access was much valued
    • Having access to resources that could be used within Sakai was a valuable addition to being able to browse resources inside Sakai
    • Providing access to resources in context was considered very important, hence, linking to the files in the repository instead of copying them across may be preferred
    • Why create a copy if access is OK where the content is?
    • Reference or irregular content was considered to fit best into the model of access via repository
    • Bulk movement likely to be more useful than object by object movement
  • Sakai OAE
    • Focus on presentation of content in context
    • This tallies with findings in CLIF
    • Focus on use of APIs where available
    • Institutional repository systems are not so good at this
    • A challenge for these systems
    • Capturing annotations alongside original content would enhance archival records
    • Exporting multiple resources, as IMS CP or other, also a route for managing content across systems
  • Conclusions
    • Diverse content management systems can be effectively integrated to allow cross-system lifecycle management
    • Better adoption of interface standards would be helpful
    • Standardisation in the structure of the content being moved maximises how the content can be managed by the different systems
    • Where the repository is one of the systems involved its current primary role appears to be as a recipient of content (for preservation)
    • Perception that content in the repository can be used there without moving it into the other integrated systems
  • Demo
    Copyright ©
  • Thank you
    Chris Awre –
    Richard Green –
    Andrew Thompson –
    Simon Waddington –
    Project website -
    Project GitHub - and
    Project final report -