The PERICLES project aims to facilitate continued understanding and reuse of digital objects over time through modeling heterogeneous, complex, and interconnected digital content and its environment. The project will develop ontologies and models to represent digital objects, policies that govern them, and their dependencies and changes over time. It will implement these techniques in test beds using case studies from different domains like science data and art to evaluate the approaches and ensure relevance. The models and ontologies will be used to prototype preservation workflows and assess the impact of changes on digital ecosystems.
PERICLES workshop (London 15 October 2015) - Introduction
1. GRANT AGREEMENT: 601138 | SCHEME FP7 ICT 2011.4.3
Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving
Semantics [Digital Preservation]
Simon Waddington (King’s College London)
3. The PERICLES project
Project objectives
Case studies
R&D activities
4. PERICLES: "Promoting and Enhancing Reuse
of Information throughout the Content
Lifecycle taking account of Evolving
Semantics"
EC FP7 Integrated Project, Digital Preservation
(Feb. 2013 - Jan. 2017). 11 partners.
5. Facilitate continued understanding and reuse of
digital objects that are:
◦ heterogeneous, volatile, complex and highly interconnected
Represent, derive and enforce policies that govern
◦ the management and evolution of content
◦ the management and evolution of the policies themselves
Integrated test beds addressing different application
domains and users
◦ Use for prototyping and evaluation
Sustainability of project outputs
◦ Gathering and disseminating the knowledge created by the
project
6. Model-driven
◦ Abstraction of complex systems as models that can be manipulated
independently
Capture and modelling of the environment
◦ Understand the wider context around digital objects that impacts their
long-term reuse
Digital ecosystems
◦ Analogy with biological systems
◦ Evolving systems of interdependent entities
Continuum approach
◦ Merging of active-life and archival phases
◦ Non-custodial
Case studies
◦ Ensure relevance of results to practitioners. Should be extensible
◦ Used for requirements, test scenarios, evaluation, sample datasets
7. Focuses on space science data
originating from the ESA and
International Space Station.
For example
◦ Experiments that monitor the sun's
spectral variability to understand its
effects on climate (SOLAR)
Raw data and telemetry are captured
by the SOLAR instrument
◦ Data are calibrated by solar scientists
The final dataset is made available to
◦ Scientists in other fields (e.g. climate)
◦ Users of other instruments
Complex dependencies
◦ Results are dependent on a complex
processing chain
8. Provided by Tate
◦ Involves a number of different departments and content types
[Diagram labels: Collected; Born-digital archives; Video production; Digital video art; Software-based art. Lifecycle annotations: "Active use + mandate for LTDP" (shown twice); "Active use"; "Traditionally end of use + mandate for LTDP"]
9. Software-based artworks
◦ Self-contained or networked systems
◦ Comprise hardware and software
elements
Proprietary/open source/custom
software
◦ Typically involve cutting edge
technology
Unique and challenging to maintain
◦ Unlike physical artworks, often
necessary to replace elements
Works can exist in multiple versions
◦ Complex dependencies
Changes to one element of an SBA can
have an impact on other parts
[Images: Sow Farm by John Gerrard; Brutalism, by Jose Carlos Martinat]
10. Sampling-based approaches
◦ PLANETS
Test bed – implement and evaluate potential changes on
sample sets of data
Works well for individual objects – when we consider
environment there are many options
◦ SCAPE
Scalable approach to PLANETS
Descriptive models
◦ CASPAR
◦ Preservation Network Models
PERICLES
◦ Build models automatically
◦ Use models for computation
11. PERICLES uses an iterative stakeholder-driven
approach
[Timeline: Years 1 to 4, project months 1 to 48 (Feb 2013 to Jan 2017). Phases in order: Requirements and development phase 1; Evaluation 1; Development phase 2; Evaluation 2; Development phase 3; Evaluation 3]
12. Purpose of test bed is
to run experiments or
“test” scenarios
◦ It is not an
implementation of a
preservation system
◦ Aim is to showcase the
research results
Main components
◦ ERMR: Entity Registry
and Model Repository
◦ Process Compiler
◦ Workflow Engine
14. Models and ontologies
Change and dependency
Model-driven preservation
Population and use of models
15. Model – abstract representation (of some
aspects) of a digital ecosystem
Knowledge modelling
◦ Machine-processable way to express concepts and
their relationships
◦ Reasoning to derive further knowledge
Behavioural modelling
◦ Encapsulation and visualisation
◦ Enable simulations and predictions
◦ Reduce effort
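The knowledge-modelling points above (machine-processable concepts and relationships, plus reasoning to derive further knowledge) can be illustrated with a minimal sketch. It assumes facts are stored as plain (subject, predicate, object) triples and that the only reasoning step is a transitive closure over one predicate. The class names and the `subClassOf` predicate are illustrative assumptions, not the PERICLES ontologies.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def infer_transitive(triples: Set[Triple], predicate: str) -> Set[Triple]:
    """Return the input triples plus the transitive closure of one predicate."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        links = [(s, o) for (s, p, o) in closed if p == predicate]
        for s1, o1 in links:
            for s2, o2 in links:
                # If s1 -> o1 and o1 -> o2 are known, derive s1 -> o2.
                if o1 == s2 and (s1, predicate, o2) not in closed:
                    closed.add((s1, predicate, o2))
                    changed = True
    return closed


# Hypothetical facts, used only to show the mechanism.
facts = {
    ("SoftwareBasedArtwork", "subClassOf", "DigitalObject"),
    ("DigitalObject", "subClassOf", "Resource"),
}
derived = infer_transitive(facts, "subClassOf")
print(("SoftwareBasedArtwork", "subClassOf", "Resource") in derived)
```

The derived triple was never stated explicitly; it follows from the two stated facts, which is the sense in which reasoning yields further knowledge.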
16. An ontology is a formal naming and definition of
the types, properties, and interrelationships of
the entities that really or fundamentally exist for
a particular domain of discourse
◦ Linked Resource Model (LRM)
◦ Utility ontologies
◦ Domain ontologies
◦ Specialised ontologies
17. Behavioural change
◦ Requirements, technical change, change to
attributes, organisation structure, dependencies or
entities that may affect other entities
Semantic change
◦ user community, knowledge
If significant change occurs, it may require
modifications to the ecosystem
18. Concept of change versus dependency
Given objects A and B: "A is dependent on B if
changes to B have a significant impact on the
state of A, or if changes to B can impact the
ability to perform function X on A."
[Diagram: Entity A depends on Entity B]
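The dependency definition on this slide can be read as a directed graph in which an edge from A to B means "A depends on B". Under that assumption, the entities potentially impacted by a change are those that can reach the changed entity by following dependency edges. The sketch below shows this reading; the class, method and entity names are hypothetical and do not describe a PERICLES tool.

```python
from collections import defaultdict, deque


class DependencyGraph:
    """Directed graph where add_dependency(a, b) records 'a depends on b'."""

    def __init__(self):
        # Maps an entity to the set of entities that depend on it directly.
        self._dependents = defaultdict(set)

    def add_dependency(self, a, b):
        self._dependents[b].add(a)

    def impacted_by(self, changed):
        """Entities that depend, directly or indirectly, on the changed entity."""
        seen = set()
        queue = deque([changed])
        while queue:
            current = queue.popleft()
            for dependent in self._dependents.get(current, ()):
                if dependent not in seen and dependent != changed:
                    seen.add(dependent)
                    queue.append(dependent)
        return seen


# Hypothetical processing chain, loosely following the space science case study.
graph = DependencyGraph()
graph.add_dependency("calibrated_data", "raw_telemetry")
graph.add_dependency("calibrated_data", "calibration_software")
graph.add_dependency("final_dataset", "calibrated_data")

print(sorted(graph.impacted_by("raw_telemetry")))
```

A breadth-first traversal is used so that cyclic dependencies do not cause an infinite loop. Whether an impact is "significant" in the sense of the definition is a judgement this sketch does not attempt to model.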
20. Model editor
◦ Manual editing through a GUI
Policy editor
◦ Constrained Natural Language policy descriptions
Ontology design patterns
◦ Reusable components that can be used across models
Semantic extraction from text
◦ Populating the ontologies with instances
VERGE
◦ Scalable feature extraction and feature processing from images and video
PET tool
◦ Sheer curation tool running in background
◦ PET2LRM
21. Linked Resource Model (LRM) is a modelling language
◦ Abstract concepts e.g. resource, dependency, intent
Domain ontologies
◦ Describe properties of model elements e.g. video
Ecosystem models
◦ Description of fundamental ecosystem components, e.g. user,
policy, technical service and their relationships
◦ Tools and APIs
Domain
ontologies
Ecosystem models
LRM
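The layering on this slide (abstract LRM concepts, domain ontologies that add properties, and ecosystem models that relate components) can be sketched with plain data classes. This is an illustration only: the slide names resource, dependency and intent as LRM concepts, but every class, field and identifier below is a hypothetical stand-in rather than the actual LRM vocabulary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Resource:
    """Abstract layer: anything that can be identified in the ecosystem."""
    identifier: str


@dataclass(frozen=True)
class Video(Resource):
    """Domain layer: adds a property specific to one content type."""
    codec: str = "unknown"


@dataclass(frozen=True)
class Dependency:
    """Abstract layer: a dependency carries the intent it serves."""
    dependent: Resource
    depends_on: Resource
    intent: str


@dataclass
class EcosystemModel:
    """Ecosystem layer: a collection of resources and their relationships."""
    resources: List[Resource] = field(default_factory=list)
    dependencies: List[Dependency] = field(default_factory=list)

    def dependencies_of(self, resource: Resource) -> List[Dependency]:
        return [d for d in self.dependencies if d.dependent == resource]


artwork = Video("video-artwork-1", codec="h264")
player = Resource("player-software")

model = EcosystemModel()
model.resources += [artwork, player]
model.dependencies.append(
    Dependency(artwork, player, intent="render the video for display")
)

print([d.depends_on.identifier for d in model.dependencies_of(artwork)])
```

Attaching an intent to each dependency records why the relationship matters, which helps when deciding whether a change to the target is relevant.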
22. Ongoing work in years 3 and 4
Model Change and Impact Explorer (MICE)
◦ Visualisation of digital ecosystems and the impact
of change on them
Quality assurance
◦ Enforcement of policies via models
Appraisal (ongoing)
◦ Technical appraisal of complex digital objects
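As a rough illustration of the quality assurance item above (enforcement of policies via models), a policy can be treated as a named check that is evaluated against every entity in a model. The sketch assumes entities are simple dictionaries and policies are Python predicates; the policy names, entity fields and values are hypothetical and not taken from the project.

```python
from typing import Callable, Dict, List, Tuple

Entity = Dict[str, object]
Policy = Callable[[Entity], bool]


def check_policies(
    entities: List[Entity], policies: Dict[str, Policy]
) -> List[Tuple[object, str]]:
    """Return (entity id, policy name) pairs for every failed policy check."""
    violations = []
    for entity in entities:
        for name, policy in policies.items():
            if not policy(entity):
                violations.append((entity["id"], name))
    return violations


# Hypothetical policies expressed as predicates over an entity record.
policies = {
    "has_checksum": lambda e: bool(e.get("checksum")),
    "format_is_documented": lambda e: e.get("format") is not None,
}

# Hypothetical model content.
entities = [
    {"id": "video-1", "checksum": "abc123", "format": "video/mp4"},
    {"id": "video-2", "checksum": "", "format": "video/mp4"},
]

print(check_policies(entities, policies))
```

Keeping policies as data separate from the entities means the same model can be re-checked whenever a policy or an entity changes.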