Successful preservation of content requires sophisticated mechanisms for collecting, tracking and analyzing information about many relevant aspects. This covers not only the content itself but also available software, other organizations' content, usage statistics and trends, format risks, system operations and more. Such tracking requires a flexible system that can evolve over time and provides an extensible, scalable platform.
This presentation describes the system design of a novel approach to automated monitoring of preservation-related information. We discuss the challenges and the information sources that need to be covered, and describe the architecture and data model of a preservation watch system that is currently under development. We discuss how this system addresses critical information needs for informed preservation management and outline the next steps.
2. SCAPE
Outline
• Preservation monitoring
• Why and what is needed
• State of the art
• A novel approach
• Methodology
• Architecture
• How can you participate?
4. Why do we need monitoring?
• Format obsolescence
• New standards
• Emerging technology
• Repository
• Organisation mission
• Producer trends
• Bit rot
• Resource capability
• Organisation policies
• System availability
• Consumer trends
• Security breach
• Economic limitations
• Social and political factors
5. Why do we need monitoring?
Risks and opportunities:
• Format obsolescence
• New standards
• Emerging technology
• Repository
• Organisation mission
• Producer trends
• Bit rot
• Resource capability
• Organisation policies
• System availability
• Consumer trends
• Security breach
• Economic limitations
• Social and political factors
6. State of the Art
• Digital Format Registries
• AONS
• Technology watch reports
7. State of the Art
• Digital Format Registries
  • Lack of coverage
  • Statically-defined generic risks
  • Lack of structure in risks
  • Focus on format obsolescence
• AONS
  • Total dependency on format registries
• Technology watch reports
  • Machine unreadable
8. Risk Assessment
[Pie chart, "Survey on:" risk assessment. Segments: "Yes, but manual and ad hoc" and "None". Values shown: 40% and 60%.]
9. What is needed?
• We need data!
• From anywhere and everywhere
• Sharing
• Usability & Scalability
• Structured data
• Controlled vocabulary
12. Scout
• Tool: Name, Version, Renders, License
• Format: Name, PRONOM ID, Mime type, License
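The Tool and Format entities on this slide can be sketched as plain records. This is a minimal illustration, not Scout's actual data model: the field names follow the slide labels, while the class layout, the optional fields, the tool name "ExampleViewer" and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Format:
    """A file format, with the properties listed on the slide."""
    name: str
    pronom_id: Optional[str] = None   # identifier in the PRONOM registry
    mime_type: Optional[str] = None
    license: Optional[str] = None


@dataclass(frozen=True)
class Tool:
    """A software tool, with the properties listed on the slide."""
    name: str
    version: Optional[str] = None
    renders: Tuple[Format, ...] = ()  # formats this tool can render
    license: Optional[str] = None


# Illustrative values only; "ExampleViewer" is a hypothetical tool.
pdf = Format(name="PDF 1.4", pronom_id="fmt/18", mime_type="application/pdf")
viewer = Tool(name="ExampleViewer", version="1.0", renders=(pdf,))
```

Keeping "Renders" as a link from Tool to Format, rather than a free-text field, is what makes questions such as "which tools render this format?" answerable by a machine.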
15. Scout
• Tool: Name, Version, Renders, License
• Format: Name, PRONOM ID, Mime type, License
• PRONOM (registry)
19. Methodology
• Survey practitioners on information to be monitored
• Create structured data model
• Find sources of information (automate where possible)
• Define notification triggers
• Frequent monitoring of sources of information
• Frequent assessment of triggers and notification
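The last three steps (define triggers, monitor sources, assess triggers and notify) can be sketched as follows. This is a hedged illustration of the idea only: the `Trigger` class, the `assess` function, the measurement keys and the thresholds are all hypothetical and are not taken from Scout.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A snapshot of monitored values, keyed by a (hypothetical) property name.
Measurements = Dict[str, float]


@dataclass(frozen=True)
class Trigger:
    """A condition over monitored values plus the notification it raises."""
    name: str
    condition: Callable[[Measurements], bool]
    message: str


def assess(triggers: List[Trigger], measurements: Measurements) -> List[str]:
    """Return the notification message of every trigger whose condition holds."""
    return [t.message for t in triggers if t.condition(measurements)]


# Hypothetical triggers; property names and thresholds are assumptions.
TRIGGERS = [
    Trigger(
        name="no_rendering_tool",
        condition=lambda m: m.get("tools_rendering_format", 0) == 0,
        message="No known tool renders the monitored format",
    ),
    Trigger(
        name="storage_nearly_full",
        condition=lambda m: m.get("storage_used_ratio", 0.0) > 0.9,
        message="Repository storage is above 90% of capacity",
    ),
]

# One monitoring round: collect a snapshot, then assess all triggers.
snapshot = {"tools_rendering_format": 0, "storage_used_ratio": 0.42}
notifications = assess(TRIGGERS, snapshot)
```

In a running system the snapshot would be refreshed from the information sources on a schedule, and `assess` would be called after each refresh.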
20. Example questions
• Are there any tools that can render format X?
• Is my repository the only one that has format Y?
• Are my preservation plans still valid?
• Are my repository policies being enforced?
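Once the information is structured, the first two questions become simple lookups. The sketch below assumes two hypothetical in-memory tables (tools mapped to the formats they render, repositories mapped to the formats they hold); the function names, tool names, repository names and format labels are invented for illustration.

```python
from typing import Dict, List, Set


def tools_rendering(tool_renders: Dict[str, Set[str]], fmt: str) -> List[str]:
    """Names of all tools known to render the given format, sorted."""
    return sorted(tool for tool, fmts in tool_renders.items() if fmt in fmts)


def is_only_holder(profiles: Dict[str, Set[str]], repo: str, fmt: str) -> bool:
    """True if `repo` is the only repository whose profile contains `fmt`."""
    holders = {name for name, fmts in profiles.items() if fmt in fmts}
    return holders == {repo}


# Hypothetical data: which tools render which formats.
TOOL_RENDERS = {
    "ViewerA": {"format-x", "format-y"},
    "ViewerB": {"format-x"},
}

# Hypothetical data: which formats each repository holds.
PROFILES = {
    "my-repository": {"format-x", "format-y"},
    "other-repository": {"format-x"},
}

renderers_of_x = tools_rendering(TOOL_RENDERS, "format-x")
alone_with_y = is_only_holder(PROFILES, "my-repository", "format-y")
```

The remaining two questions (plan validity and policy enforcement) need richer inputs, such as the plan's assumptions and the policy statements, so they are not covered by this sketch.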
21. Information Sources
• Format registries & software catalogues
• Digital repositories & web archives
• Organizational objectives
• Experiments
• Simulation
• Human knowledge
23. Repository events adaptor
[Diagram: RODA, Rosetta and eSciDoc, each with a Report API; OAI-PMH; Repository adaptor; Scout]
• OAI-PMH with PREMIS
• Normalize events
• Fine-grain events
• History
• Events example
  • Ingest started/ended
  • Representation downloaded
  • Plan executed
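The "normalize events" step can be illustrated with a small mapping from repository-specific event names to a shared vocabulary. The normalized types mirror the event examples on this slide; the raw event names, the mapping table and the `normalize` function are assumptions for illustration and do not describe the real Report API of any of these repositories.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical raw event names per repository, mapped to shared event types.
RAW_TO_NORMALIZED = {
    ("roda", "ingest.start"): "ingest_started",
    ("roda", "ingest.end"): "ingest_ended",
    ("rosetta", "DEPOSIT_BEGIN"): "ingest_started",
    ("rosetta", "FILE_DOWNLOAD"): "representation_downloaded",
    ("escidoc", "plan-run"): "plan_executed",
}


@dataclass(frozen=True)
class Event:
    """A repository event expressed in the shared vocabulary."""
    repository: str
    type: str
    timestamp: datetime


def normalize(repository: str, raw_type: str, iso_timestamp: str) -> Event:
    """Translate one repository-specific event into a normalized Event."""
    key = (repository.lower(), raw_type)
    if key not in RAW_TO_NORMALIZED:
        raise ValueError(f"No mapping for event {raw_type!r} from {repository!r}")
    return Event(
        repository=repository.lower(),
        type=RAW_TO_NORMALIZED[key],
        timestamp=datetime.fromisoformat(iso_timestamp),
    )


# Two differently named raw events end up with the same normalized type.
a = normalize("RODA", "ingest.start", "2012-10-01T12:00:00+00:00")
b = normalize("Rosetta", "DEPOSIT_BEGIN", "2012-10-01T12:05:00+00:00")
```

Storing normalized events with their timestamps is also what makes the "History" bullet possible: the same event type can be compared across repositories and over time.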
27. How to be a part of Scout
• Join the surveys
  • Send your email to me <lfaria@keep.pt>
• Integrate your content
  • Send your content profile with C3PO
  • Send repository events with Report API (soon)
• Contribute with information (soon)
  • Use Scout form for manual input of knowledge
28. SCAPE
Scout
http://www.scape-project.eu
Watch us on GitHub!
https://github.com/openplanets/scout
Luís Faria
<lfaria@keep.pt> http://www.keep.pt