GRANT AGREEMENT: 601138 | SCHEME FP7 ICT 2011.4.3
Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics
[Digital Preservation]
“This project has received funding from the European Union’s Seventh
Framework Programme for research, technological development and
demonstration under grant agreement no. 601138”.
Choice of IE Technique
Anna-Grit Eggers (University of Goettingen)
Encapsulation
Techniques
Features
• Information encapsulation (IE) techniques cover a wide range of uses, which differ in:
•processing velocity
•required disk space
•location of storage
•accessibility and perceptibility of the payload (by human / by machine analysis)
•preservation level of the carrier / digital object
•processability of digital object and payload file formats and file sizes
•provided compression mechanisms.
Range of uses
• We identified criteria to distinguish between the techniques.
Criteria
• An encapsulation technique that fits a specific use
scenario can be chosen based on the technique-specific
characteristics of these criteria.
• Definition of criterion in this context:
A property or feature of information encapsulation techniques that can be
used to compare different techniques on the basis of the criterion
characteristics.
• For example:
•Robustness: whether the encapsulated information survives processing of the
carrier.
•Perceptibility: whether the encapsulated information can be perceived by observing the carrier.
Examples of characteristics for these two criteria:
• The characteristic of a technique for the criterion “Robustness” can be:
• “robust” (“true”)
• “not robust” (“false”)
• The characteristic of a technique for the criterion “Perceptibility” can be:
• “visible”
• “not perceptible by humans”
• “detectable by computers”
• “not perceptible at all”.
Criterion characteristics
• Assigning a characteristic for “Perceptibility” is harder, because the threshold for
“human perceptibility” or “computer detectability” is blurred rather than a
“true/false” value.
Encapsulation
Techniques
User scenario
• IE techniques are used for a specific purpose in the context of a use
scenario.
• This usage scenario defines the features and characteristics that a
candidate IE technique should fulfil.
• The overall task is to find the best IE technique by capturing and evaluating
the scenario defined by the user.
• The user could aim for encapsulating messages or metadata with digital
files.
• The aim could also be to add legal information, ownership information,
corporate designs, or information to ensure the authenticity of a digital
object.
Creating a scenario
• Some need a visible payload, others prefer to hide it.
• The most valuable information might be the digital object or the payload
itself (this is often the case with steganographic messages).
Creating a scenario (cont.)
• Three procedures are required to implement the scenario:
•scenario capturing
•weighting
•decision calculation mechanism
Implementing a scenario
• Capturing:
A questionnaire asks how important each of a set of scenario criteria is
for the given scenario. The criteria are chosen so that they can be
mapped to the features of the IE techniques.
• The number of criteria investigated depends on the number of available
IE techniques: it should be high enough to distinguish between
all main techniques, but low enough that the user is not overwhelmed
while filling in the questionnaire.
• Weighting:
Another crucial aspect is to ask the user
• which of the criteria are important for the scenario,
• how important they are,
• which should be excluded because they are not pertinent.
Implementing a scenario
• The Analytic Hierarchy Process is a sophisticated but complex method:
criteria are compared to each other and the user has to decide for each
comparison which criterion is the more important one.
• A simpler approach is to include an option to exclude unimportant
criteria and add a weighting mechanism for the user to indicate how
desirable a characteristic of a criterion is, or how important it is for the
scenario.
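The simpler weighting approach could be sketched as follows. This is only an illustration under stated assumptions, not the project's decision mechanism: the technique names, the criterion names, the scores between 0 and 1, and the function `rank_techniques` are all hypothetical. A weight of 0 excludes a criterion.

```python
from typing import Dict, List, Tuple

# Hypothetical characteristic scores per technique (0.0 = poor, 1.0 = good).
TECHNIQUES: Dict[str, Dict[str, float]] = {
    "packaging": {"robustness": 0.5, "imperceptibility": 0.0, "restorability": 1.0},
    "steganography": {"robustness": 0.7, "imperceptibility": 1.0, "restorability": 0.5},
    "metadata_fields": {"robustness": 0.9, "imperceptibility": 0.3, "restorability": 0.9},
}


def rank_techniques(
    techniques: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank techniques by the weighted mean of their criterion scores.

    Criteria with a weight of 0 are treated as excluded by the user.
    """
    active = {criterion: w for criterion, w in weights.items() if w > 0}
    total = sum(active.values())
    if total == 0:
        raise ValueError("at least one criterion needs a positive weight")
    ranking = []
    for name, scores in techniques.items():
        # A technique without a score for a criterion counts as 0.0 there.
        score = sum(w * scores.get(c, 0.0) for c, w in active.items()) / total
        ranking.append((name, score))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Example scenario: a hidden payload matters most, restorability is excluded.
scenario_weights = {"robustness": 1.0, "imperceptibility": 3.0, "restorability": 0.0}
ranking = rank_techniques(TECHNIQUES, scenario_weights)
```

A weighted mean is the simplest possible decision calculation; the Analytic Hierarchy Process would instead derive the weights from pairwise comparisons.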
● There are two types of criteria to consider:
Decision criteria
Technical criteria
• must be fulfilled to be able to use a
specific algorithm.
• File formats are an example of a
technical criterion, because some
algorithms can only be used for
specific file formats.
Scenario criteria
• depend on a usage scenario or the
user preferences.
Technical criteria:
◦ File formats (for carrier files as well as the payload files)
◦ Number of files
◦ Capacity
Technical decision criteria
File formats
• Embedding algorithms usually support a set of file formats and cannot
be used on files in other formats.
• For packaging, the metadata has to be mapped to one of the standard
XML packaging formats. The created package reduces the risk of data
loss.
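Because file format is a technical criterion, it can act as a hard filter before any weighting takes place. A minimal sketch, assuming a hypothetical support table (the technique names and their format lists are invented for illustration and do not describe real tools):

```python
from pathlib import Path
from typing import Dict, List, Set

# Hypothetical support table. An empty set is used here to mean "any format".
SUPPORTED_CARRIER_FORMATS: Dict[str, Set[str]] = {
    "png_text_chunk": {".png"},
    "lsb_steganography": {".png", ".bmp"},
    "packaging": set(),
}


def applicable_techniques(carrier: str, table: Dict[str, Set[str]]) -> List[str]:
    """Return the techniques whose technical format criterion is fulfilled."""
    suffix = Path(carrier).suffix.lower()
    return sorted(
        name for name, formats in table.items() if not formats or suffix in formats
    )
```

Only the techniques returned by this filter would then be ranked against the scenario criteria.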
Technical decision criteria
Number of files
• A growing number of files increases the risk of losing one of them.
• With more than one file, the files can help identify the files that belong
to them by indicating the formats used. That
reduces the impact of file format obsolescence.
Capacity
• Capacity is the message size constraint: the number of payload bits that can be
embedded by an embedding algorithm into a specific digital object.
Technical criteria (cont.)
• Capacity is influenced by the data format and the method used. With some methods,
a message that is too big increases the risk of damaging the carrier file or makes
the payload visible.
• While packaging methods have no limit for the size of the payload files,
embedding methods mostly have a maximum payload size.
• Invisible watermarking and steganography embedding methods not only
become visible if the payload size is too large; the cost of the algorithmic
calculations also increases strongly with the size of the payload.
• In theory, an information frame scales well to large payload files.
However, it may be impractical if the data frame becomes larger than the original
digital object. In such a case, a packaging method can be considered.
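As a rough illustration of a capacity constraint, the sketch below estimates an upper bound for one simple embedding scheme. It assumes plain least-significant-bit (LSB) embedding into an uncompressed raster image, with a fixed number of payload bits per colour channel; the function names are hypothetical, and real algorithms may reserve space for headers or error correction.

```python
def lsb_capacity_bytes(
    width: int, height: int, channels: int = 3, bits_per_channel: int = 1
) -> int:
    """Upper bound on payload bytes for plain LSB embedding (assumed scheme)."""
    if min(width, height, channels, bits_per_channel) < 1:
        raise ValueError("all arguments must be positive")
    total_bits = width * height * channels * bits_per_channel
    return total_bits // 8


def fits(payload_size: int, width: int, height: int, **kwargs: int) -> bool:
    """Check whether a payload of `payload_size` bytes fits into the carrier."""
    return payload_size <= lsb_capacity_bytes(width, height, **kwargs)
```

Raising `bits_per_channel` increases capacity, but it also increases the chance that the payload becomes perceptible.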
● Processability and robustness
● Complexity (space/time) of the algorithm
● Used disk space of the output
● Restorability of the carrier
● Risk of data loss
● Perceptibility
● Location of the encapsulated information
● Spreading, standardization
● Security, confidentiality
● Authenticity
Scenario criteria
• To be re-usable, digital objects need to be processable normally by
applications, unhindered by their encapsulated information.
Processability
• Packaging techniques might require unpacking before processing
the digital object, thus consuming additional calculation power.
• Embedding techniques do not change the file format of a digital
object which therefore can usually be processed directly.
• A method that allows for encapsulated information to survive
processing steps is considered “robust”.
• In an ideal case the embedded metadata survives even file format
conversions.
• Robustness can be strong or weak. Weak robustness implies that an additional
extraction step is needed before processing to keep the metadata safe.
Robustness
• In a scenario where the digital object is frequently viewed and
processed, the metadata has to be embedded with a robust method.
• Steganographic methods are often very robust:
• they take an attacker into account.
• the digital objects can be processed normally, because a usage restriction would betray the
hidden messages.
• Visible digital watermarks can be very robust.
• Imperceptible watermarks are often fragile or semi-fragile,
to allow the recognition of authenticity violations.
• The use of available metadata fields is a very robust method that can
even survive object conversions. This can hold for information frames, too.
Parameters for calculating costs for algorithmic calculations and
resources for the encapsulation process:
Complexity of algorithms
● The time and space requirements related to the complexity of the algorithm
used for the encapsulation and recovery of the original digital object and
the environment information
● This includes costs for validation calculations and for the unpacking
algorithms of packaging strategies.
● Big O notation can help to express the algorithm's behaviour in relation to
the embedding payload size:
http://web.mit.edu/16.070/www/lecture/big_o.pdf
● Frequent use of a digital object and its metadata requires a faster
restoration time.
• The time for decompression has to be added if the data was compressed.
• Packaging mostly needs a lot more time for this than embedding.
• With robust methods, an extraction is not necessary before the reuse of the
object.
• Editing of available metadata fields is integrated in many programs, and is
therefore not very time intensive.
• The extension of an information frame in itself is not necessarily time
intensive, whereas the embedding method used on the frame might be.
Complexity of algorithms (cont)
• The difference in disk space between the enriched digital object and
the original digital object can be an important parameter when
preserving large amounts of data.
Disk space requirements
• Some methods offer the possibility of compressing the data, so that disk
space can be saved.
• Packaging containers compress both payload and digital object.
• Embedding methods offering compression only compress the embedded
metadata and not the carrier.
• Packaging can save more disk space than embedding with compression.
• An integrity check is needed to detect possible damage during compression.
• Compression, decompression and validation require extra calculation time.
• Embedding methods mostly do not need much extra disk space.
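The interplay of compression and integrity checking can be sketched as follows. This is a minimal example assuming zlib (DEFLATE) compression and a SHA-256 checksum of the uncompressed payload; the source does not prescribe either choice, and the function names are hypothetical.

```python
import hashlib
import zlib
from typing import Tuple


def compress_payload(payload: bytes) -> Tuple[bytes, str]:
    """Compress the payload and return it with the SHA-256 of the original."""
    return zlib.compress(payload, level=9), hashlib.sha256(payload).hexdigest()


def decompress_payload(compressed: bytes, expected_sha256: str) -> bytes:
    """Decompress and validate the payload against the stored checksum."""
    payload = zlib.decompress(compressed)
    if hashlib.sha256(payload).hexdigest() != expected_sha256:
        raise ValueError("payload does not match the stored checksum")
    return payload
```

The checksum has to be stored alongside the compressed payload, which adds a small fixed overhead to the space saved.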
Disk space requirements (cont.)
• With steganography algorithms changing only single bits, the size of
the digital object remains constant. This method, however, limits the
capacity.
• The use of available metadata fields doesn’t need much disk space.
Compression can extend the capacity for these methods.
• Information frames need additional disk space proportional to the size of
the metadata files that should be stored.
• The encapsulation method has to ensure that the digital object and the
metadata can be restored.
• The digital object has to be restored in its original state unscathed.
• The integrity has to be verified by checksum comparisons.
• There are different levels of integrity, e.g. ensuring only that the significant
properties survive, or requiring a bit-exact replica.
• The significant properties have to survive in any case.
• A validation of the whole object is often simpler than the validation of the
significant properties.
• It is highly improbable that packaging damages the digital object.
Validation is easy if the checksum was added to the metadata file.
Restoration
• Not all algorithms that are used for embedding are completely
reversible.
• Reversible embedding algorithms often embed the information of
how to reverse them into a defined location of the digital object.
• If metadata is converted during encapsulation, for example by
compression, encryption, or format conversion, it might be
necessary to validate the metadata.
• The methods using available metadata fields or information frames
offer easy restoration.
Restoration (cont.)
• The following factors increase the risk of damage for the digital object or the
metadata:
•encryption usage
•information hiding
•compression
•processing
•conversion of the digital object
Risk of data loss
• Packaging stores the metadata in separate files. This guarantees access to the
encapsulated information, which in turn may help identify the related digital data.
• Data containers used for packaging mostly have standard formats that are
not as vulnerable as non-standardised formats.
• At the same time the risk of data loss is higher for separated files when
unpacking.
• For some embedding methods object modifications are inevitable.
• The term ’Data Hiding’ describes methods to embed information in such a way that it
is not perceptible to humans.
• Steganography and invisible digital watermarks are mostly detectable by
machines.
• For most preservation scenarios it is necessary to be able to detect the
encapsulated information.
• Data hiding increases the risk of losing the knowledge about the existence
of this data.
• To avoid this, the carrier can be tagged with a visible method.
• Packaging is always visible, whereas steganographic methods are usually
invisible.
Visibility
• The location where the metadata is encapsulated can be a decision
criterion:
•a separate file
•an exact location in a file
•the time dimension of an audio file
•the background noise.
Location
• The location of storage is the main difference between packaging
and embedding methods:
•Packaging stores the information in a separate file, mostly in a standardised XML
format.
•Embedding stores the environment information directly in the digital object.
• Embedding into the background noise has no influence on the
significant properties and doesn't need additional disk space.
• For this, the noise has to be clearly identifiable, to prevent damage to
the digital object.
• Using available metadata fields, or an information frame, does not
influence the significant properties of the digital object directly.
• Some embedding methods store information by changing elements of
the object, e.g. by inverting single bits of an image pixel, or by using
an imperceptible frequency of audio files.
• Some data formats offer extra space for the storage of additional
information.
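Changing single bits of the object can be illustrated with a least-significant-bit (LSB) sketch. It assumes the carrier is a sequence of raw sample bytes (for instance uncompressed pixel values), not a complete file with headers, and that the payload length is known at extraction time. The function names are hypothetical, and this simple variant is not reversible, since the carrier's original low bits are overwritten.

```python
def embed_lsb(carrier: bytes, payload: bytes) -> bytes:
    """Write the payload bits into the lowest bit of successive carrier bytes."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("payload exceeds carrier capacity")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_lsb(carrier: bytes, payload_len: int) -> bytes:
    """Read `payload_len` bytes back from the lowest bits of the carrier."""
    if payload_len * 8 > len(carrier):
        raise ValueError("carrier too short for requested payload length")
    result = bytearray()
    for i in range(payload_len):
        value = 0
        for j in range(8):
            value = (value << 1) | (carrier[i * 8 + j] & 1)
        result.append(value)
    return bytes(result)


# Usage: the carrier keeps its size, and each byte changes by at most 1.
samples = bytes(range(256))
stego = embed_lsb(samples, b"IE")
recovered = extract_lsb(stego, 2)
```

Each payload byte consumes eight carrier bytes here, which is why such methods keep the object size constant but limit the capacity.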
Location (cont.)
• Some encapsulation tools offer security features, like encryption.
Security
• If encryption requires a secret key for accessing the data, there is a
high risk of losing the data by losing the key.
• The preservation and re-use of confidential objects or encapsulated
information requires adequate care. Encryption makes sense
for this purpose.
• If no additional encryption is used, the confidentiality of steganographic
methods is based on withholding knowledge or authorisation.
This therefore constitutes only a very weak kind of confidentiality.
• Authenticity and integrity of the digital object and its environment
information are paramount for many usages.
• Authenticity can be important if the digital object has special legal
requirements.
• Fragile or semi-fragile digital watermarks can be used in some
cases to ensure the integrity of a delivery copy of an object; the
watermarking algorithm changes the object slightly.
• The mark would be destroyed if the file were altered, so an
intact mark indicates that no third party changed the object.
• Authenticity plays a major role in the archive context, in which
the provenance and chain of custody of an object are also important.
Authenticity
• To guarantee the integrity of a digital object, it is often kept apart in
its original state and context; all changes to the original are avoided.
Integrity
• The BagIt directory structure can be used without applying an
additional packaging or compression method to prevent object
alterations.
• Metadata is added into other defined directories of the structure, so
that the digital object remains untouched, even when information
is added at a future date.
• The integrity of the encapsulated information can be verified by
adding the checksums of the originals to the restoration metadata.
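A BagIt-style layout with checksums can be sketched using only the Python standard library. This is a simplified illustration based on the basic layout of the BagIt specification (a `data/` payload directory, a `bagit.txt` declaration and a `manifest-sha256.txt`); it handles flat file names only, omits optional tag files and additional metadata directories, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict


def make_minimal_bag(bag_dir: Path, payload_files: Dict[str, bytes]) -> None:
    """Write payload files under data/ plus bagit.txt and a SHA-256 manifest."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for name, content in sorted(payload_files.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )


def verify_bag(bag_dir: Path) -> bool:
    """Recompute each payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True
```

The payload files are written once and never modified afterwards; later additions would go into separate tag files or directories, so the manifest checksums of the originals stay valid.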

PERICLES Policy management & ontology supported preservation - Acting on Chan...
 
PERICLES Modelling Policies - Acting on Change 2016
PERICLES Modelling Policies - Acting on Change 2016PERICLES Modelling Policies - Acting on Change 2016
PERICLES Modelling Policies - Acting on Change 2016
 
PERICLES Ecosystem Modelling (NCDD use case) - Acting on Change 2016
PERICLES Ecosystem Modelling (NCDD use case) - Acting on Change 2016PERICLES Ecosystem Modelling (NCDD use case) - Acting on Change 2016
PERICLES Ecosystem Modelling (NCDD use case) - Acting on Change 2016
 
PERICLES Process Compiler - ‘Eye of the Storm: Preserving Digital Content in ...
PERICLES Process Compiler - ‘Eye of the Storm: Preserving Digital Content in ...PERICLES Process Compiler - ‘Eye of the Storm: Preserving Digital Content in ...
PERICLES Process Compiler - ‘Eye of the Storm: Preserving Digital Content in ...
 

Recently uploaded

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Recently uploaded (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

PERICLES - Choice of Information Encapsulation (IE) Technique

  • 1. Choice of IE Technique
      Anna-Grit Eggers (University of Goettingen)
      GRANT AGREEMENT: 601138 | SCHEME FP7 ICT 2011.4.3
      Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics [Digital Preservation]
      “This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 601138.”
  • 3. Range of uses
      • IE techniques cover a wide range of uses, which differ regarding:
          • processing velocity
          • required disk space
          • location of storage
          • accessibility and perceptibility of the payload (by humans or by machine analysis)
          • preservation level of the carrier / digital object
          • processability of digital object and payload file formats and file sizes
          • provided compression mechanisms
  • 4. Criteria
      • We identified criteria to distinguish between the techniques.
      • An encapsulation technique that fits a specific use scenario can be chosen based on the technique-specific characteristics of these criteria.
      • Definition of criterion in this context: a property or feature of information encapsulation techniques that can be used to compare different techniques on the basis of the criterion characteristics.
      • For example:
          • Robustness of the encapsulated information towards processing of the carrier after encapsulation with an algorithm.
          • Perceptibility of the encapsulated information when observing the carrier.
  • 5. Criterion characteristics
      • Examples of characteristics for the two example criteria:
      • The characteristic of a technique for the criterion “Robustness” can be:
          • “robust” (“true”)
          • “not robust” (“false”)
      • The characteristic of a technique for the criterion “Perceptibility” can be:
          • “visible”
          • “not perceptible by humans”
          • “detectable by computers”
          • “not perceptible at all”
      • The assignment for this criterion is harder, because the threshold for “human perceptibility” or “computer detectability” is blurred and not a “true/false” value.
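
The relation between criteria and characteristics can be sketched as a small data structure. This is only an illustration: the class names and the three example assignments are assumptions (loosely based on statements made later in the slides), not part of the PERICLES tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Perceptibility(Enum):
    """Characteristics of the 'Perceptibility' criterion, as listed on the slide."""
    VISIBLE = "visible"
    NOT_PERCEPTIBLE_BY_HUMANS = "not perceptible by humans"
    DETECTABLE_BY_COMPUTERS = "detectable by computers"
    NOT_PERCEPTIBLE_AT_ALL = "not perceptible at all"


@dataclass(frozen=True)
class TechniqueProfile:
    """Hypothetical record: one technique's characteristic per criterion."""
    name: str
    robust: bool                    # 'Robustness' is a true/false criterion
    perceptibility: Perceptibility  # 'Perceptibility' has graded values


# Illustrative assignments only; real values depend on the concrete algorithm.
profiles = [
    TechniqueProfile("visible watermark", True, Perceptibility.VISIBLE),
    TechniqueProfile("imperceptible watermark", False,
                     Perceptibility.NOT_PERCEPTIBLE_BY_HUMANS),
    TechniqueProfile("steganography", True,
                     Perceptibility.NOT_PERCEPTIBLE_BY_HUMANS),
]

# Compare techniques on the basis of one criterion's characteristics.
visible = [p.name for p in profiles if p.perceptibility is Perceptibility.VISIBLE]
robust = [p.name for p in profiles if p.robust]
print(visible, robust)
```
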
  • 8. Creating a scenario
      • IE techniques are used for a specific purpose in the context of a use scenario.
      • This usage scenario defines the features and characteristics that a potential IE technique should fulfil.
      • The overall task is to find the best IE technique by capturing and evaluating the scenario defined by the user.
      • The user could aim to encapsulate messages or metadata with digital files.
      • The aim could also be to add legal information, ownership information, corporate designs, or information to ensure the authenticity of a digital object.
  • 9. Creating a scenario (cont.)
      • Some scenarios need a visible payload; in others it is preferable to hide it.
      • The most valuable information might be the digital object or the payload itself (the latter is often the case with steganographic messages).
  • 10. Implementing a scenario
      • Three procedures are required to implement the scenario:
          • scenario capturing
          • weighting
          • decision calculation mechanism
      • Capturing: a questionnaire that asks for the importance of a set of scenario criteria for the given scenario. The criteria are chosen so that they can be mapped to the features of the IE techniques.
      • The number of investigated criteria correlates with the number of available IE techniques: it should be high enough to distinguish between all main techniques, but low enough that the user is not overwhelmed while filling in the questionnaire.
  • 11. Implementing a scenario
      • Weighting: another crucial aspect is to ask the user
          • which of the criteria are important for the scenario,
          • how important they are,
          • which should be excluded because they are not pertinent.
      • The Analytic Hierarchy Process is a sophisticated but complex method: criteria are compared pairwise, and the user has to decide for each comparison which criterion is the more important one.
      • A simpler approach is to include an option to exclude unimportant criteria and to add a weighting mechanism with which the user indicates how desirable a characteristic of a criterion is, or how important it is for the scenario.
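
The simpler weighting approach could be sketched as a weighted average in which a weight of 0 excludes a criterion. The technique names, criterion names and scores below are invented for illustration; the slides do not prescribe a particular formula, so the weighted average itself is an assumption.

```python
# Hypothetical characteristic scores in [0, 1] per technique and criterion
# (1 = fully matches the desired characteristic). Values are illustrative.
technique_scores = {
    "packaging":     {"robustness": 0.5, "disk_space": 0.4, "perceptibility": 1.0},
    "steganography": {"robustness": 0.9, "disk_space": 1.0, "perceptibility": 0.0},
}


def rank_techniques(scores, weights):
    """Rank techniques by weighted average; criteria with weight 0 are excluded."""
    active = {criterion: w for criterion, w in weights.items() if w > 0}
    total = sum(active.values())
    if total == 0:
        raise ValueError("at least one criterion needs a positive weight")
    ranking = []
    for name, characteristics in scores.items():
        value = sum(w * characteristics[c] for c, w in active.items()) / total
        ranking.append((name, value))
    # Highest score first.
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Weights as they might come out of the questionnaire; 0 means "not pertinent".
weights = {"robustness": 3, "disk_space": 1, "perceptibility": 0}
ranking = rank_techniques(technique_scores, weights)
print(ranking)
```

With these assumed numbers, perceptibility plays no role in the result because the user excluded it, which is the behaviour the exclusion option is meant to provide.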
  • 12. Decision criteria
      ● There are two types of criteria to consider:
      • Technical criteria
          • must be fulfilled to be able to use a specific algorithm.
          • File formats are an example of a technical criterion, because some algorithms can only be used for specific file formats.
      • Scenario criteria
          • depend on a usage scenario or the user preferences.
  • 13. Technical decision criteria
      Technical criteria:
          ◦ File formats (for carrier files as well as the payload files)
          ◦ Number of files
          ◦ Capacity
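
Because technical criteria must be fulfilled, they can be treated as a hard filter that is applied before any weighting. The following is a minimal sketch under that assumption; the `Technique` record, the supported formats and the 4096-byte limit are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Technique:
    """Hypothetical description of a technique's technical constraints."""
    name: str
    carrier_formats: frozenset        # empty set = no format restriction (assumption)
    max_payload_bytes: Optional[int]  # None = no payload size limit


def usable(technique, carrier_format, payload_bytes):
    """True if the technique satisfies the format and capacity constraints."""
    format_ok = (not technique.carrier_formats
                 or carrier_format in technique.carrier_formats)
    capacity_ok = (technique.max_payload_bytes is None
                   or payload_bytes <= technique.max_payload_bytes)
    return format_ok and capacity_ok


techniques = [
    Technique("packaging", frozenset(), None),
    Technique("image bit embedding", frozenset({"png", "bmp"}), 4096),
]

# A 10,000-byte payload for a PNG carrier rules out the size-limited method.
candidates = [t.name for t in techniques if usable(t, "png", 10_000)]
print(candidates)
```
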
  • 14. Technical decision criteria
      File formats
      • Embedding algorithms usually support a set of file formats and cannot be used on files in other formats.
      • For packaging, the metadata has to be mapped to one of the standard XML packaging formats. The created package reduces the risk of data loss.
      Number of files
      • A growing number of files increases the risk of losing one of them.
      • With more than one file, the files can help identify the files that belong together by providing indications of the formats used. That reduces the impact of file format obsolescence.
  • 15. Technical criteria (cont.)
      Capacity
      • Capacity is the message size constraint: the number of payload bits that an embedding algorithm can embed into a specific digital object.
      • It is influenced by the data format and the method used. With some methods, the risk of damaging the carrier file increases, or the payload becomes visible, if the message size is too big.
      • While packaging methods have no limit for the size of the payload files, embedding methods mostly have a maximum payload size.
      • With invisible watermarking and steganographic embedding methods, not only does the payload become visible if it is too large; the cost of the algorithmic calculations also increases strongly with the size of the payload.
      • The use of an information frame can theoretically scale well to large payload files. However, it might be unproductive if the frame outsizes the original digital object. In such a case, the use of a packaging method can be considered.
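
As a worked example of the capacity constraint, consider a method that replaces the lowest bit of every colour sample of an uncompressed raster image. This specific method and the function name are assumptions for illustration; other embedding algorithms have different capacity formulas.

```python
def lsb_capacity_bytes(width, height, channels=3, bits_per_sample=1):
    """Payload capacity when `bits_per_sample` low-order bits of every
    colour sample are replaced (uncompressed raster image assumed)."""
    total_bits = width * height * channels * bits_per_sample
    return total_bits // 8


# 1024 x 768 RGB image: 1024 * 768 * 3 = 2,359,296 bits = 294,912 bytes.
capacity = lsb_capacity_bytes(1024, 768)

payload_size = 500_000  # bytes of metadata to encapsulate (hypothetical)
fits = payload_size <= capacity
print(capacity, fits)
```

Here the payload exceeds the capacity, so under these assumptions an embedding method of this kind would be ruled out and a packaging method could be considered instead.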
  • 16. Scenario criteria
      ● Processability and robustness
      ● Complexity (space/time) of the algorithm
      ● Used disk space of the output
      ● Restorability of the carrier
      ● Risk of data loss
      ● Perceptibility
      ● Location of the encapsulated information
      ● Spreading, standardization
      ● Security, confidentiality
      ● Authenticity
  • 17. Processability
      • To be re-usable, digital objects need to be processable normally by applications, unhindered by their encapsulated information.
      • Packaging techniques might require unpacking before processing the digital object, thus consuming additional calculation power.
      • Embedding techniques do not change the file format of a digital object, which can therefore usually be processed directly.
      • A method that allows encapsulated information to survive processing steps is considered “robust”.
      • In an ideal case the embedded metadata survives even file format conversions.
  • 18. Robustness
      • Robustness can be strong or weak. Weakness implies an additional extraction step to keep the metadata safe.
      • In a scenario where the digital object is frequently viewed and processed, the metadata has to be embedded with a robust method.
      • Steganographic methods are often very robust:
          • they take an attacker into account;
          • the digital objects can be processed normally, because a usage restriction would betray the hidden messages.
      • Visible digital watermarks can be very robust.
      • Imperceptible watermarks are often fragile or semi-fragile, to allow the recognition of authenticity violations.
      • The use of available metadata fields is a very robust method that even allows object conversions. This can be valid for information frames, too.
  • 19. Complexity of algorithms
      Parameters for calculating the costs of algorithmic calculations and resources for the encapsulation process:
      ● The time and space requirements related to the complexity of the algorithm used for the encapsulation and for the recovery of the original digital object and the environment information.
      ● This includes costs for validation calculations and for the unpacking algorithms of packaging strategies.
      ● Big O notation can help to express the algorithm's behaviour in relation to the embedded payload size: http://web.mit.edu/16.070/www/lecture/big_o.pdf
      ● Frequent use of a digital object and its metadata requires a faster restoration time.
  • 20. Complexity of algorithms (cont.)
      • The time for decompression has to be added if the data was compressed.
      • Packaging mostly needs a lot more time for this than embedding.
      • With robust methods, an extraction is not necessary before the reuse of the object.
      • Editing of available metadata fields is integrated in many programs, and is therefore not very time-intensive.
      • The extension of an information frame is in itself not necessarily time-intensive, whereas the embedding method used on the frame might be.
  • 21. Disk space requirements
      • The difference in disk space needed for the enriched digital object compared with the original digital object can be an important parameter when preserving a large amount of data.
      • Some methods offer the possibility of compressing the data, so that disk space can be saved.
      • Packaging containers compress both payload and digital object.
      • Embedding methods offering compression only compress the embedded metadata and not the carrier.
      • Packaging can save more disk space than embedding with compression.
      • An integrity check is needed to detect possible damage during compression.
      • Compression, decompression and validation require extra calculation time.
  • 22. Disk space requirements (cont.)
      • Embedding methods mostly do not need much extra disk space.
      • With steganography algorithms that change only single bits, the size of the digital object remains constant. This method, however, limits the capacity.
      • The use of available metadata fields does not need much disk space. Compression can extend the capacity for these methods.
      • Information frames need additional disk space proportional to the size of the metadata files that should be stored.
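
The difference between compressing only the metadata (embedding with compression) and compressing object and metadata together (packaging) can be illustrated with a toy calculation. The byte strings are stand-ins, `zlib` is used only as an example compressor, and real container formats add headers that this sketch ignores.

```python
import zlib

# Stand-ins for illustration: a highly repetitive "digital object" and
# repetitive XML-like metadata. Real data will compress differently.
carrier = bytes(range(256)) * 64                    # 16,384 bytes
metadata = b"<creator>example</creator>\n" * 200

uncompressed_size = len(carrier) + len(metadata)

# Embedding with compression: only the metadata is compressed,
# the carrier keeps its size.
embedded_size = len(carrier) + len(zlib.compress(metadata))

# Packaging: carrier and metadata are compressed together.
packaged_size = len(zlib.compress(carrier + metadata))

print(uncompressed_size, embedded_size, packaged_size)
```

With a carrier that is already in a compressed format (a JPEG image, for instance), packaging would gain far less, so the outcome depends strongly on the data.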
  • 23. Restoration
      • The encapsulation method has to ensure that the digital object and the metadata can be restored.
      • The digital object has to be restored unscathed to its original state.
      • The integrity has to be verified by checksum comparisons.
      • There are different levels of integrity, e.g. just ensuring that the significant properties survive, or a bit-exact replica.
      • The significant properties have to survive in any case.
      • A validation of the whole object is often simpler than the validation of the significant properties.
      • It is highly improbable that packaging damages the digital object. A validation is easy if the checksum was added to the metadata file.
  • 24. Restoration (cont.)
      • Not all algorithms that are used for embedding are completely reversible.
      • Reversible embedding algorithms often embed the information on how to reverse them into a defined location of the digital object.
      • If metadata is converted during the encapsulation, for example by compression, encryption, or format conversion, it might be necessary to validate the metadata.
      • The methods using available metadata fields or information frames offer easy restoration.
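
The checksum comparison for a bit-exact restoration could look as follows. SHA-256 and the `restoration_metadata` dictionary are assumptions made for this sketch; the slides do not name a checksum algorithm or a metadata layout.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest used as the checksum of a digital object."""
    return hashlib.sha256(data).hexdigest()


# Before encapsulation: record the checksum of the original object
# in the (hypothetical) restoration metadata.
original = b"example digital object"
restoration_metadata = {"checksum_sha256": sha256_of(original)}


def is_bit_exact(restored: bytes, metadata: dict) -> bool:
    """Compare the restored object's checksum with the recorded one."""
    return sha256_of(restored) == metadata["checksum_sha256"]


print(is_bit_exact(original, restoration_metadata))         # unchanged object
print(is_bit_exact(original + b"x", restoration_metadata))  # altered object
```

This validates the whole object; checking only that the significant properties survived would need format-specific logic and is not covered here.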
  • 25. Risk of data loss
      • The following factors increase the risk of damage to the digital object or the metadata:
          • encryption usage
          • information hiding
          • compression
          • processing
          • conversion of the digital object
      • Packaging stores the metadata in separate files. This guarantees access to the encapsulated information, which in turn may help identify the related digital data.
      • Data containers used for packaging mostly have standard formats that are not as vulnerable as non-standardised formats.
      • At the same time, the risk of data loss is higher for separated files when unpacking.
      • For some embedding methods, object modifications are inevitable.
  • 26. Visibility
      • The term “Data Hiding” describes methods that embed information in such a way that it is not perceptible by humans.
      • Steganography and invisible digital watermarks are mostly detectable by machines.
      • For most preservation scenarios it is necessary to be able to detect the encapsulated information.
      • Data hiding increases the risk of losing the knowledge about the existence of this data.
      • To avoid this, the carrier can be tagged with a visible method.
      • Packaging is always visible, whereas steganographic methods are usually invisible.
  • 27. Location
      • The location where the metadata is encapsulated can be a decision criterion:
          • a separate file
          • an exact location within a file
          • the time dimension of an audio file
          • the background noise
      • The location of storage is the main difference between packaging and embedding methods:
          • Packaging stores the information in a separate file, mostly in a standardised XML format.
          • Embedding stores the environment information directly in the digital object.
  • 28. Location (cont.)
      • Embedding into the background noise has no influence on the significant properties and does not need additional disk space.
      • For this, the noise has to be clearly identifiable, to prevent damage to the digital object.
      • Using available metadata fields, or an information frame, does not influence the significant properties of the digital object directly.
      • Some embedding methods store information by changing elements of the object, e.g. by inverting single bits of an image pixel, or by using an imperceptible frequency of audio files.
      • Some data formats offer extra space for the storage of additional information.
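
Storing information by changing single bits can be sketched as least-significant-bit (LSB) replacement on a sequence of sample values. This is a deliberately simplified illustration with hypothetical function names; it works on raw bytes rather than a real image or audio format and stores no payload length or reversal information.

```python
def embed_lsb(carrier: bytes, payload: bytes) -> bytes:
    """Write the payload bits (most significant bit first) into the
    lowest bit of consecutive carrier samples."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("payload exceeds carrier capacity")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return bytes(out)


def extract_lsb(carrier: bytes, payload_length: int) -> bytes:
    """Read `payload_length` bytes back from the lowest bits."""
    out = bytearray()
    for i in range(payload_length):
        value = 0
        for sample in carrier[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


samples = bytes(range(200, 248))  # 48 stand-in sample values
marked = embed_lsb(samples, b"meta")
print(extract_lsb(marked, 4))
```

The carrier keeps its size and each sample changes by at most 1, but the original low bits are overwritten, so this particular sketch is not reversible.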
  • 29. Security
      • Some encapsulation tools offer security features, like encryption.
      • If encryption requires a secret key for accessing the data, there is a high risk of losing the data by losing the key.
      • The preservation and re-use of confidential objects or encapsulated information requires adequate care. For this purpose, encryption makes sense.
      • If no additional encryption is used, the confidentiality of steganographic methods is based only on the retention of knowledge or authorisation. In this respect, it constitutes a very weak kind of confidentiality.
  • 30. Authenticity
      • Authenticity and integrity of the digital object and its environment information are paramount for many usages.
      • Authenticity can be important if the digital object has special legal requirements.
      • Fragile or semi-fragile digital watermarks can be used in some cases to ensure the integrity of a delivery copy of an object; in the process, the object is changed slightly by the application of the watermarking algorithm.
      • The mark would be destroyed if the file were altered, so an intact mark can ensure that no third party changed the object.
      • Authenticity plays a major role in the archive context, in which the provenance and chain of custody of an object are also important.
  • 31. Integrity
      • To guarantee the integrity of a digital object, it is often kept apart in its original state and context; all changes to the original are avoided.
      • The BagIt directory structure can be used without applying an additional packaging or compression method, to prevent object alterations.
      • Metadata is added into other defined directories of the structure, so that the digital object remains untouched, even when information is added at a future date.
      • The integrity of the encapsulated information can be verified by adding the checksums of the originals to the restoration metadata.
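
A minimal BagIt-style layout with a checksum manifest can be sketched using only the standard library. The helper names are hypothetical and the sketch writes just the bag declaration, the `data/` payload directory and a SHA-256 payload manifest; it is not a complete implementation of the BagIt specification and adds no tag directories for metadata.

```python
import hashlib
import tempfile
from pathlib import Path


def write_minimal_bag(bag_dir: Path, payload: dict) -> None:
    """Store payload files untouched under data/ and record their checksums."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for name, content in payload.items():
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}\n")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8")
    (bag_dir / "manifest-sha256.txt").write_text(
        "".join(manifest_lines), encoding="utf-8")


def verify_bag(bag_dir: Path) -> bool:
    """Recompute each payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, relative_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / relative_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "example_bag"
    write_minimal_bag(bag, {"object.bin": b"digital object"})
    intact = verify_bag(bag)
    # Simulate an alteration of the preserved object.
    (bag / "data" / "object.bin").write_bytes(b"altered")
    altered_detected = not verify_bag(bag)

print(intact, altered_detected)
```
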