An Introduction to
PREMIS
Jenn Riley
Metadata Librarian
IU Digital Library Program
2/7/2007 Digital Library Brown Bag Series 2
Outline
 Background and context
 PREMIS data model
 PREMIS data dictionary
...
2/7/2007 Digital Library Brown Bag Series 3
What is preservation metadata?
 Information that “supports and documents the ...
2/7/2007 Digital Library Brown Bag Series 4
Some related work
 National Library of Australia
Preservation Metadata for Di...
2/7/2007 Digital Library Brown Bag Series 5
What is PREMIS?
 PREservation Metadata Implementation
Strategies
 A working ...
2/7/2007 Digital Library Brown Bag Series 6
Charge to the PREMIS working group
 define an implementable set of “core” pre...
2/7/2007 Digital Library Brown Bag Series 7
Working group structure
 Implementation Strategies Subgroup
 “examined vario...
2/7/2007 Digital Library Brown Bag Series 8
How PREMIS defines preservation
metadata
 “The information a repository uses ...
2/7/2007 Digital Library Brown Bag Series 9
PREMIS goals
 Build on the OAIS reference model
 Be implementation independe...
2/7/2007 Digital Library Brown Bag Series 10
Development strategies
 Paid particular attention to documenting
 digital p...
2/7/2007 Digital Library Brown Bag Series 11
Defining “core”
 “Things that most working preservation repositories
are lik...
2/7/2007 Digital Library Brown Bag Series 12
Outline
 Background and context
 PREMIS data model
 PREMIS data dictionary...
2/7/2007 Digital Library Brown Bag Series 13
The PREMIS data model
2/7/2007 Digital Library Brown Bag Series 14
Intellectual entities
 “A coherent set of content that is reasonably
describ...
2/7/2007 Digital Library Brown Bag Series 15
Objects
 “A discrete unit of information in digital
form”
 Is a static set ...
2/7/2007 Digital Library Brown Bag Series 16
File object
 “A named and ordered sequence of bytes that
is known by an oper...
2/7/2007 Digital Library Brown Bag Series 17
Bitstream object
 “Contiguous or non-contiguous data within a file
that has ...
2/7/2007 Digital Library Brown Bag Series 18
Representation object
 “The set of files, including structural
metadata, nee...
2/7/2007 Digital Library Brown Bag Series 19
Objects example: ETD
2/7/2007 Digital Library Brown Bag Series 20
Events
 The Event entity aggregates metadata about actions
that involve at l...
2/7/2007 Digital Library Brown Bag Series 21
Agents
 “A person, organization, or software program
associated with preserv...
2/7/2007 Digital Library Brown Bag Series 22
Rights
 Rights Statements “are assertions of one or more rights
or permissio...
2/7/2007 Digital Library Brown Bag Series 23
Relationships between entities
 Between objects
 Structural relationships
...
2/7/2007 Digital Library Brown Bag Series 24
Outline
 Background and context
 PREMIS data model
 PREMIS data dictionary...
2/7/2007 Digital Library Brown Bag Series 25
The PREMIS data dictionary
 Defines semantic units for:
 Objects
 Events
...
2/7/2007 Digital Library Brown Bag Series 26
Entries include information on:
 Name
 Semantic components
 Definition
 R...
2/7/2007 Digital Library Brown Bag Series 27
Sample data dictionary entry
2/7/2007 Digital Library Brown Bag Series 28
Outline
 Background and context
 PREMIS data model
 PREMIS data dictionary...
2/7/2007 Digital Library Brown Bag Series 29
Role of a preservation policy
 PREMIS helps a repository to implement a
pres...
2/7/2007 Digital Library Brown Bag Series 30
Relationship to technical metadata
 PREMIS semantic units restricted to:
 i...
2/7/2007 Digital Library Brown Bag Series 31
Other types of metadata
 Structural and rights metadata fall at least
partly...
2/7/2007 Digital Library Brown Bag Series 32
Lack of relevant content standards and
controlled vocabularies
 Only in some...
2/7/2007 Digital Library Brown Bag Series 33
PREMIS conformance
 “Local metadata can be used to extend but not
modify the...
2/7/2007 Digital Library Brown Bag Series 34
XML Schemas
 Literal representations of the semantic units and
attributes of...
2/7/2007 Digital Library Brown Bag Series 35
Outline
 Background and context
 PREMIS data model
 PREMIS data dictionary...
2/7/2007 Digital Library Brown Bag Series 36
Adoption
 Hard to tell, since preservation repositories operate
behind the s...
2/7/2007 Digital Library Brown Bag Series 37
Current PREMIS activity
 PREMIS Maintenance Activity hosted at the
Library o...
2/7/2007 Digital Library Brown Bag Series 38
So where are we?
 The data dictionary looks to be having a big
impact
 Disc...
2/7/2007 Digital Library Brown Bag Series 39
For more information
 PREMIS Working Group site:
<http://www.oclc.org/resear...
Upcoming SlideShare
Loading in …5
×

An Introduction to PREMIS

594 views

Published on

Riley, Jenn. "An introduction to PREMIS." Digital Library Program Brown Bag Presentation, February 7, 2007.

Published in: Education
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
594
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
6
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide
  • Cedars (CURL Exemplars in Digital ARchives) Project ran from April 1998 until March 2002.
    Funded by JISC (the Joint Information Systems Committee of the UK higher education funding councils), as part of its Electronic Libraries (eLib) Programme
  • for example, a particular book, map, photograph, or database
  • Example: multi-page TIFF, audio and video segments of an MPEG file
    Bitstream becomes a file with the addition of a header, decompression, etc.
    Tar -&amp;gt; TIFF -&amp;gt; image as several levels of bitstreams
  • The goal of many preservation repositories is to maintain usable versions of intellectual entities over time.
    However, not all preservation repositories will be concerned with representations. A repository might, for example, preserve file objects only and rely on external agents to assemble these objects into usable representations. If the repository does not manage representations, it does not need to record metadata about them.
  • This example is an electronic dissertation (ETD) stored in a repository called the Florida Digital Archive (FDA). The ETD consists of
    two files: a PDF and an MP3, which is actually an appendix to the dissertation. Because MP3 is not a format “preferred” by the
    repository, the FDA creates a WAVE file from the MP3.
    On ingest, the FDA will create three Object descriptions: one for the representation (the dissertation as a whole), one for the PDF, and
    one for the MP3. When the WAVE file is created, the FDA creates an Object description for the new WAVE file and an object
    description for the new representation, which consists of the original PDF and the WAVE.
  • Some repositories might record events that happened before ingestion into a repository
  • Permission is defined as “an agreement between the rights holder and the repository, allowing the repository to undertake some action”
  • Technical metadata describes the physical rather than intellectual
    characteristics of digital objects. Detailed, format-specific technical metadata is clearly necessary
    for implementing most preservation strategies, but the group had neither the time nor the
    expertise to tackle format-specific technical metadata for various types of digital files. Therefore,
    it restricted the technical metadata included in the Data Dictionary to the semantic units it
    believed apply to objects in all formats. Further development of technical metadata is left to
    format experts.
  • &amp;apos;significantProperties&amp;apos; and &amp;apos;preservationLevel&amp;apos;
  • An Introduction to PREMIS

    1. 1. An Introduction to PREMIS Jenn Riley Metadata Librarian IU Digital Library Program
    2. 2. 2/7/2007 Digital Library Brown Bag Series 2 Outline  Background and context  PREMIS data model  PREMIS data dictionary  Implementing PREMIS  Adoption and ongoing developments
    3. 3. 2/7/2007 Digital Library Brown Bag Series 3 What is preservation metadata?  Information that “supports and documents the digital preservation process”  The problem is, we don’t really know what that metadata looks like  Well, we know something, but not nearly enough  The community is thinking hard about the issues, but we still have very little real-world data
    4. 4. 2/7/2007 Digital Library Brown Bag Series 4 Some related work  National Library of Australia Preservation Metadata for Digital Collections (October 1999)  Reference Model for an Open Archival Information System (OAIS (June 2001)  RLG/OCLC report Trusted Digital Repositories: Attributes and Responsibilities (May 2002)  OCLC/RLG Metadata Framework to Support the Preservation of D (June 2002)  National Library of New Zealand Metadata Standards Framework –Preservation Metadata (November 2002)  RLG/NARA Audit Checklist for the Certification of Trusted Digital Repositories (August 2005)
    5. 5. 2/7/2007 Digital Library Brown Bag Series 5 What is PREMIS?  PREservation Metadata Implementation Strategies  A working group of over 30 members sponsored by OCLC and RLG  A data dictionary for preservation metadata included in the May 2005 final report of the working group
    6. 6. 2/7/2007 Digital Library Brown Bag Series 6 Charge to the PREMIS working group  define an implementable set of “core” preservation metadata elements, with broad applicability within the digital preservation community;  draft a Data Dictionary to support the core preservation metadata element set;  examine and evaluate alternative strategies for the encoding, storage, and management of preservation metadata within a digital preservation system, as well as for the exchange of preservation metadata among systems;  conduct pilot programs for testing the group’s recommendations and best practices in a variety of systems settings; and  explore opportunities for the cooperative creation and sharing of preservation metadata.
    7. 7. 2/7/2007 Digital Library Brown Bag Series 7 Working group structure  Implementation Strategies Subgroup  “examined various strategies for encoding, storing, and managing preservation metadata within digital preservation systems”  performed survey of existing and planned digital preservation systems  Core Elements Subgroup  defined core elements  drafted data dictionary
    8. 8. 2/7/2007 Digital Library Brown Bag Series 8 How PREMIS defines preservation metadata  “The information a repository uses to support the digital preservation process”  Metadata that supports  viability  renderability  understandability  authenticity  identity  Mandatory elements represent “the minimum amount for [a] second repository to accept custody of [a] digital object and assume responsibility for its long-term preservation”
    9. 9. 2/7/2007 Digital Library Brown Bag Series 9 PREMIS goals  Build on the OAIS reference model  Be implementation independent  “Provide a starting point for improvements and enhancements based on community experience and feedback”
    10. 10. 2/7/2007 Digital Library Brown Bag Series 10 Development strategies  Paid particular attention to documenting  digital provenance  relationships  “Whenever possible the group defined elements that do not require human intervention to supply or analyze,” but did not limit to these  Defined “semantic units” rather than “metadata elements”
    11. 11. 2/7/2007 Digital Library Brown Bag Series 11 Defining “core”  “Things that most working preservation repositories are likely to need to know in order to support digital preservation”  “Core does not necessarily mean mandatory”  “Core elements define information that a repository needs to know, regardless of how, or even whether, that information is stored”  Core elements support checking of:  Fixity – object is unchanged since some previous time  Integrity – compliant with relevant specifications  Authenticity – object is what it purports to be
    12. 12. 2/7/2007 Digital Library Brown Bag Series 12 Outline  Background and context  PREMIS data model  PREMIS data dictionary  Implementing PREMIS  Adoption and ongoing developments
    13. 13. 2/7/2007 Digital Library Brown Bag Series 13 The PREMIS data model
    14. 14. 2/7/2007 Digital Library Brown Bag Series 14 Intellectual entities  “A coherent set of content that is reasonably described as a unit”  Can include other Intellectual Entities  May have one or more digital representations  May not be managed by all repositories
    15. 15. 2/7/2007 Digital Library Brown Bag Series 15 Objects  “A discrete unit of information in digital form”  Is a static set of bits that cannot be modified  Three subtypes  File  Bitstream  Representation
    16. 16. 2/7/2007 Digital Library Brown Bag Series 16 File object  “A named and ordered sequence of bytes that is known by an operating system”  Defined like “file” in common usage  No restriction on format, etc.
    17. 17. 2/7/2007 Digital Library Brown Bag Series 17 Bitstream object  “Contiguous or non-contiguous data within a file that has meaningful common properties for preservation purposes”  Defined differently than common usage  Can’t span files  Must have some sort of reformatting to be made into a file  Weird exception: filestreams  Bitstreams that don’t need additional information to be transformed into a file  Follow all the file rules in the data dictionary, not the bitstream rules
    18. 18. 2/7/2007 Digital Library Brown Bag Series 18 Representation object  “The set of files, including structural metadata, needed for a complete and reasonable rendition of an Intellectual Entity”  More than one Representation may exist for each Intellectual Entity  Repository doesn’t necessarily have to track representations
    19. 19. 2/7/2007 Digital Library Brown Bag Series 19 Objects example: ETD
    20. 20. 2/7/2007 Digital Library Brown Bag Series 20 Events  The Event entity aggregates metadata about actions that involve at least one object or agent known to the preservation repository  Many types of Events might be of interest to a preservation repository  Creation of a new version of an object  Create/alter relationships  Validity/integrity checking  etc.  All Events have outcomes  Some Events have outputs
    21. 21. 2/7/2007 Digital Library Brown Bag Series 21 Agents  “A person, organization, or software program associated with preservation events in the life of an object”  Agents represented minimally in PREMIS  Means of identification  Classification as person, organization, or software  Assumes other initiatives will more fully define Agents  Agents influence Objects only indirectly through Events
    22. 22. 2/7/2007 Digital Library Brown Bag Series 22 Rights  Rights Statements “are assertions of one or more rights or permissions pertaining to an Object and/or Agent”  Semantic units related to rights restricted to those concerned with preservation activities  All expressible as “Agent A grants this permission for Object B.”  3 semantic units  allowed act  expiration date of the permission  all other terms, conditions, restrictions and/or limitations  Acknowledges much more work needs to be done
    23. 23. 2/7/2007 Digital Library Brown Bag Series 23 Relationships between entities  Between objects  Structural relationships  Derivation relationships  Dependency relationships  Others defined by data model indicated in data dictionary by linking attributes
    24. 24. 2/7/2007 Digital Library Brown Bag Series 24 Outline  Background and context  PREMIS data model  PREMIS data dictionary  Implementing PREMIS  Adoption and ongoing developments
    25. 25. 2/7/2007 Digital Library Brown Bag Series 25 The PREMIS data dictionary  Defines semantic units for:  Objects  Events  Agents  Rights  Intellectual Entity is out of scope because it is “well served by descriptive metadata”
    26. 26. 2/7/2007 Digital Library Brown Bag Series 26 Entries include information on:  Name  Semantic components  Definition  Rationale  Data constraint  Object category  Applicability  Examples  Repeatability  Obligation  Creation/Maintenance notes  Usage notes
    27. 27. 2/7/2007 Digital Library Brown Bag Series 27 Sample data dictionary entry
    28. 28. 2/7/2007 Digital Library Brown Bag Series 28 Outline  Background and context  PREMIS data model  PREMIS data dictionary  Implementing PREMIS  Adoption and ongoing developments
    29. 29. 2/7/2007 Digital Library Brown Bag Series 29 Role of a preservation policy  PREMIS helps a repository to implement a preservation policy; it doesn’t set that policy  Policy can be complicated  Is descriptive metadata part of an Intellectual Entity?  If so, should we treat it as a file?  Is PREMIS data itself a file (or a bitstream) that is managed by the repository?  etc., ad infinitum…  The data dictionary is only a starting point, does not include all information needed to preserve an Object
    30. 30. 2/7/2007 Digital Library Brown Bag Series 30 Relationship to technical metadata  PREMIS semantic units restricted to:  intellectual characteristics  characteristics common to all formats  Some overlap between PREMIS semantic units and elements defined by various technical metadata standards
    31. 31. 2/7/2007 Digital Library Brown Bag Series 31 Other types of metadata  Structural and rights metadata fall at least partly within the scope of PREMIS, but perhaps not entirely  Descriptive metadata is useful for defining Intellectual Entities, which are managed by some repositories
    32. 32. 2/7/2007 Digital Library Brown Bag Series 32 Lack of relevant content standards and controlled vocabularies  Only in some cases do definitions of semantic elements provide guidance on how to structure the data recorded  PREMIS semantic units largely outside the scope of most existing content standards  “PREMIS assumes that repositories will adopt or define controlled vocabularies useful to them”  Perhaps a common content standard isn’t needed, but the lack of one does mean more decisions have to be made when implementing a repository
    33. 33. 2/7/2007 Digital Library Brown Bag Series 33 PREMIS conformance  “Local metadata can be used to extend but not modify the PREMIS semantic units”  “The mandatory semantic units of the Data Dictionary represent the information that a preservation repository must be able to associate with any archived digital object in its possession”  Don’t have to store PREMIS information, just have to know it  Currently no formal means of stating or testing “conformance”
    34. 34. 2/7/2007 Digital Library Brown Bag Series 34 XML Schemas  Literal representations of the semantic units and attributes of the PREMIS data dictionary  Of use for exchange of preservation objects  Likely of less use for a repository’s internal representation  5 separate schemas  PREMIS container  Object entity  Event entity  Agent entity  Rights entity
    35. 35. 2/7/2007 Digital Library Brown Bag Series 35 Outline  Background and context  PREMIS data model  PREMIS data dictionary  Implementing PREMIS  Adoption and ongoing developments
    36. 36. 2/7/2007 Digital Library Brown Bag Series 36 Adoption  Hard to tell, since preservation repositories operate behind the scenes  PREMIS Implementation Registry currently has 8 diverse entries  Active implementer's discussion list  Working Group won the 2005 Digital Preservation Coalition Digital Preservation Award  Several PREMIS workshops scheduled
    37. 37. 2/7/2007 Digital Library Brown Bag Series 37 Current PREMIS activity  PREMIS Maintenance Activity hosted at the Library of Congress  Editorial Committee named  Commissioned report on Rights in the PREMIS Data Model  Proposals for revisions of two semantic units in public comment period
    38. 38. 2/7/2007 Digital Library Brown Bag Series 38 So where are we?  The data dictionary looks to be having a big impact  Discussion of preservation metadata is increasing  It seems the PREMIS goal of a “starting point” has been well fulfilled  PREMIS looks promising as a source of ideas for the IU DLP preservation repository
    39. 39. 2/7/2007 Digital Library Brown Bag Series 39 For more information  PREMIS Working Group site: <http://www.oclc.org/research/projects/pmwg/default.htm>  PREMIS Maintenance Activity site: <http://www.loc.gov/standards/premis/>  PREMIS Implementors Listserv: <http://listserv.loc.gov/listarch/pig.html>  These presentation slides: <http://www.dlib.indiana.edu/~jenlrile/presentations/bbspr07/premis/premis.ppt>  jenlrile@indiana.edu

    ×