StructuralMetadata
@azaroth42
robert.sanderson@yale.edu
Structural Metadata in RDF
IS 575: Metadata in Theory & Practice
University of Illinois, Urbana-Champaign
Dr. Robert Sanderson
10/20/2020
Structural Metadata?
• Descriptive:
About the intellectual resource
• Technical:
About the digital resource
• Administrative:
Access, Rights, Provenance
• Meta:
Metadata about the metadata record
• Structural:
Set Membership / Entity Partitioning
Structural Metadata in RDF
• has_member / member_of
• has_part / part_of
Thank you!
Questions?
Overview
• Structure is more than just membership/partitioning
• OAI - Object Reuse and Exchange
• Portland Common Data Model
• Web Annotation Model
• IIIF Presentation API
• Linked Open Usable Data / Linked.Art
• Conclusions
Challenges Raised by RDF
• Open World vs Local Structure:
• Order of entities
• Context-specific descriptive metadata
• Usability:
• Graph Boundaries
• Representation vs Resource
• API Interactions
Why Me?
Feel some degree of responsibility / blame:
• OAI-ORE: Editor
• PCDM: “Committer” (basically Editor)
• Web Annotation: Editor
• IIIF: Editor
OAI-ORE
http://www.openarchives.org/ore/1.0/datamodel
OAI-ORE Background
• Mellon Foundation grant 2006-2008
• Digital Library and Scholarly Communication focus
• Context:
• Aligning existing work: PMH, METS etc
• With the web: URIs, REST, Linked Open Data
• For interoperability of digital libraries:
• Scholarly communication
• Digital objects
• Research outputs
OAI-ORE Basic Structure
OAI-ORE Basic Structure
OAI-ORE Basic Structure
OAI-ORE Basic (?) Structure
Graph Boundary & API
Contextual Information
Ordering
Context Specific Descriptive Metadata
Proxies Considered Harmful
Proxies are a Usability nightmare!
• Now two places to look for all metadata
• Range/Domain inferences are out the window
• Can’t validate an application profile, as proxies are the
union of all other classes
• Can’t create a database structure other than triples
But at least they’re optional … no one will use them…
Right???
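To make the Proxy pattern concrete, here is a minimal sketch in Python, with triples as plain tuples. The example.org URIs are hypothetical, and using the IANA "next" relation to chain proxies is an assumption borrowed from common practice rather than something ORE itself mandates. Note that the order and the contextual title hang off the proxies, not the pages: the "two places to look" problem.

```python
ORE = "http://www.openarchives.org/ore/terms/"
IANA = "http://www.iana.org/assignments/relation/"
DCT = "http://purl.org/dc/terms/"

# Hypothetical URIs, for illustration only.
AGG = "https://example.org/agg/1"
PAGES = [
    "https://example.org/page/a",
    "https://example.org/page/b",
    "https://example.org/page/c",
]


def build_triples():
    """Aggregate the pages and give each one a proxy within AGG."""
    triples = set()
    proxies = [f"{AGG}/proxy/{i}" for i in range(len(PAGES))]
    for proxy, page in zip(proxies, PAGES):
        triples.add((AGG, ORE + "aggregates", page))
        triples.add((proxy, ORE + "proxyFor", page))
        triples.add((proxy, ORE + "proxyIn", AGG))
    # Order is asserted between proxies, so it only holds in this aggregation.
    for current, following in zip(proxies, proxies[1:]):
        triples.add((current, IANA + "next", following))
    # A context-specific title also lives on the proxy, not on the page.
    triples.add((proxies[0], DCT + "title", "Front cover"))
    return triples


def ordered_resources(triples, aggregation):
    """Walk the proxy chain and return the aggregated resources in order."""
    proxy_for = {s: o for s, p, o in triples if p == ORE + "proxyFor"}
    in_agg = {s for s, p, o in triples
              if p == ORE + "proxyIn" and o == aggregation}
    following = {s: o for s, p, o in triples
                 if p == IANA + "next" and s in in_agg}
    # The first proxy is the one no other proxy points to (assumes one chain).
    current = (in_agg - set(following.values())).pop()
    order = []
    while current is not None:
        order.append(proxy_for[current])
        current = following.get(current)
    return order
```

Even in this tiny case, recovering the order takes three lookups per step, which hints at why proxies are hard to use in practice.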
PCDM
• Portland Common Data Model
• Hosted by U/Oregon 11/2014, 2/2015
• Intent to produce common DL model
• Interoperability! Yay!
• Mostly RDF based systems – Fedora, etc.
• Simple as possible to ensure coverage
White Stag Data Model (WSDM)
https://www.pdxmonthly.com/producers/courtesy-historic-photo-archive-hugh-ackroyd
PCDM Model
PCDM Ordering Model
Alternative: ItemList / ListItem
See: https://schema.org/ListItem
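A rough sketch of the schema.org alternative, built as a JSON-LD structure in Python. The `@type`, `itemListElement`, `position` and `item` terms come from schema.org; the helper names and any URIs passed in are hypothetical.

```python
def item_list(name, uris):
    """Build a schema.org ItemList whose order is carried by ListItem.position."""
    return {
        "@context": "https://schema.org",
        "@type": "ItemList",
        "name": name,
        "itemListElement": [
            {"@type": "ListItem", "position": i, "item": {"@id": uri}}
            for i, uri in enumerate(uris, start=1)
        ],
    }


def in_order(doc):
    """Return the listed resources, sorted by their stated position."""
    elements = sorted(doc["itemListElement"], key=lambda e: e["position"])
    return [e["item"]["@id"] for e in elements]
```

The ListItem plays the same role as a Proxy: an intermediate node that holds the position so that the resource itself carries no context-specific order.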
Challenges: Model / Semantics
• Order via Proxies
• Unclear semantics of Collection / Object
• Both related via hasMember
• Specialization / classification by subclassing
• Leads to proliferation of classes
• And greatly reduced interoperability
• Not opinionated enough where it was needed
Challenges: Graph Boundaries
• Direction of relationships is from the whole to the part
• DL collections and objects can have MANY parts, making for
very long representations, especially in RDF/XML
• Same issue exists for ResourceMap / Aggregation
• Consider: digitized copy of War and Peace
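The effect of relationship direction on representation size can be sketched as below. The page count is a made-up figure for illustration, not a real count for any edition, and the URIs are hypothetical; `hasMember` and `memberOf` are the PCDM property names.

```python
PCDM = "http://pcdm.org/models#"

# Hypothetical book URI and page count, for illustration only.
BOOK = "https://example.org/book/war-and-peace"
PAGES = [f"{BOOK}/page/{n}" for n in range(1, 1201)]


def whole_to_part(book, pages):
    """Triples in the book's representation when the whole lists its parts."""
    return [(book, PCDM + "hasMember", page) for page in pages]


def part_to_whole(book, page):
    """Triples in one page's representation when the part points to the whole."""
    return [(page, PCDM + "memberOf", book)]
```

With whole-to-part links the book's representation grows with every page; with part-to-whole links each page carries one triple, but finding all the pages then needs a query or an index.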
API – Linked Data Platform
• PCDM was developed in the context of the then new
Linked Data Platform Specification
https://www.w3.org/TR/ldp/
• Attempted to provide C/R/U/D specification for LOD
• REST-based (HTTP POST, GET, PUT, DELETE methods)
• Implemented in Fedora4 (and lots of others)
• Suffered from lack of clear vision in W3C WG – ended up
trying to meet competing goals
API – Linked Data Platform
• Did not solve core challenges, leaving implementations
either not interoperable, or not functional
• Authentication (needed for write operations)
• Paging of large resources (c.f. downward relationships)
• Graph boundary conditions
• Did introduce useful notion of “containers”:
• Writing to a container could create additional triples
• Containers were resources, configured in triples
Web Annotation
As LDP was finishing, Web Annotation WG was starting...
• 2001 Annotea
• 2009 Open Annotation Collaboration &
Annotation Ontology
• 2011 Open Annotation Community Group
• 2014 Web Annotation Working Group
• 2017 Technical Recommendations
Web Annotation Model
Web Annotation Model
Web Annotation Model
Web Annotation Model
Web Annotation Workflow
Web Annotation Workflow
Web Annotation Workflow
Web Annotation Workflow
Web Annotation Workflow
Web Annotation Workflow
Avoided Challenges
Used Motivation vocabulary instead of subclassing:
• assessing
• bookmarking
• classifying
• commenting
• describing
• editing
• highlighting
• identifying
• linking
• moderating
• questioning
• replying
Avoided Challenges
Specific Resource is an inline, opinionated Proxy
Ordering not important (for 95% of the use cases)
Graph boundaries (relatively) easy
String metadata was avoided by having an explicit Textual
Body resource, not just the string itself.
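The following sketch pulls these choices together in one annotation: a motivation instead of a subclass, a TextualBody instead of a bare string, and a SpecificResource (source plus selector) as the inline proxy. Property names follow the Web Annotation JSON-LD context; the helper name and the URIs are hypothetical, and the motivation set is limited to the twelve named on the earlier slide.

```python
# The twelve motivations named on the slide.
MOTIVATIONS = {
    "assessing", "bookmarking", "classifying", "commenting",
    "describing", "editing", "highlighting", "identifying",
    "linking", "moderating", "questioning", "replying",
}


def make_annotation(anno_id, motivation, text, source, fragment):
    """Build one annotation as a JSON-LD structure (a plain dict)."""
    if motivation not in MOTIVATIONS:
        raise ValueError(f"unknown motivation: {motivation}")
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        # Classification by vocabulary term, not by a new Annotation subclass.
        "motivation": motivation,
        # The comment is a resource in its own right, so it can carry metadata.
        "body": {"type": "TextualBody", "value": text, "format": "text/plain"},
        # SpecificResource: the source narrowed by a selector, stated inline.
        "target": {
            "type": "SpecificResource",
            "source": source,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": fragment,
            },
        },
    }
```

Because the SpecificResource sits inside the annotation, there is no separate proxy document to fetch, which keeps the graph boundary simple.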
Annotation API – Opinionated LDP
• Submit / Return all of annotation data, not per subject URI
• Use ActivityStreams paging mechanism
• Allow just URI reference
• Or full representation of Anno
• Use JSON-LD!
Still didn’t solve authentication!
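A sketch of the paging pattern, assuming the AnnotationCollection / AnnotationPage shapes from the Web Annotation specifications. The `?page=N` URL pattern and the `embed` switch are assumptions made for this example; a real server chooses its own page URIs and lets the client express its preference in the request.

```python
ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def paginate(collection_id, annotations, page_size, embed=True):
    """Split annotations into linked pages; embed them fully or by URI only."""
    pages = []
    for start in range(0, len(annotations), page_size):
        chunk = annotations[start:start + page_size]
        pages.append({
            "@context": ANNO_CONTEXT,
            "id": f"{collection_id}?page={len(pages)}",  # hypothetical pattern
            "type": "AnnotationPage",
            "partOf": collection_id,
            "startIndex": start,
            "items": chunk if embed else [anno["id"] for anno in chunk],
        })
    # Link neighbouring pages so a client can walk the collection.
    for page, following in zip(pages, pages[1:]):
        page["next"] = following["id"]
        following["prev"] = page["id"]
    collection = {
        "@context": ANNO_CONTEXT,
        "id": collection_id,
        "type": "AnnotationCollection",
        "total": len(annotations),
    }
    if pages:
        collection["first"] = pages[0]["id"]
        collection["last"] = pages[-1]["id"]
    return collection, pages
```

The collection document stays small however many annotations exist, which is the paging answer that LDP left open.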
Annotations as Structural MD?
Annotations as comments aren’t really structural
But Annotation Model is flexible – an overlay on the web,
including the web of data. Could create new relationships,
and challenge was avoiding reinventing RDF in RDF or
reification.
IIIF – Presentation API
Annotations are a fundamental part of IIIF Presentation API
Provide only the information
necessary for an application to
present the object to the user
* Removed in 3.0
IIIF – Canvas and Content
IIIF Presentation API 3.0
Published June 2020!
• Adds time dimension to the canvas for A/V material
• Simplifies structure by removing Sequence
• Uses Web Annotation model
• Uses JSON-LD 1.1
• Aim: Developer happiness
Challenges?
Graph Boundary:
• Collections cannot embed collections (cf ORE)
• Manifests embed all structural components (cf PCDM)
• Text (via Annotations) is not embedded in Manifest
• Image/Video is embedded for UX
• More explicit definitions of boundaries in 3.0
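A minimal sketch of where these boundaries fall in a Presentation API 3.0 Manifest with a single Canvas. The structure and property names follow the 3.0 specification as I understand it; the helper name, the URI layout under `base`, and the JPEG format are assumptions for the example.

```python
def make_manifest(base, label, image_url, width, height):
    """One-canvas Manifest: image embedded, text annotations only referenced."""
    canvas_id = f"{base}/canvas/1"
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": f"{base}/manifest",
        "type": "Manifest",
        "label": {"en": [label]},
        "items": [{
            "id": canvas_id,
            "type": "Canvas",
            "width": width,
            "height": height,
            # Embedded: the painting annotation that puts the image on the canvas.
            "items": [{
                "id": f"{canvas_id}/page/1",
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{canvas_id}/painting/1",
                    "type": "Annotation",
                    "motivation": "painting",
                    "body": {
                        "id": image_url,
                        "type": "Image",
                        "format": "image/jpeg",
                        "width": width,
                        "height": height,
                    },
                    "target": canvas_id,
                }],
            }],
            # Referenced only: text annotations live in their own document.
            "annotations": [{
                "id": f"{canvas_id}/text/1",
                "type": "AnnotationPage",
            }],
        }],
    }
```

A viewer can render the object from the Manifest alone, and fetches the text page only if the user asks for it.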
Challenges?
Order:
• Just use rdf:List (an array in JSON-LD)
• Alternatives are unqueryable anyway, so might as well be
usable and simple!
API:
• For 10 years the community has discussed and decided
not to standardize on Create/Update/Delete operations
• This might be the right answer
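To illustrate the ordering point: in JSON-LD, a term declared with `"@container": "@list"` lets a plain array stand for an rdf:List. The sketch below shows such a document and, by hand, the rdf:first / rdf:rest triples that the array stands for. The vocabulary URI, the document URIs and the blank node labels are hypothetical, and the expansion is written out manually rather than with a JSON-LD processor.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# What the developer sees: an ordinary array.
doc = {
    "@context": {
        "items": {"@id": "https://example.org/vocab#items",
                  "@container": "@list"},
    },
    "@id": "https://example.org/manifest/1",
    "items": [
        {"@id": "https://example.org/canvas/1"},
        {"@id": "https://example.org/canvas/2"},
        {"@id": "https://example.org/canvas/3"},
    ],
}


def list_to_triples(items):
    """Spell out the rdf:List that an ordered array stands for."""
    if not items:
        return RDF + "nil", []
    nodes = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return nodes[0], triples


head, triples = list_to_triples([item["@id"] for item in doc["items"]])
```

The graph side costs two triples per entry and is awkward to query, but the developer only ever touches the array.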
IIIF Design Principles
1. Scope design through shared use cases
2. Design for international use
3. As simple as possible, but no simpler
4. Make easy things easy, complex things possible
5. Avoid dependency on specific technologies
6. Use REST / Don’t break the web
7. Separate concerns, keep APIs loosely coupled
8. Design for JSON-LD, using LOD principles
9. Follow existing standards, best practices
10. Define success, not failure (for extensibility)
Linked Open Usable Data
Five Stars of LOUD:
⭐ right Abstraction for the audience
⭐ few Barriers to entry
⭐ Comprehensible by introspection
⭐ Documentation with working examples
⭐ few Exceptions, many consistent patterns
Who is the Audience for LOD?
With thanks to Patrick Hochstenbach, @hochstenbach
LOUD: Easy to Use … by Developers!
Linked Art
Community developed Cultural Heritage descriptive metadata
profile, focused on (art) museum use cases and applications.
Progressive Enhancement:
1. Legacy Data – No things, just description
2. Data for Humans – Things, but only with descriptions
3. Data for Machines – Linked, Structured Data
4. Data for Research – Accurate data in sufficient quantity
to answer research questions when aggregated
Linked Art – Structural Data?
Partitioning & Membership are patterns used throughout:
• Parts of objects (frame is part of painting)
• Parts of places (New Haven is part of CT)
• Parts of events (actor’s particular role in larger event)
• Parts of texts (chapter is part of book)
• Parts of concepts (Watercolor is part of Painting concept)
• Membership in groups (Rob is a member of Yale staff,
painting is a member of auction lot set)
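A small sketch of how the same `part_of` / `member_of` pattern repeats across entity types, in the JSON-LD shape that Linked Art uses (`id`, `type`, `_label`). The URIs and helper name are hypothetical, the records are cut down to the structural properties, and the United States entry is added here only to give the chain a second step.

```python
# Hypothetical records, indexed by URI.
RECORDS = {
    "https://example.org/place/new-haven": {
        "id": "https://example.org/place/new-haven",
        "type": "Place",
        "_label": "New Haven",
        "part_of": [{"id": "https://example.org/place/ct", "type": "Place"}],
    },
    "https://example.org/place/ct": {
        "id": "https://example.org/place/ct",
        "type": "Place",
        "_label": "Connecticut",
        "part_of": [{"id": "https://example.org/place/us", "type": "Place"}],
    },
    "https://example.org/place/us": {
        "id": "https://example.org/place/us",
        "type": "Place",
        "_label": "United States",
    },
    "https://example.org/object/frame": {
        "id": "https://example.org/object/frame",
        "type": "HumanMadeObject",
        "_label": "Frame",
        "part_of": [{"id": "https://example.org/object/painting",
                     "type": "HumanMadeObject"}],
    },
    "https://example.org/object/painting": {
        "id": "https://example.org/object/painting",
        "type": "HumanMadeObject",
        "_label": "Painting",
        "member_of": [{"id": "https://example.org/set/lot-12", "type": "Set"}],
    },
}


def wholes(uri, records):
    """Follow part_of links upward and return every containing entity's URI."""
    found = []
    queue = list(records[uri].get("part_of", []))
    while queue:
        ref = queue.pop(0)
        found.append(ref["id"])
        # Entities missing from the index simply end that branch of the walk.
        queue.extend(records.get(ref["id"], {}).get("part_of", []))
    return found
```

The one traversal works for places and objects alike, and would work for texts, events and concepts too, because the pattern and the property name stay the same. Links go from the part to the whole, so each record stays small.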
Conclusions
• Open World vs Local Structure:
• Order of entities: Just use rdf:List, JSON-LD arrays
• Context-specific data: Needs a Ph.D. or two please!
• Usability:
• Graph Boundaries: Don’t Repeat Yourself, API as guide
• Representation vs Resource: Problem in theory only
• API Interactions: Get retrieval right, focus on usability
Thank You!
Discuss!

Editor's Notes

  • #4 Okay, we’re not quite done…
  • #6 Why? Because structure is about localized relationships, whereas RDF or LOD has an open world assumption – together with more general issues, they make usability and adoption a challenge.
  • #10 ReM (Resource Map): the file that describes the aggregation. Each ReM can describe exactly one aggregation. Agg (Aggregation): a set of resources, either digital or conceptual. Agg’d Resource (Aggregated Resource): any resource with a URI.
  • #11 Aggregations can be aggregated, and as the only way to get to them is via their resource map, we can add a reference to it. Now we have a recursive structure, not just a flat list, but one mandated to be in separate representations.
  • #12 Resources (including aggregations) can be aggregated by many aggregations, which can be added, along with their resource map.
  • #13 History has shown that while the ontology is concise … it’s not all that basic!
  • #14 ORE takes a firm position on the boundary of the graph and how you can retrieve the set of relationships that make up the graph. The aggregated aggregation in the first resource map cannot include its aggregated resources; they can only be in that aggregation’s resource map (resource map 2). Aggregated resources could point to other aggregations, but aggregations could not. Aggregated aggregations could not point to their aggregated resources but could point to their resource map. Retrieval was also forced by this decision: you requested the ResourceMap by its URI and got the triples that fit within the boundary. No official position about Create, Update and Delete.
  • #15 Introduced the notion of a Proxy – a resource that stood for an aggregated resource in the context of the aggregation. Assertions about the proxy are about the resource, but are only valid in the context of the aggregation.
  • #16 This gives us a way to specify order, without globally asserting that in all aggregations (or any context) the resource comes before or after another.
  • #17 The same works for non-structural metadata as well, such as a title for the resource in the context of the aggregation.
  • #18 In a pure RDF worldview, there’s nothing theoretically wrong with Proxies. They’re a resource, and they can have relationships associated with them. However…
  • #19 As simple as possible … and then a bit more simple, but we’ll get to that.
  • #21 (Explain) Collection and Object are subclasses of ORE Aggregation; hasMember, hasFile and relatedObject are subProperties of aggregates. So this is exactly the same as ORE … just ignoring the resource map requirement. Opinionated: No ReMs. No descMD on Files. Distinction between a collection and an “object” (never very clear boundaries).
  • #22 Not everything is ordered, so Proxies (often called Poxies during PCDM implementation) are a sensible choice … theoretically. Mea culpa. Some further local constraints: Files cannot be ordered, nor related objects, only actual members of collections or objects.
  • #23 During PCDM, we discussed the Schema structure as an alternative, where ListItems can be the object of itemListElement (as well as resources generally) in order to assert order. Didn’t want to /require/ ListItems and didn’t want to have both resources and ListItems as object of the property (schema.org is very loose!)
  • #28 But there was a long history before that.
  • #30 Motivation, TextualBody, three core components.
  • #31 Note Selector + Target = Specific Resource. Similar to ListItem / Proxy…
  • #34 Protocol – Also LDP, but opinionated for usability, not theoretical correctness.
  • #49 LOUD is the application of those design principles to LOD. We can summarize the five stars of LOUD as…
  • #50 Another way to think about it is … who is the audience for linked data?
  • #53 Try to learn from success of Usable data, and apply it in a more challenging environment than IIIF. Need to deal with all aspects of metadata, including especially structural.