Thinking Beyond Our Collections
 


As we begin modeling and migrating our data to work in a Linked Data environment, we need to avoid simply building new silos with a trendy new facade. It is important to think carefully about how our data models fit into the larger cloud of data. We must consider what is necessary for us to link to and reuse other data sources and for others to reuse ours. How do we balance the control we want over our own vocabularies and models while also not alienating ourselves from the larger web? What compromises do we need to make? What effect will schema.org have? After a short introduction to RDF and the concepts of Linked Data, we will explore some potential snags and solutions as well as datasets and technologies that might influence some of our decisions.


Upload Details

Uploaded as Apple Keynote

Usage Rights

© All Rights Reserved


Thinking Beyond Our Collections: Presentation Transcript

  • Thinking beyond our collections: Making our models linked and linkable. ALA Midwinter 2012, Ross Singer
  • First things first a little background
  • What is “Linked Data”?
  • Tim Berners-Lee
    • http://www.w3.org/People/Berners-Lee/card#i
    • http://id.loc.gov/authorities/names/no99010609
    • http://viaf.org/viaf/85312226
    • http://dbpedia.org/resource/Tim_Berners-Lee
  • Rules of Linked Data (http://www.w3.org/DesignIssues/LinkedData.html)
    1. Use URIs as names for things.
    2. Use HTTP URIs so that people can look up those names.
    3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
    4. Include links to other URIs, so that they can discover more things.
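Rules 2 and 3 can be sketched as an ordinary HTTP request that asks for RDF instead of HTML. This is only an illustration: the helper name `rdf_request` is made up here, and whether a given server answers with `text/turtle` is an assumption, not something the slides state. The request is built but not sent, so the snippet runs offline.

```python
from urllib.request import Request


def rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) an HTTP GET that asks for an RDF representation.

    `media_type` is an assumption: servers decide which RDF formats they offer.
    """
    return Request(uri, headers={"Accept": media_type})


# One of the URIs the slides list for Tim Berners-Lee.
req = rdf_request("http://dbpedia.org/resource/Tim_Berners-Lee")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; the response body is where rule 4's links to other URIs would appear.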
  • http://www.opte.org/maps/
  • With data instead of documents
  • http://richard.cyganiak.de/2007/10/lod/
  • What is RDF?
  • RDF
    • Resource Description Framework
    • Data model, not a serialization
    • Based on triples
  • RDF Statements (Triples) Subject - Predicate - Object
  • Example table:
      #  title                 author                   isbn        language
      1  Weaving the Web       Berners-Lee, Tim         0062515861  eng
      2  Pour Vaclav Havel     Durrenmatt, Friedrich    2882822444  fre
      3  Cien años de soledad  García Márquez, Gabriel  9500700298  spa
      4  The concise AACR2     Gorman, Michael          0838903258  eng
  • #4 “has ISBN” “0838903258”
  • Subject: #4; Predicate: “has ISBN”; Object: “0838903258”
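As a rough sketch of how a table row decomposes into statements, each cell can be read as one (subject, predicate, object) triple. The function name `row_to_triples` and the "has <column>" labels are illustrative choices, not something taken from the slides, which use the label “has ISBN”.

```python
# The four rows of the example table, keyed by row number.
rows = {
    1: {"title": "Weaving the Web", "author": "Berners-Lee, Tim",
        "isbn": "0062515861", "language": "eng"},
    2: {"title": "Pour Vaclav Havel", "author": "Durrenmatt, Friedrich",
        "isbn": "2882822444", "language": "fre"},
    3: {"title": "Cien años de soledad", "author": "García Márquez, Gabriel",
        "isbn": "9500700298", "language": "spa"},
    4: {"title": "The concise AACR2", "author": "Gorman, Michael",
        "isbn": "0838903258", "language": "eng"},
}


def row_to_triples(row_id, row):
    """Turn one table row into (subject, predicate, object) statements."""
    subject = f"#{row_id}"
    # The predicate label is just the column name here; URIs come later.
    return [(subject, f"has {column}", value) for column, value in row.items()]


triples = [t for row_id, row in rows.items() for t in row_to_triples(row_id, row)]
for triple in row_to_triples(4, rows[4]):
    print(triple)
```

Four rows with four columns each give sixteen statements, and none of them depends on the others to be meaningful.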
  • 1. Use URIs as names for things
  • #4
  • http://example.org/4
  • <http://example.org/4> “has ISBN” “0838903258”
  • We use URIs for predicates, too
  • “has ISBN”
  • http://purl.org/ontology/bibo/isbn10
  • Subject: http://example.org/4; Predicate: http://purl.org/ontology/bibo/isbn10; Object: “0838903258”
  • Objects
    • Can be literals
      • text
      • numeric
      • date
      • language
    • URIs
      • relate to other resources
  • (The four-book example table again: title, author, isbn, language)
  • Subject: http://example.org/4; Predicate: http://purl.org/dc/terms/creator; Object: http://viaf.org/viaf/108143046
  • Et voilà, Linked Data
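The difference between a literal object and a URI object shows up clearly when the two example statements are written out in N-Triples, the line-per-statement RDF syntax. The sketch below is a simplified model: the `URI` and `Literal` classes are hypothetical helpers, and the literal escaping covers only backslashes and double quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str

    def nt(self) -> str:
        # URIs are written between angle brackets.
        return f"<{self.value}>"


@dataclass(frozen=True)
class Literal:
    value: str

    def nt(self) -> str:
        # Simplified escaping: backslash and double quote only.
        escaped = self.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'


def statement(subject: URI, predicate: URI, obj) -> str:
    """Serialize one triple as a single N-Triples line."""
    return f"{subject.nt()} {predicate.nt()} {obj.nt()} ."


book = URI("http://example.org/4")
isbn_line = statement(book, URI("http://purl.org/ontology/bibo/isbn10"),
                      Literal("0838903258"))
creator_line = statement(book, URI("http://purl.org/dc/terms/creator"),
                         URI("http://viaf.org/viaf/108143046"))
print(isbn_line)
print(creator_line)
```

Only the second statement links to another resource, which is what makes the data "linked".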
  • Vocabularies
    • Dublin Core: general bibliographic description
    • Friend-of-a-Friend (FOAF): describe people and organizations
    • Bibliontology (BIBO): citations and bibliographies
    • SKOS: subjects and thesauri
    • WGS84: geographic coordinates
    • Creative Commons (CC): licenses and attribution
    • Music Ontology (MO): recordings, performances, performers, etc.
    • OWL: used to build schemas
  • owl:sameAs: the nuclear option of the semantic web
  • Graph
  • (The four-book example table again: title, author, isbn, language)
  • @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix bibo: <http://purl.org/ontology/bibo/> .

    <http://example.org/4> dcterms:title "The concise AACR2" ;
        dcterms:creator <http://viaf.org/viaf/108143046> ;
        bibo:isbn10 "0838903258" ;
        dcterms:language <http://purl.org/NET/marccodes/languages/eng#lang> .
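The `@prefix` lines in the Turtle are only shorthand: each prefixed name stands for a full URI. A minimal sketch of that expansion follows; the `expand` helper is hypothetical and ignores the rest of Turtle's grammar.

```python
# The two prefixes declared on the slide.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
}


def expand(name: str, prefixes=PREFIXES) -> str:
    """Expand a prefixed name such as 'bibo:isbn10' into a full URI."""
    prefix, _, local = name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local


for name in ("dcterms:title", "dcterms:creator", "bibo:isbn10", "dcterms:language"):
    print(name, "->", expand(name))
```

The expanded `bibo:isbn10` is the same predicate URI used in the earlier triple, so the Turtle and the long-hand statements describe the same graph.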
  • Why RDF?
  • Versatile
    • “Schemaless”
    • Properties can be assigned from any number of vocabularies
    • Description can be both generalized and domain- or audience-specific
  • Unambiguous description of things
  • http://example.org/4 http://purl.org/ontology/bibo/isbn10 “0838903258”
  • Unambiguous relationships between things
  • http://example.org/4 http://purl.org/dc/terms/creator http://viaf.org/viaf/108143046
  • http://example.org/4 http://rdvocab.info/roles/author http://viaf.org/viaf/108143046
  • Decentralized
  • Decentralized
    • No notion of “record”
    • Describe your things
      • Describe other people’s things (with their URIs)
    • “Open world assumption”
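With no notion of a "record", statements from different publishers can simply be pooled. A minimal sketch, treating a graph as a Python set of triples: the `theirs` graph and the name it supplies are hypothetical, invented here to stand in for a second publisher.

```python
BOOK = "http://example.org/4"
AUTHOR = "http://viaf.org/viaf/108143046"

# Statements we publish about our own thing.
ours = {
    (BOOK, "http://purl.org/dc/terms/title", "The concise AACR2"),
    (BOOK, "http://purl.org/dc/terms/creator", AUTHOR),
}

# Hypothetical statements published elsewhere, reusing the same URIs.
theirs = {
    (BOOK, "http://purl.org/dc/terms/creator", AUTHOR),
    (AUTHOR, "http://xmlns.com/foaf/0.1/name", "Michael Gorman"),
}

# Merging is plain set union: shared statements collapse, new ones accumulate.
merged = ours | theirs


def describe(graph, subject):
    """Everything the graph says about one subject, in a stable order."""
    return sorted((p, o) for s, p, o in graph if s == subject)


print(len(merged))
print(describe(merged, AUTHOR))
```

Under the open world assumption, a statement missing from `ours` is simply unknown to us rather than false, which is why the union never has to reject anything.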
  • Reasoning*
  • RDF brings challenges
  • Logic prevails
  • Entailments
  • Domain/Range
  • Schemas/Vocabularies
    • Classes
      • “kinds of things”
      • foaf:Person
      • bibo:Book
    • Properties
    • Constraints
  • No “validation” of data
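One reason there is no "validation" is that RDFS domain and range declarations produce entailments instead of rejecting data. The sketch below shows that behavior for a toy graph; the `ex:` class and property names are hypothetical, and only the `rdf:type`, `rdfs:domain`, and `rdfs:range` URIs are real.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"

EX = "http://example.org/schema/"  # hypothetical vocabulary
BOOK = "http://example.org/4"
AUTHOR = "http://viaf.org/viaf/108143046"

graph = {
    # Schema: ex:author applies to Books and points at Persons.
    (EX + "author", RDFS_DOMAIN, EX + "Book"),
    (EX + "author", RDFS_RANGE, EX + "Person"),
    # Data: nothing says what kind of thing either resource is.
    (BOOK, EX + "author", AUTHOR),
}


def infer_types(graph):
    """Return the rdf:type triples entailed by rdfs:domain and rdfs:range."""
    inferred = set()
    for prop, rule, cls in graph:
        if rule not in (RDFS_DOMAIN, RDFS_RANGE):
            continue
        for s, p, o in graph:
            if p == prop:
                # Domain types the subject, range types the object.
                target = s if rule == RDFS_DOMAIN else o
                inferred.add((target, RDF_TYPE, cls))
    return inferred - graph


entailed = infer_types(graph)
for triple in sorted(entailed):
    print(triple)
```

If the object had been a place rather than a person, the same rule would quietly conclude that the place is a Person; nothing would flag it as an error.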
  • No provenance of data
  • No clear way to address conflicting data
  • Alignment
    • If you can’t link to other things, what’s the point?
    • What are you describing?
      • A “Book” or a “Manifestation”?
    • Who is your audience?
    • Who do you wish to consume from?
  • Case Study 1 IFLA FRBRer
  • Work, Expression, Manifestation, Item: all WEMI entities are disjoint
  • Work, Expression, Manifestation, Item: no shortcuts between non-adjacent entities
  • No shortcuts between non-adjacent entities
    • Manifestations must have an Expression to relate to a Work
    • Lots of (possibly sketchy) scaffolding required
    • Who outside of libraries will do this?
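The cost of "no shortcuts" can be sketched as a two-hop lookup: a Manifestation reaches its Work only through an Expression. The predicate and resource names below are hypothetical placeholders, since the slides do not give FRBRer's own property URIs.

```python
# Hypothetical predicate names standing in for the FRBRer relationships.
EMBODIES = "ex:isEmbodimentOf"    # Manifestation -> Expression
REALIZES = "ex:isRealizationOf"   # Expression -> Work

graph = {
    ("ex:manifestation/4", EMBODIES, "ex:expression/4"),
    ("ex:expression/4", REALIZES, "ex:work/4"),
    # A description with no Expression scaffolding at all.
    ("ex:manifestation/5", "ex:title", "A record with no Expression"),
}


def objects(graph, subject, predicate):
    """All objects of the given subject/predicate pair."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def works_of(graph, manifestation):
    """Follow Manifestation -> Expression -> Work; there is no direct link."""
    return sorted(
        work
        for expression in objects(graph, manifestation, EMBODIES)
        for work in objects(graph, expression, REALIZES)
    )


print(works_of(graph, "ex:manifestation/4"))
print(works_of(graph, "ex:manifestation/5"))
```

The second lookup comes back empty: without the intermediate Expression, a consumer following the model strictly cannot get from the Manifestation to any Work.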
  • FRBR
    • Work: Title, Author, Subject
    • Expression: Language, “type” of resource
    • Manifestation: ISBN, copyright date, place of publication
  • bibo:Book: Title, Author, Subject, “type”, Language, ISBN, copyright date, place of publication
  • FRBR and bibo:Book side by side
    • Work: Title, Author, Subject
    • Expression: Language, “type” of resource
    • Manifestation: ISBN, copyright date, place of publication
    • bibo:Book: Title, Author, Subject, “type”, Language, ISBN, copyright date, place of publication
  • How do we relate?
    • Bibliontology
    • Dublin Core’s “BibliographicResource”
    • http://schema.org/Book
    • etc.
  • Case Study 2 SKOS Concepts
  • SKOS
    • Simple Knowledge Organization System
    • Used for building thesauri
    • “Subject headings”
  • Do “subjects” represent the “thing”?
  • Buzz Aldrin (http://upload.wikimedia.org/wikipedia/commons/d/da/Aldrin.jpg)
  • Buzz Aldrin (http://id.loc.gov/authorities/names/n88245653.html)
  • @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix rdaEnt: <http://rdvocab.info/uri/schema/FRBRentitiesRDA/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .

    <http://viaf.org/viaf/sourceID/LC%7Cn+88245653#skos:Concept> a skos:Concept ;
        skos:exactMatch <http://id.loc.gov/authorities/names/n88245653> ;
        foaf:focus <http://viaf.org/viaf/110368892> .

    <http://viaf.org/viaf/110368892> a foaf:Person, rdaEnt:Person ;
        owl:sameAs <http://dbpedia.org/resource/Buzz_Aldrin>,
            <http://d-nb.info/gnd/107714566> .

    # http://viaf.org/viaf/110368892/rdf.xml
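The VIAF description keeps the subject heading (a skos:Concept) separate from the person it is about, joined by foaf:focus. A rough sketch of walking from the concept to every URI that names the person follows; the function name is made up, and it follows owl:sameAs only one hop and in one direction, which is a simplification.

```python
FOAF_FOCUS = "http://xmlns.com/foaf/0.1/focus"
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"

CONCEPT = "http://viaf.org/viaf/sourceID/LC%7Cn+88245653#skos:Concept"
PERSON = "http://viaf.org/viaf/110368892"

# The focus and sameAs statements from the slide, as plain tuples.
graph = {
    (CONCEPT, FOAF_FOCUS, PERSON),
    (PERSON, OWL_SAMEAS, "http://dbpedia.org/resource/Buzz_Aldrin"),
    (PERSON, OWL_SAMEAS, "http://d-nb.info/gnd/107714566"),
}


def thing_identifiers(graph, concept):
    """URIs naming the thing a concept is about: its focus plus sameAs aliases."""
    found = set()
    for s, p, focus in graph:
        if s == concept and p == FOAF_FOCUS:
            found.add(focus)
            found.update(o for s2, p2, o in graph
                         if s2 == focus and p2 == OWL_SAMEAS)
    return sorted(found)


for uri in thing_identifiers(graph, CONCEPT):
    print(uri)
```

Asking the same question of the person URI returns nothing, because a person is not a concept and has no focus; that asymmetry is the distinction the slide is drawing.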
  • MARC mapping
    • MARC 6XX = SKOS Concept (or MADS Authority)
    • MARC 1XX = DC Agent, FOAF Agent, RDA Agent, etc.
  • id.loc.gov
    • Everything is a SKOS Concept (or MADS Authority, which entails the same meaning)
      • Languages
      • Countries
      • etc.
  • purl.org/NET/marccodes
    • Unofficial modeling of:
      • Languages
      • Countries
      • GACs
      • Instruments/Voices
      • Audiences
      • Form of Items
      • Form of Musical Composition
    • Full disclosure: I maintain this
  • purl.org/NET/marccodes
    • Models the “things”
      • Languages (http://www.lingvoj.org/ontology#Lingvo)
      • Countries (http://purl.org/dc/terms/Location)
      • etc.
    • Links to dbpedia, geonames, Lexvo/Lingvoj, id.loc.gov
  • Not clear which camp will gain mainstream acceptance
  • Datasets to consider modeling around
  • The “hub” of the semantic web
  • http://richard.cyganiak.de/2007/10/lod/
  • DBpedia
    • Data very messy
      • http://purl.org/NET/marccodes/muscomp/sn
    • Data not as important as the identifiers
  • Geonames.org
    • Geographic and administrative data
    • 8 million+ resources described
    • Places of interest
    • “near” data
  • MusicBrainz
    • One of the more comprehensive open music databases
    • Many copies, which to use?
      • BBC Music
      • DBTune
      • zitgist
      • dataincubator
    • Modeled in Music Ontology
  • New York Times
    • People
    • Organizations
    • Places
    • All SKOS Concepts!
      • Conflated with the “thing”
  • Open Library
    • Works
    • Editions (sort of like Manifestations)
      • not entirely compatible: creator and language properties
    • Authors
    • Subjects
  • Bibliontology
    • Interested in modeling the citation, not the relationships within the Endeavor
    • Extremely easy to model an article, book or journal
    • Currently incompatible with FRBR
  • schema.org
    • 900 lb. SEO gorilla
      • Google, Bing, Yahoo!
    • HTML5 microdata
      • http://schema.org/Book
      • http://schema.org/Article
      • etc.
    • Dublin Core working on alignment
  • Breaking free from our silos
    • Linked data gives us potential to integrate into the larger web
      • reuse of our data = relevance!
      • reuse of others’ data
  • Important we don’t exclude ourselves by insisting on incompatible models!
  • Thank you! Ross Singer, ross.singer@talis.com, http://twitter.com/rsinger, http://dilettantes.code4lib.org/blog