Thinking Beyond Our Collections

As we begin modeling and migrating our data to work in a Linked Data environment, we need to avoid simply building new silos with a trendy new facade. It is important to think carefully about how our data models fit into the larger cloud of data. We must consider what is necessary for us to link to and reuse other data sources and for others to reuse ours. How do we balance the control we want over our own vocabularies and models while also not alienating ourselves from the larger web? What compromises do we need to make? What effect will schema.org have? After a short introduction to RDF and the concepts of Linked Data, we will explore some potential snags and solutions as well as datasets and technologies that might influence some of our decisions.

Published in: Education, Technology

    1. 1. Thinking beyond our collections: Making our models linked and linkable. ALA Midwinter 2012. Ross Singer
    2. 2. First things first: a little background
    3. 3. What is “Linked Data”?
    4. 4. Tim Berners-Lee http://www.w3.org/People/Berners-Lee/card#i http://id.loc.gov/authorities/names/no99010609 http://viaf.org/viaf/85312226 http://dbpedia.org/resource/Tim_Berners-Lee
    5. 5. Rules of Linked Data
        1. Use URIs as names for things.
        2. Use HTTP URIs so that people can look up those names.
        3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
        4. Include links to other URIs, so that they can discover more things.
        http://www.w3.org/DesignIssues/LinkedData.html
    6. 6. http://www.opte.org/maps/
    7. 7. With data instead of documents
    8. 8. http://richard.cyganiak.de/2007/10/lod/
    9. 9. What is RDF?
    10. 10. RDF
        • Resource Description Framework
        • Data model, not a serialization
        • Based on triples
    11. 11. RDF Statements (Triples): Subject - Predicate - Object
    12. 12.    title                 author                   isbn        language
            1  Weaving the Web       Berners-Lee, Tim         0062515861  eng
            2  Pour Vaclav Havel     Durrenmatt, Friedrich    2882822444  fre
            3  Cien años de soledad  García Márquez, Gabriel  9500700298  spa
            4  The concise AACR2     Gorman, Michael          0838903258  eng
    13. 13. (Record table repeated from slide 12.)
    14. 14. #4 “has ISBN” “0838903258”
    15. 15. Subject: #4
        Predicate: “has ISBN”
        Object: “0838903258”
    16. 16. 1. Use URIs as names for things
    17. 17. #4
    18. 18. http://example.org/4
    19. 19. <http://example.org/4> “has ISBN” “0838903258”
    20. 20. We use URIs for predicates, too
    21. 21. “has ISBN”
    22. 22. http://purl.org/ontology/bibo/isbn10
    23. 23. Subject: http://example.org/4
        Predicate: http://purl.org/ontology/bibo/isbn10
        Object: “0838903258”
    24. 24. Objects
        • Can be literals
          • text
          • numeric
          • date
          • language
        • URIs
          • relate to other resources
    25. 25. (Record table repeated from slide 12.)
    26. 26. Subject: http://example.org/4
        Predicate: http://purl.org/dc/terms/creator
        Object: http://viaf.org/viaf/108143046
    27. 27. Et voilà, Linked Data
    28. 28. Vocabularies
        Dublin Core                 general bibliographic description
        Friend-of-a-Friend (FOAF)   describe people and organizations
        Bibliontology (BIBO)        citations and bibliographies
        SKOS                        subjects and thesauri
        WGS84                       geographic coordinates
        Creative Commons (CC)       licenses and attribution
        Music Ontology (MO)         recordings, performances, performers, etc.
        OWL                         used to build schemas
    29. 29. owl:sameAs
        The nuclear option of the semantic web
    30. 30. Graph
    31. 31. (Record table repeated from slide 12.)
    32. 32. @prefix dcterms: <http://purl.org/dc/terms/> .
        @prefix bibo: <http://purl.org/ontology/bibo/> .
        <http://example.org/4>
            dcterms:title "The concise AACR2" ;
            dcterms:creator <http://viaf.org/viaf/108143046> ;
            bibo:isbn10 "0838903258" ;
            dcterms:language <http://purl.org/NET/marccodes/languages/eng#lang> .
    33. 33. Why RDF?
    34. 34. Versatile
        • “Schemaless”
        • Properties can be assigned from any number of vocabularies
        • Description can be both generalized as well as domain or audience specific
    35. 35. Unambiguous description of things
    36. 36. http://example.org/4 http://purl.org/ontology/bibo/isbn10 “0838903258”
    37. 37. Unambiguous relationships between things
    38. 38. http://example.org/4 http://purl.org/dc/terms/creator http://viaf.org/viaf/108143046
    39. 39. http://example.org/4 http://rdvocab.info/roles/author http://viaf.org/viaf/108143046
    40. 40. Decentralized
    41. 41. Decentralized
        • No notion of “record”
        • Describe your things
          • Describe other people’s things (with their URIs)
        • “Open world assumption”
    42. 42. Reasoning*
    43. 43. RDF brings challenges
    44. 44. Logic prevails
    45. 45. Entailments
    46. 46. Domain/Range
    47. 47. Schemas/Vocabularies
        • Classes
          • “kinds of things”
          • foaf:Person
          • bibo:Book
        • Properties
        • Constraints
    48. 48. No “validation” of data
    49. 49. No provenance of data
    50. 50. No clear way to address conflicting data
    51. 51. Alignment
        • If you can’t link to other things, what’s the point?
        • What are you describing?
          • A “Book” or a “Manifestation”?
        • Who is your audience?
        • Who do you wish to consume from?
    52. 52. Case Study 1: IFLA FRBRer
    53. 53. Work, Expression, Manifestation, Item: all WEMI entities are disjoint
    54. 54. Work - Expression - Manifestation - Item: no shortcuts between non-adjacent entities
    55. 55. No shortcuts between non-adjacent entities
        • Manifestations must have an Expression to relate to a Work
        • Lots of (possibly sketchy) scaffolding required
        • Who outside of libraries will do this?
    56. 56. FRBR
        Work      Expression            Manifestation
        Title     Language              ISBN
        Author    “type” of resource    copyright date
        Subject                         place of publication
    57. 57. bibo:Book: Title, Author, Subject, “type”, Language, ISBN, copyright date, place of publication
    58. 58. Work      Expression            Manifestation
        Title     Language              ISBN
        Author    “type” of resource    copyright date
        Subject                         place of publication
        bibo:Book: Title, Author, Subject, “type”, Language, ISBN, copyright date, place of publication
    59. 59. How do we relate?
        • Bibliontology
        • Dublin Core’s “BibliographicResource”
        • http://schema.org/Book
        • etc.
    60. 60. Case Study 2: SKOS Concepts
    61. 61. SKOS
        • Simple Knowledge Organization System
        • Used for building thesauri
        • “Subject headings”
    62. 62. Do “subjects” represent the “thing”?
    63. 63. Buzz Aldrin http://upload.wikimedia.org/wikipedia/commons/d/da/Aldrin.jpg
    64. 64. Buzz Aldrin http://id.loc.gov/authorities/names/n88245653.html
    65. 65. @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        @prefix rdaEnt: <http://rdvocab.info/uri/schema/FRBRentitiesRDA/> .
        @prefix owl: <http://www.w3.org/2002/07/owl#> .
        <http://viaf.org/viaf/sourceID/LC%7Cn+88245653#skos:Concept>
            a skos:Concept ;
            skos:exactMatch <http://id.loc.gov/authorities/names/n88245653> ;
            foaf:focus <http://viaf.org/viaf/110368892> .
        <http://viaf.org/viaf/110368892>
            a foaf:Person, rdaEnt:Person ;
            owl:sameAs <http://dbpedia.org/resource/Buzz_Aldrin>,
                       <http://d-nb.info/gnd/107714566> .
        Source: http://viaf.org/viaf/110368892/rdf.xml
    66. 66. Subjects vs. names in MARC
        • MARC 6XX = SKOS Concept (or MADS Authority)
        • MARC 1XX = DC Agent, FOAF Agent, RDA Agent, etc.
    67. 67. id.loc.gov
        • Everything is a SKOS Concept (or MADS Authority, which entails the same meaning)
          • Languages
          • Countries
          • etc.
    68. 68. purl.org/NET/marccodes
        • Unofficial modeling of:
          • Languages
          • Countries
          • GACs
          • Instruments/Voices
          • Audiences
          • Form of Items
          • Form of Musical Composition
        Full disclosure: I maintain this
    69. 69. purl.org/NET/marccodes
        • Models the “things”
          • Languages (http://www.lingvoj.org/ontology#Lingvo)
          • Countries (http://purl.org/dc/terms/Location)
          • etc.
        • Links to dbpedia, geonames, Lexvo/Lingvoj, id.loc.gov
    70. 70. Not clear which camp will gain mainstream acceptance
    71. 71. Datasets to consider modeling around
    72. 72. The “hub” of the semantic web
    73. 73. http://richard.cyganiak.de/2007/10/lod/
    74. 74. DBpedia
        • Data very messy
          • http://purl.org/NET/marccodes/muscomp/sn
        • Data not as important as the identifiers
    75. 75. Geonames.org
        • Geographic and administrative data
        • 8 million+ resources described
        • Places of interest
        • “near” data
    76. 76. Musicbrainz
        • One of the more comprehensive open music databases
        • Many copies, which to use?
          • BBC Music
          • DBTune
          • zitgist
          • dataincubator
        • Modeled in Music Ontology
    77. 77. New York Times
        • People
        • Organizations
        • Places
        • All SKOS Concepts!
          • Conflated with the “thing”
    78. 78. Open Library
        • Works
        • Editions (sort of like Manifestations)
          • not entirely compatible: creator and language properties
        • Authors
        • Subjects
    79. 79. Bibliontology
        • Interested in modeling the citation, not the relationships within the Endeavor
        • Extremely easy to model an article, book or journal
        • Currently incompatible with FRBR
    80. 80. schema.org
        • 900 lb. SEO gorilla
          • Google, Bing, Yahoo!
        • HTML5 microdata
          • http://schema.org/Book
          • http://schema.org/Article
          • etc.
        • Dublin Core working on alignment
    81. 81. Breaking free from our silos
        • Linked data gives us potential to integrate into the larger web
          • reuse of our data = relevance!
          • reuse of others’ data
    82. 82. Important we don’t exclude ourselves by insisting on incompatible models!
    83. 83. Thank you! Ross Singer, ross.singer@talis.com, http://twitter.com/rsinger, http://dilettantes.code4lib.org/blog
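
The triple model walked through in slides 11 to 32 can be sketched in a few lines of Python. This is an illustration only: the deck itself contains no code beyond Turtle, and the `URI` and `Triple` classes and the `to_ntriples` helper are hypothetical names introduced here, not part of any RDF library. The literal escaping is deliberately simplified. The sketch takes row 4 of the record table and emits one N-Triples line per statement, using the same URIs as slide 32.

```python
from typing import NamedTuple, Union

# Namespace URIs taken from the Turtle example on slide 32.
DCTERMS = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"


class URI(str):
    """Hypothetical marker type: a string that names a resource, not a literal."""


class Triple(NamedTuple):
    """One RDF statement: subject - predicate - object."""
    subject: URI
    predicate: URI
    obj: Union[URI, str]  # objects may be URIs or plain literals


def to_ntriples(triple: Triple) -> str:
    """Render one triple as an N-Triples line (simplified escaping, an assumption)."""
    def term(value: Union[URI, str]) -> str:
        if isinstance(value, URI):
            return f"<{value}>"
        # Only backslashes and double quotes are escaped in this sketch.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."


# Row 4 of the record table, named with a URI as in slide 18.
book = URI("http://example.org/4")

triples = [
    Triple(book, URI(DCTERMS + "title"), "The concise AACR2"),
    Triple(book, URI(DCTERMS + "creator"), URI("http://viaf.org/viaf/108143046")),
    Triple(book, URI(BIBO + "isbn10"), "0838903258"),
    Triple(book, URI(DCTERMS + "language"),
           URI("http://purl.org/NET/marccodes/languages/eng#lang")),
]

lines = [to_ntriples(t) for t in triples]
for line in lines:
    print(line)
```

Each output line is a self-contained statement, so lines produced by different sources can simply be concatenated into one graph, which is the property the "Decentralized" slides rely on. A real project would more likely use an established RDF library than hand-rolled serialization.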
