RDF, RDA, and other TLAs
Dorothea Salo




Monday, January 2, 2012
Captatio benevolentiae
                •I am not a cataloger.
                          •Not even working as a librarian these days!

                •I am not a developer, either.
                •(I am doing a bit of standards work. Not
                 in this area, though.)
                •What am I? An educator and sometime
                 tech translator. I hope that’s enough.


We built MARC when [photo: catalogue cards] stood between us and patron.
                               Photo: Deborah Fitchett, “Catalogue cards” http://www.flickr.com/photos/deborahfitchett/2970373235/ CC-BY
We built MARC when the world was clearly bounded.
                              Photo: NASA Goddard Photo and Video, “NASA Blue Marble” http://www.flickr.com/photos/gsfc/4392965590/ CC-BY
These days, [photo: “My Desk”] stands between us and patron.
                             Photo: Declan Jewell, “My Desk” http://www.flickr.com/photos/declanjewell/2743737312 CC-BY
These days, the world’s looking a bit fractal!
                          Photo: NASA Goddard Photo and Video, “Still centered over the Atlantic” http://www.flickr.com/photos/gsfc/4409800816/ CC-BY
Review:

                •Where are the less-than-perfect fits
                 between library practice and the current
                 information landscape?
                •What does this mean for library systems
                 of information organization?



Problems with MARC/AACR2/ISBD
                                      (if you’re a networked computer)


      •Globally-unique identifiers for what’s in our
       bibliographic universe?
              •And what IS in our bibliographic universe, anyway?

      •Interoperability? Who speaks MARC outside
       libraries?
              •This is a problem on both ends of the pipeline, these days!

      •FREE TEXT (for anything not transcribed) MUST DIE.
              •It is the LEAST consistent, internationalizable, interoperable way to record
               information on a computer.
              •Put another way: we haven’t controlled all the cataloging practices we usefully could.
               http://robotlibrarian.billdueber.com/isbn-parenthetical-notes-bad-marc-data-1/
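
To make the free-text complaint concrete, here is a minimal Python sketch of the guesswork a program has to do when an ISBN field also carries parenthetical notes. The field values are made up, and `split_isbn_field` is a hypothetical helper, not anything taken from the linked post.

```python
import re

# Made-up ISBN field values: an identifier with free-text qualifiers
# typed into the same field, plus one value with no identifier at all.
RAW_FIELDS = [
    "0000000000 (pbk.)",
    "9780000000002 (hardcover : alk. paper)",
    "000000000X",
    "pbk. only",
]

# An ISBN-like token: 13 digits, or 9 digits followed by a digit or X.
ISBN_PATTERN = re.compile(r"^\s*(\d{13}|\d{9}[\dXx])\b\s*(.*)$")


def split_isbn_field(raw):
    """Split a field into (isbn, note); isbn is None when nothing matches."""
    match = ISBN_PATTERN.match(raw)
    if match is None:
        return None, raw.strip()
    isbn, rest = match.groups()
    # Peel the parentheses off whatever free text is left over.
    note = rest.strip().strip("()").strip()
    return isbn.upper(), note or None


for raw in RAW_FIELDS:
    print(split_isbn_field(raw))
```

Every new way of typing a note needs another rule here, which is the point: had the number and the qualifier been recorded separately, none of this code would be needed.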
Practical implications
    •Designing standards and practices around what
     computers do well, and what they need in order
     to do what they do.
    •Designing for being PART of the data universe,
     not all of it.
            •“open world assumption:” no one body has all the data! or all the answers!
            •And nobody can impose their view of the world on everybody else. (Fortunately,
             nobody necessarily has to.)
    •Designing for consistency, flexibility and
     extensibility without sacrificing comprehensibility
            •(this is a tall order; we’re not there yet. is anyone?)
... vocabulary note
             •“Semantic Web:” Tim Berners-Lee
              disappearing into his own navel.
                     •Term is a bit out-of-favor these days.

             •“Linked data:” a real-world effort to make
              large datastores more interoperable
             •RDF: invented by the SemWebbers, now a
              cornerstone for linked data
                     •Does this mean that all data will be stored as RDF? NO, IT DOES NOT (and
                      you have my permission to slap anybody who says it will).
                     •Totally possible to provide an RDF view onto non-RDF data, IF AND ONLY IF
                      the data structure and meaning are thought through in an RDFfy way.
Pragmatically: the five stars of linked data (Tim Berners-Lee)

Linked Data principles
                                           http://www.w3.org/DesignIssues/LinkedData.html

                •use URIs as names for things
                •use HTTP URIs so that people can look
                 up those things
                          •(this is one of Linked Data’s concessions to pragmatism, compared to the
                           original SemWebbers)

                •when someone looks up a URI, provide
                 useful information, using the standards
                •include links to other URIs so that they
                 can discover more things
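
"Looking up" a URI just means an ordinary HTTP request that asks for data rather than a web page. A hedged sketch using Python's standard library: it only prepares the request and never sends it, and the assumption that a server would answer an `application/rdf+xml` request is mine, not the slide's.

```python
from urllib.request import Request

# The VIAF URI used elsewhere in this talk.
VIAF_URI = "http://viaf.org/viaf/99258155/"


def build_lookup(uri, media_type="application/rdf+xml"):
    """Prepare (but do not send) an HTTP GET asking for RDF about a URI."""
    # The Accept header is how a client says "data, please, not HTML".
    return Request(uri, headers={"Accept": media_type}, method="GET")


request = build_lookup(VIAF_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup; what comes back depends entirely on the server.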
Things computers like
          •Unique identifiers
                 •for anything you plan to discuss or refer to
                 •that NEVER CHANGE OR DISAPPEAR. (Sorry, name-authority strings.)
                 •How do we do this given the open-world assumption?
          •Consistent, predictable, human-language-
           independent data
                 •Free text (including punctuation) makes computers sad. They aren’t human.
                  They don’t understand it. They can be cued to PRODUCE it, but only based on
                  rules they’re given about the underlying data.
                 •Computers produce typography and layout, but don’t understand those, either.
          •Controlled vocabularies
                 •(If they’re well-provisioned with identifiers; see above.)
Globally unique identifiers
          •Astonishingly, we already have a relatively
           easy way to do this. The Web is an infinitely
           extensible information space: all the
           globally-unique identifiers we can dream up!
          •Term of art: “URI.”
                 •In practice, 99 times out of 100 this will be a plain old ordinary URL.
                 •The 100th time, it’ll mostly look like a URL, just with a different prefix.

          •EVERYTHING in linked-data-land revolves
           around URIs. They’re plumbing.
                 •And like plumbing, we usually don’t have to look at them. Just know that they’re
                  there.
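
The "different prefix" is the URI scheme. A small sketch with Python's standard URL parser shows the difference; the `urn:isbn:` value is a made-up example of the 100th case.

```python
from urllib.parse import urlparse

IDENTIFIERS = [
    "http://viaf.org/viaf/99258155/",  # a plain old URL, from this talk
    "urn:isbn:9780000000002",          # made-up URI with a non-HTTP prefix
]


def describe(uri):
    """Return (scheme, has_host) for a URI."""
    parts = urlparse(uri)
    return parts.scheme, bool(parts.netloc)


for uri in IDENTIFIERS:
    scheme, has_host = describe(uri)
    kind = "ordinary URL" if has_host else "URI with a different prefix"
    print(f"{uri}: scheme={scheme} ({kind})")
```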
URI wins
                •Internationalization
                          •We can present http://viaf.org/viaf/99258155/ as “Tchaikovsky, Peter
                           Ilich, 1840-1893.” A Russian library can present the same URI as
                           “Чайковский, Петр Ильич, 1840-1893.”
                          •Both libraries can exchange information about Tchaikovsky and his works
                           (e.g. holdings) without language barriers due to the URI intermediary.

                •Interoperability
                          •Websites with Tchaikovsky information? Finding aids? Metadata for
                           digitized images? Can all use this URI to refer to Tchaikovsky. This makes
                           it painless for computers to aggregate Tchaikovsky-related information,
                           with minimal if any human intervention!
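
Both wins fit in a few lines of Python. In this sketch the URI and the two display strings come from the slide; the holdings lists, library names, and helper functions are hypothetical.

```python
TCHAIKOVSKY = "http://viaf.org/viaf/99258155/"

# One identifier, many display forms: each library picks its own language.
LABELS = {
    TCHAIKOVSKY: {
        "en": "Tchaikovsky, Peter Ilich, 1840-1893",
        "ru": "Чайковский, Петр Ильич, 1840-1893",
    }
}

# Hypothetical holdings from two libraries that never coordinated on
# name strings, only on the URI.
HOLDINGS = [
    {"library": "Library A", "creator": TCHAIKOVSKY, "title": "Swan Lake"},
    {"library": "Library B", "creator": TCHAIKOVSKY, "title": "Лебединое озеро"},
]


def display_name(uri, language, fallback="en"):
    """Pick the label for the reader's language, falling back if missing."""
    labels = LABELS[uri]
    return labels.get(language, labels[fallback])


def works_by(uri):
    """Aggregate across libraries by comparing URIs, not strings."""
    return [item for item in HOLDINGS if item["creator"] == uri]


print(display_name(TCHAIKOVSKY, "ru"))
print([item["library"] for item in works_by(TCHAIKOVSKY)])
```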

What to do with URIs
        •RDF’s answer: “We say things about stuff.”
               •At base, RDF really is that simple!

        •Base unit of RDF: “triple”
               •Subject, property, value/object. Much like subject-verb-object in an English sentence.
               •Example: “Dorothea Salo is the author of ‘Innkeeper at the Roach Motel.’”




            Dorothea Salo  --isAuthorOf-->  “Innkeeper at the Roach Motel”


                 ... wait. Where’d all the URIs go?
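
In code, a triple is nothing fancier than a three-slot record. A minimal sketch, with plain strings in every slot just as in the example sentence; the `Triple` type, the `about` helper, and the `isA` property are illustrative inventions.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, property, value/object."""
    subject: str
    predicate: str
    object: str


# Plain strings everywhere; no URIs yet.
STATEMENTS = [
    Triple("Dorothea Salo", "isAuthorOf", "Innkeeper at the Roach Motel"),
    Triple("Dorothea Salo", "isA", "educator"),
]


def about(subject, statements):
    """Everything we have said about one piece of stuff."""
    return [(t.predicate, t.object) for t in statements if t.subject == subject]


for predicate, value in about("Dorothea Salo", STATEMENTS):
    print(predicate, "->", value)
```

Matching on the string "Dorothea Salo" only works while everybody types the name the same way, which is exactly the gap URIs fill.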
A pause: just URIs?

                •Not strictly, according to RDF.
                          •“Literals,” that is, text strings, are also OK as objects. (Don’t tell
                           catalogers this!) But they’re STRONGLY discouraged.
                          •“Blank nodes” can also happen -- usually when the object of a triple is
                           a bundle of further statements rather than one thing with a name. In lieu
                           of giving that bundle its own URI, you get a “blank node” in the graph.
                           Which is ugly, but so it goes.




URI-izing a triple

   Dorothea Salo  --isAuthorOf-->  “Innkeeper at the Roach Motel”

   http://viaf.org/viaf/21599115/  --isAuthorOf-->  “Innkeeper at the Roach Motel”

   http://viaf.org/viaf/21599115/  --isAuthorOf-->  http://digital.library.wisc.edu/1793/22088

   And for isAuthorOf: vocabularies! with URIs!

   http://digital.library.wisc.edu/1793/22088  --dcterms:creator-->  http://viaf.org/viaf/21599115/
   (the arrow flips here: dcterms:creator points from the work to its creator)

... wait, Dublin Core has URIs?




                          Yep.
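
The same walk-through as a Python sketch. `dcterms:` is shorthand for the Dublin Core terms namespace, `http://purl.org/dc/terms/`, so the property is a URI like everything else. The `expand` helper is hypothetical; note also that `dcterms:creator` points from the work to its creator, so the work sits in the subject slot.

```python
# Prefix table: dcterms is the Dublin Core terms namespace.
PREFIXES = {"dcterms": "http://purl.org/dc/terms/"}

# URIs from the slides above.
PERSON = "http://viaf.org/viaf/21599115/"
WORK = "http://digital.library.wisc.edu/1793/22088"


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as dcterms:creator into a full URI."""
    if name.startswith(("http://", "https://")):
        return name  # already a full URI
    prefix, _, local = name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + local


# Every slot is now a URI: (work, creator-property, person).
triple = (WORK, expand("dcterms:creator"), PERSON)

for slot in triple:
    print(slot)
```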
MODS, too.




                          Hey, look, URIs!
                                     (this is new in MODS version 3.4)
(you should be able to read these diagrams now)




                             Diagram: Stephen J. Miller, “Teaching RDA after the National Implementation Decisions”
(even these)




                               Diagram: Stephen J. Miller, “Teaching RDA after the National Implementation Decisions”
But... but...
                •What if the same thing has two URIs?
                          •Foreseen problem! There are ways for linked data to express URI
                           equivalences... though there are huge arguments about when two URIs
                           are really-truly equivalent.
                          •My sense is that this decision is contextual. (AKA: “will Amazon.com use
                           FRBR?”) What’s equivalent for your purposes may not be for mine. And
                           that’s okay!

                •Where do we get URIs from?
                          •This will be part of the new cataloging infrastructure a-borning, but the
                           answer works out to “a lot of the same places we already get authority
                           information and catalog records from,” e.g. VIAF.
                          •But we’re no longer LIMITED to just those! Key point. Think about ORCID!
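
One common way to state that two URIs name the same thing is an `owl:sameAs` triple. The sketch below treats accepting those statements as a choice the consumer makes, which is one reading of "contextual." The second URI and the two property URIs are hypothetical, and which URI ends up as the representative is arbitrary.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

VIAF = "http://viaf.org/viaf/99258155/"
LOCAL = "http://example.org/people/tchaikovsky"      # hypothetical second URI
NAME = "http://example.org/vocab/name"               # hypothetical property
BIRTH_YEAR = "http://example.org/vocab/birthYear"    # hypothetical property

TRIPLES = [
    (LOCAL, SAME_AS, VIAF),
    (VIAF, NAME, "Tchaikovsky, Peter Ilich, 1840-1893"),
    (LOCAL, BIRTH_YEAR, "1840"),
]


def build_aliases(triples):
    """Map every URI in a sameAs statement to one representative URI."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for subject, predicate, obj in triples:
        if predicate == SAME_AS:
            root_a, root_b = find(subject), find(obj)
            if root_a != root_b:
                # Arbitrary but stable choice of representative.
                parent[max(root_a, root_b)] = min(root_a, root_b)
    return {uri: find(uri) for uri in list(parent)}


def merge(triples, accept_equivalences=True):
    """Rewrite subjects to their representative, if we accept the claims."""
    aliases = build_aliases(triples) if accept_equivalences else {}
    return {(aliases.get(s, s), p, o) for s, p, o in triples if p != SAME_AS}


print(sorted(merge(TRIPLES)))
print(sorted(merge(TRIPLES, accept_equivalences=False)))
```

With equivalences accepted, both facts hang off one subject; with them rejected, the two URIs stay separate. Neither answer is wrong, which is the slide's point.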
But... but...
               •Where’s the record? And standards for
                the record?
                      •The record is what you make it! There’ll be metric tons of data about
                       Tchaikovsky linking to (and thus reachable through) his URL. (Somebody’ll
                       make a list of his pet dogs’ names. Guaranteed. People are funny about
                       dogs.) What’s useful to you, you use. What isn’t, you ignore. That’s how
                       the open world works.
                      •If we need to impose rules on the data we’ll be putting out there (and we
                       probably do!), there are ways to do that. We just can’t expect to impose
                       those ways on anybody else. (Though we can put our rules out there for
                       others to follow, and we probably should!)
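
"The record is what you make it" can be sketched as a filter over a pool of statements: keep the properties you care about, ignore the rest. All property URIs and the `build_record` helper below are hypothetical; the name string and birth year come from the heading quoted earlier in the talk.

```python
from collections import defaultdict

TCHAIKOVSKY = "http://viaf.org/viaf/99258155/"

# Hypothetical property URIs.
NAME = "http://example.org/vocab/name"
BIRTH_YEAR = "http://example.org/vocab/birthYear"
PET_DOG = "http://example.org/vocab/petDogName"

# A pool of statements from all over the open world.
POOL = [
    (TCHAIKOVSKY, NAME, "Tchaikovsky, Peter Ilich, 1840-1893"),
    (TCHAIKOVSKY, BIRTH_YEAR, "1840"),
    (TCHAIKOVSKY, PET_DOG, "(somebody's list goes here)"),
]

# Our local rules: the properties this catalog chooses to use.
WANTED = {NAME, BIRTH_YEAR}


def build_record(subject, pool, wanted):
    """Assemble a record on the fly from the statements we care about."""
    record = defaultdict(list)
    for s, p, o in pool:
        if s == subject and p in wanted:
            record[p].append(o)
    return dict(record)


record = build_record(TCHAIKOVSKY, POOL, WANTED)
for prop, values in record.items():
    print(prop, values)
```

Somebody else can run the same function with a different `wanted` set and get a different, equally legitimate record.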

Trust: an unsolved problem

                •Review: what happened with <meta>
                 tags on the web?
                •Right. What’s to stop the same thing
                 happening in a linked-data environment?
                          •What’s to stop me from saying I’m Tchaikovsky?
                          •The SemWeb people handwaved this for a long time.
                          •For our purposes? We’ll pick and choose the vocabularies and domains we
                           trust, I expect, just as we already do.



RDF in XML
             •RDF has its own namespace, but no fixed
              XML schema (it’s an open-ended universe!).
                     •Root element: <rdf:RDF>
             •Vocabulary in any other XML namespace
              can be shoehorned into RDF triples.
                     •But don’t fool yourself: RDF triples and graphs and standard XML vocabulary
                      hierarchies do NOT map cleanly or automatically to each other.
                     •So MARC/AACR2 is FAR from the only metadata expression that’s looking at
                      a retooling!
             •Typical triple expression in XML:
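                     •<rdf:Description rdf:about="{subject}"> <predicate rdf:resource="{object}" />
                      </rdf:Description>
                     •(A literal object goes inside the element instead: <predicate>{text}</predicate>)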
             •XML is NOT the only syntax for RDF.
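
A sketch of the same triple written two ways with Python's standard library: once as RDF/XML, once as a line of N-Triples (another RDF syntax). The subject and object URIs are the ones from the URI-izing slides, arranged work-first because `dcterms:creator` points from the work to its creator; the function names are illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"

WORK = "http://digital.library.wisc.edu/1793/22088"
PERSON = "http://viaf.org/viaf/21599115/"


def to_rdf_xml(work, creator):
    """Serialize one dcterms:creator triple as RDF/XML."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dcterms", DCTERMS_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # The subject goes in rdf:about on the Description element.
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": work}
    )
    # The predicate is the child element; a URI object goes in rdf:resource.
    ET.SubElement(
        description, f"{{{DCTERMS_NS}}}creator", {f"{{{RDF_NS}}}resource": creator}
    )
    return ET.tostring(root, encoding="unicode")


def to_ntriples(subject, predicate, obj):
    """Serialize one all-URI triple as a line of N-Triples."""
    return f"<{subject}> <{predicate}> <{obj}> ."


print(to_rdf_xml(WORK, PERSON))
print(to_ntriples(WORK, DCTERMS_NS + "creator", PERSON))
```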
Retooling tools: GRDDL, SKOS, and OWL
                •Gleaning Resource Descriptions from
                 Dialects of Languages
                          •W3C standard for providing a transformation of an existing XML
                           vocabulary into an RDF expression.
                          •Once there’s a GRDDL transform, users of the vocabulary need change
                           (almost) nothing! Vocabulary instance + GRDDL transform = RDF!
                •Simple Knowledge Organization System
                          •RDF data model (plus URIs, of course) for commonly-used controlled-
                           vocabulary structures such as thesauri and subject-heading lists.
                •Web Ontology Language (yes, I know)
                          •SEMWEB NERDS ONLY. Ontologies are serious business.
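
For a feel of what SKOS data looks like, here is a tiny thesaurus as triples, using the real `skos:prefLabel` and `skos:broader` properties. The concept URIs, the labels, and the helper functions are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
PREF_LABEL = SKOS + "prefLabel"
BROADER = SKOS + "broader"

# Hypothetical concept URIs for a three-level subject hierarchy.
ARTS = "http://example.org/subjects/arts"
MUSIC = "http://example.org/subjects/music"
BALLET = "http://example.org/subjects/ballet-music"

TRIPLES = [
    (ARTS, PREF_LABEL, "Arts"),
    (MUSIC, PREF_LABEL, "Music"),
    (BALLET, PREF_LABEL, "Ballet music"),
    (MUSIC, BROADER, ARTS),
    (BALLET, BROADER, MUSIC),
]


def label(concept):
    """Return the preferred label of a concept, or the URI if none exists."""
    for s, p, o in TRIPLES:
        if s == concept and p == PREF_LABEL:
            return o
    return concept


def broader_chain(concept):
    """Walk skos:broader links upward, returning labels from narrow to broad."""
    chain, seen = [], set()
    while concept is not None and concept not in seen:
        seen.add(concept)  # guard against accidental cycles
        chain.append(label(concept))
        concept = next(
            (o for s, p, o in TRIPLES if s == concept and p == BROADER), None
        )
    return chain


print(" < ".join(broader_chain(BALLET)))
```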
So what’s this “RDA Vocabularies”
   work that Diane and Karen et al.
   are doing?

                          Assigning URIs to stuff in RDA, so that systems expecting URI-linked data get it.
                          Seriously. That’s what all the fuss is about.
RDFizing RDA
                •What does RDA actually talk about?
                          •FRBR model: Group 1, 2, and 3 entities
                          •(though Group 1 is still kind of squidgy, really, and some application
                           developers are questioning its usefulness)
                          •DCMI model (because life can NEVER be simple)
                          •Relationships among entities

                •What do we want to say about them?
                          •Are there existing ways to say these things that are good enough for our
                           purposes? Can we reuse them, or at least map to them?
                          •When there aren’t, how do we say what we need to in ways that are most
                           useful for the rest of the world?

                •Assigning URIs to it all
Model friction
            •FRBR: entity-relationship model
                   •... like relational databases, which is nice
                   •not entirely RDFish, which is not quite so nice and has caused head-scratching
                   •But head-scratching is normal in this space! Modeling is hard!
            •FRBR does give us some abstractions to
             model and assign URIs to.
                   •And IFLA was supposed to do that... but they haven’t.
                   •So the RDA folks have provisionally done it, with FRBR entity URIs of their own.
                   •When IFLA gets back in the game, formal equivalences will be defined and
                    published between that provisional set and whatever IFLA comes up with.
            •FRBR isn’t perfect. (Gasp. I know, right?)
                   •So sticking strictly to FRBR as we model (relationships particularly) causes
                    problems for music and multimedia catalogers, among others.
RDA properties
     •Expressed (URLized) without reference to FRBR.
             •This is also the variant the linked-data web will generally see and use.
             •Which makes a certain amount of sense, because it’s quite possible to understand a lot
              of bibliographic data intuitively without reference to FRBR.
             •And we’ll never get the whole world to agree on FRBR; we can’t even agree ourselves!

     •Given “subproperties” which are the same
      thing, only FRBRized (and with their own URLs).
             •So the linked-data web sees a URL for “Book format.”
             •But we, because we are librarians and our systems understand us, understand that
              “Book format” is intrinsically tied up with a Manifestation.
             •This also covers us when an RDA property may apply to more than one FRBR entity,
              e.g. Extent: it’s the same property, but two subproperties!
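
The subproperty trick can be sketched with `rdfs:subPropertyOf`: anything stated with the FRBRized subproperty is also true of the general property, so outsiders who only know the general URL still get the data. All URIs here are hypothetical placeholders (not the registered RDA ones), as is the choice of entities for the two Extent subproperties.

```python
SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"

# Hypothetical URIs standing in for RDA properties.
EXTENT = "http://example.org/rda/extent"
EXTENT_MANIFESTATION = "http://example.org/rda/extentManifestation"
EXTENT_OTHER_ENTITY = "http://example.org/rda/extentOtherEntity"

SCHEMA = [
    (EXTENT_MANIFESTATION, SUBPROPERTY_OF, EXTENT),
    (EXTENT_OTHER_ENTITY, SUBPROPERTY_OF, EXTENT),
]

# What a library system records: the FRBR-aware subproperty.
DATA = [
    ("http://example.org/manifestation/1", EXTENT_MANIFESTATION, "xii, 300 p."),
]


def with_general_properties(data, schema):
    """Add the statements implied by rdfs:subPropertyOf."""
    parents = {child: parent for child, p, parent in schema if p == SUBPROPERTY_OF}
    inferred = set(data)
    for s, p, o in data:
        seen = {p}
        while p in parents and parents[p] not in seen:
            p = parents[p]
            seen.add(p)  # guard against cyclic schemas
            inferred.add((s, p, o))
    return inferred


for statement in sorted(with_general_properties(DATA, SCHEMA)):
    print(statement)
```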

Diagram: Hillmann et al., “RDA Vocabularies: Process, Outcome, Use” D-Lib Magazine. http://www.dlib.org/dlib/january10/hillmann/01hillmann.html
The ugliest case:




                     Diagram: Hillmann et al., “RDA Vocabularies: Process, Outcome, Use” D-Lib Magazine. http://www.dlib.org/dlib/january10/hillmann/01hillmann.html
... wait, where did Dublin Core go?
             •Dublin Core, as we all know, is
              annoyingly vague.
             •Sadly, there’s an awful lot of DC data that
              we’ll have to map into this model.
                     •Ironic but true: librarians invented DC for the larger web, and then became
                      nearly the only people to use it extensively.

             •“Superproperties:” DC terms that map to
              several RDA properties. (E.g. “creator”)
                     •Probably the worst way to solve the problem... except for all the other ways.
Disentangling aggregated statements
               •The last refuge of the text string!
                      •E.g. publication statements, which aggregate place, publisher name, and
                       date of publication.
                      •What if you only WANT one of those three bits of information? ARGH.

               •RDA doesn’t fix this. So RDA Vocabs is
                trying to.
                      •First, URLize each piece separately. Cool. Done. No problem.
                      •Then define a “Syntax Encoding Scheme” for the aggregate. Yuck.
                      •I have to tell you, this is a heinously ugly “fix.” Given legacy data, though,
                       hard to imagine better.
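
The underlying idea, keep the pieces separate and generate the aggregate string from them, can be sketched as below. This is not the Syntax Encoding Scheme that RDA Vocabs defines; the class, the sample values, and the ISBD-style punctuation rule are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicationStatement:
    """Place, publisher, and date stored as separate pieces."""
    place: str
    publisher: str
    date: str

    def display(self):
        """Generate the ISBD-style aggregate from the parts."""
        return f"{self.place} : {self.publisher}, {self.date}"


# Made-up sample values.
statement = PublicationStatement("Madison, Wis.", "Example Press", "2012")

print(statement.display())  # the aggregate, for human eyes
print(statement.date)       # just the one bit you wanted, no parsing
```

Going from parts to string is one line; going from a legacy string back to parts is the hard direction, which is why the fix for existing data is so ugly.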
There’s more modeling pilpul.

                                          A lot of it. I’ll spare you.
                          Like I said, it’s plumbing.
What is actually happening?

                •We’re figuring out what we’re talking
                 about.
                •We’re figuring out what we want to say
                 about it.
                •We’re assigning URIs to all those things
                 (abstractions included!) so that we can
                 exchange information with the rest of the
                 web.

Summary by Diane Hillmann
                   http://managemetadata.org/blog/2011/09/08/fine-wine-and-old-fish/


       • Data should be able to be encoded in a variety of
         ways, to suit a variety of functions, uses, and systems.
       • Data should be managed at a granular, statement
         level, but also be available in a variety of record
         ‘formats.’ (with records being understood as primarily an on-the-fly method
            of aggregating data for a variety of downstream users)

       • Although current data is expressed mostly as text
         strings, data improvement strategies will be designed
         to change most of them to URIs as soon as
         practicable.
       • Data definitions and specifications will be easily
         available on the web, allowing mapping to be simpler
         and easier to tweak.
Library workflows

                •Given what you now know about RDF
                 and linked data, and your experiences
                 with cataloging, how do you think the
                 practice of cataloging will change in an
                 RDF-based environment?



SPARQL
                •With XML data, you generally just dump
                 it on the web and let people figure out
                 what (if anything) to do with it.
                          •This means a lot of translator-writing and bandwidth cost.
                          •(There’s an XML query language called XQuery, but nobody uses it.)
                          •You can do this with RDF too (and some do), but it’s not really ideal.

                •SPARQL: query language for RDF.
                          •Looks a LOT like SQL, intentionally so. The hardest thing to get to grips
                           with is namespace declarations, and that’s not really all that hard.
                          •“SPARQL endpoint:” URL for a given set of RDF data that you can send
                           queries to and get answers from.
                          •How does this change your answer about library workflows?
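
A sketch of what a SPARQL query looks like and how it travels to an endpoint. The endpoint URL is hypothetical and nothing is sent over the network; the `match` function is a toy stand-in for what the endpoint does with the WHERE clause, not a SPARQL implementation.

```python
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"  # hypothetical SPARQL endpoint

PERSON = "http://viaf.org/viaf/21599115/"
WORK = "http://digital.library.wisc.edu/1793/22088"
CREATOR = "http://purl.org/dc/terms/creator"

# "Which works have this person as creator?" Note the namespace declaration.
QUERY = """\
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?work
WHERE {
  ?work dcterms:creator <http://viaf.org/viaf/21599115/> .
}
"""


def request_url(endpoint, query):
    """Build a GET URL that carries the query in a `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


def match(triples, subject=None, predicate=None, obj=None):
    """Toy pattern match: None plays the role of a ?variable."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


TRIPLES = [(WORK, CREATOR, PERSON)]

print(request_url(ENDPOINT, QUERY))
print([s for s, _, _ in match(TRIPLES, predicate=CREATOR, obj=PERSON)])
```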
Why it’s worth doing



Your data ages like [photo: a bottle of red wine]
                                 Photo: Matthew, “red wine bottle 1” http://www.flickr.com/photos/falcon1961/3408961521/ CC-BY
Your software applications age like [photo: a fish]
                                 Photo: amanda mandy, “peixe pelo todo.” http://www.flickr.com/photos/polaina/3128038858/ CC-BY
... did that help?



I worked from...

             •Hillmann, Diane et al. “RDA Vocabularies:
              Process, Outcome, Use.” D-Lib Magazine
              16:1/2 (Jan/Feb 2010).
              http://www.dlib.org/dlib/january10/hillmann/01hillmann.html



Is this BIG DATA which I see before me?Is this BIG DATA which I see before me?
Is this BIG DATA which I see before me?
 
FRBR and RDA
FRBR and RDAFRBR and RDA
FRBR and RDA
 
Research Data and Scholarly Communication
Research Data and Scholarly CommunicationResearch Data and Scholarly Communication
Research Data and Scholarly Communication
 
Research Data and Scholarly Communication (with notes)
Research Data and Scholarly Communication (with notes)Research Data and Scholarly Communication (with notes)
Research Data and Scholarly Communication (with notes)
 
Manufacturing Serendipity
Manufacturing SerendipityManufacturing Serendipity
Manufacturing Serendipity
 
What We Organize
What We OrganizeWhat We Organize
What We Organize
 
Occupy Copyright!
Occupy Copyright!Occupy Copyright!
Occupy Copyright!
 
I own copyright, so I pwn you!
I own copyright, so I pwn you!I own copyright, so I pwn you!
I own copyright, so I pwn you!
 
Librarians love data!
Librarians love data!Librarians love data!
Librarians love data!
 
Taming the Monster: Digital Preservation Planning and Implementation Tools
Taming the Monster: Digital Preservation Planning and Implementation ToolsTaming the Monster: Digital Preservation Planning and Implementation Tools
Taming the Monster: Digital Preservation Planning and Implementation Tools
 
Avoiding the Heron's Way
Avoiding the Heron's WayAvoiding the Heron's Way
Avoiding the Heron's Way
 
Manufacturing Serendipity
Manufacturing SerendipityManufacturing Serendipity
Manufacturing Serendipity
 
Open Content
Open ContentOpen Content
Open Content
 
Lipstick on a Pig: Integrated Library Systems
Lipstick on a Pig: Integrated Library SystemsLipstick on a Pig: Integrated Library Systems
Lipstick on a Pig: Integrated Library Systems
 

Recently uploaded

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 

RDF, RDA, and other TLAs

  • 1. RDF, RDA, and other TLAs Dorothea Salo Monday, January 2, 2012
  • 2. Captatio benevolentiae
      • I am not a cataloger.
      • Not even working as a librarian these days!
      • I am not a developer, either.
      • (I am doing a bit of standards work. Not in this area, though.)
      • What am I? An educator and sometime tech translator. I hope that’s enough.
  • 3. We built MARC when [photo: catalogue cards] stood between us and patron. Photo: Deborah Fitchett, “Catalogue cards” http://www.flickr.com/photos/deborahfitchett/2970373235/ CC-BY
  • 4. We built MARC when the world was clearly bounded. Photo: NASA Goddard Photo and Video, “NASA Blue Marble” http://www.flickr.com/photos/gsfc/4392965590/ CC-BY
  • 5. These days, [photo: “My Desk”] stands between us and patron. Photo: Declan Jewell, “My Desk” http://www.flickr.com/photos/declanjewell/2743737312 CC-BY
  • 6. These days, the world’s looking a bit fractal! Photo: NASA Goddard Photo and Video, “Still centered over the Atlantic” http://www.flickr.com/photos/gsfc/4409800816/ CC-BY
  • 7. Review:
      • Where are the less-than-perfect fits between library practice and the current information landscape?
      • What does this mean for library systems of information organization?
  • 8. Problems with MARC/AACR2/ISBD (if you’re a networked computer)
      • Globally-unique identifiers for what’s in our bibliographic universe?
      • And what IS in our bibliographic universe, anyway?
      • Interoperability? Who speaks MARC outside libraries?
      • This is a problem on both ends of the pipeline, these days!
      • FREE TEXT (for anything not transcribed) MUST DIE.
      • It is the LEAST consistent, internationalizable, interoperable way to record information on a computer.
      • Put another way: we haven’t controlled all the cataloging practices we usefully could.
      • http://robotlibrarian.billdueber.com/isbn-parenthetical-notes-bad-marc-data-1/
  • 9. Practical implications
      • Designing standards and practices around what computers do well, and what they need in order to do what they do.
      • Designing for being PART of the data universe, not all of it.
      • “Open world assumption”: no one body has all the data! Or all the answers!
      • And nobody can impose their view of the world on everybody else. (Fortunately, nobody necessarily has to.)
      • Designing for consistency, flexibility and extensibility without sacrificing comprehensibility.
      • (This is a tall order; we’re not there yet. Is anyone?)
  • 10. ... vocabulary note
      • “Semantic Web”: Tim Berners-Lee disappearing into his own navel.
      • Term is a bit out-of-favor these days.
      • “Linked data”: a real-world effort to make large datastores more interoperable.
      • RDF: invented by the SemWebbers, now a cornerstone for linked data.
      • Does this mean that all data will be stored as RDF? NO, IT DOES NOT (and you have my permission to slap anybody who says it will).
      • Totally possible to provide an RDF view onto non-RDF data, IF AND ONLY IF the data structure and meaning are thought through in an RDFfy way.
  • 11. Pragmatically: the five stars of linked data (Tim Berners-Lee)
  • 12. Linked Data principles (http://www.w3.org/DesignIssues/LinkedData.html)
      • Use URIs as names for things.
      • Use HTTP URIs so that people can look up those things.
      • (This is one of Linked Data’s concessions to pragmatism, compared to the original SemWebbers.)
      • When someone looks up a URI, provide useful information, using the standards.
      • Include links to other URIs so that they can discover more things.
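To make "look up a URI" concrete, here is a minimal Python sketch of what a client might prepare when it wants data rather than a web page. The VIAF URI is the one used later in these slides; the `Accept` media type and the `build_lookup` helper are assumptions for illustration, and nothing here says what any particular server actually returns. The request is built but deliberately not sent.

```python
from urllib.request import Request


def build_lookup(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Prepare (but do not send) an HTTP GET asking for RDF instead of HTML.

    The media type is an assumption; a real server may offer other formats.
    """
    return Request(uri, headers={"Accept": media_type})


# The thing's name and the place to look it up are the same HTTP URI.
request = build_lookup("http://viaf.org/viaf/99258155/")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, so the sketch runs without a network connection.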
  • 13. Things computers like
      • Unique identifiers
      • for anything you plan to discuss or refer to
      • that NEVER CHANGE OR DISAPPEAR. (Sorry, name-authority strings.)
      • How do we do this given the open-world assumption?
      • Consistent, predictable, human-language-independent data
      • Free text (including punctuation) makes computers sad. They aren’t human. They don’t understand it. They can be cued to PRODUCE it, but only based on rules they’re given about the underlying data.
      • Computers produce typography and layout, but don’t understand those, either.
      • Controlled vocabularies
      • (If they’re well-provisioned with identifiers; see above.)
  • 14. Globally unique identifiers
      • Astonishingly, we already have a relatively easy way to do this. The Web is an infinitely extensible information space: all the globally-unique identifiers we can dream up!
      • Term of art: “URI.”
      • In practice, 99 times out of 100 this will be a plain old ordinary URL.
      • The 100th time, it’ll mostly look like a URL, just with a different prefix.
      • EVERYTHING in linked-data-land revolves around URIs. They’re plumbing.
      • And like plumbing, we usually don’t have to look at them. Just know that they’re there.
  • 15. URI wins
      • Internationalization
      • We can present http://viaf.org/viaf/99258155/ as “Tchaikovsky, Peter Ilich, 1840-1893.” A Russian library can present the same URI as “Чайковский, Петр Ильич, 1840-1893.”
      • Both libraries can exchange information about Tchaikovsky and his works (e.g. holdings) without language barriers due to the URI intermediary.
      • Interoperability
      • Websites with Tchaikovsky information? Finding aids? Metadata for digitized images? Can all use this URI to refer to Tchaikovsky. This makes it painless for computers to aggregate Tchaikovsky-related information, with minimal if any human intervention!
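A small Python sketch of the internationalization point: the data exchanged between libraries carries only the URI, and each library chooses its own display string. The two labels are the ones quoted on the slide; the `LABELS` table, the `display` helper, and the library names are hypothetical, made up for illustration.

```python
TCHAIKOVSKY = "http://viaf.org/viaf/99258155/"

# Display strings keyed by URI, then by language code (hypothetical lookup table).
LABELS = {
    TCHAIKOVSKY: {
        "en": "Tchaikovsky, Peter Ilich, 1840-1893.",
        "ru": "Чайковский, Петр Ильич, 1840-1893.",
    }
}


def display(uri: str, lang: str) -> str:
    """Return the label for this language, falling back to the bare URI."""
    return LABELS.get(uri, {}).get(lang, uri)


# Holdings statements from two hypothetical libraries: no name strings at all.
holdings = [("Library A", TCHAIKOVSKY), ("Library B", TCHAIKOVSKY)]

# Aggregating is a plain equality test on the identifier.
holders = [library for library, uri in holdings if uri == TCHAIKOVSKY]

print(display(TCHAIKOVSKY, "en"))
print(display(TCHAIKOVSKY, "ru"))
print(holders)
```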
  • 18. What to do with URIs
      • RDF’s answer: “We say things about stuff.”
      • At base, RDF really is that simple!
      • Base unit of RDF: “triple”
      • Subject, property, value/object. Much like subject-verb-object in English sentence.
      • Example: “Dorothea Salo is the author of ‘Innkeeper at the Roach Motel.’”
      • [diagram: Dorothea Salo → isAuthorOf → “Innkeeper at the Roach Motel”]
      • ... wait. Where’d all the URIs go?
  • 19. A pause: just URIs?
      • Not strictly, according to RDF.
      • “Literals,” that is, text strings, are also OK as objects. (Don’t tell catalogers this!) But they’re STRONGLY discouraged.
      • “Blank nodes” can also happen -- usually when a triple wants to use an entire RDF statement as object. In lieu of giving the entire statement its own URI, you get a “blank node” in the graph. Which is ugly, but so it goes.
  • 20. URI-izing a triple [diagram: Dorothea Salo → isAuthorOf → “Innkeeper at the Roach Motel”]
  • 21. URI-izing a triple [diagram: http://viaf.org/viaf/21599115/ → isAuthorOf → “Innkeeper at the Roach Motel”]
  • 22. URI-izing a triple [diagram: http://viaf.org/viaf/21599115/ → isAuthorOf → http://digital.library.wisc.edu/1793/22088]
  • 23. URI-izing a triple [diagram: http://viaf.org/viaf/21599115/ → isAuthorOf → http://digital.library.wisc.edu/1793/22088; callout on the property: vocabularies! with URIs!]
  • 24. URI-izing a triple [diagram: http://viaf.org/viaf/21599115/ → isAuthorOf → http://digital.library.wisc.edu/1793/22088]
  • 25. URI-izing a triple [diagram: the property is now dcterms:creator, connecting http://digital.library.wisc.edu/1793/22088 and http://viaf.org/viaf/21599115/]
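The fully URI-ized triple from the build above can be sketched in a few lines of Python. The two URIs come from the slides and `http://purl.org/dc/terms/creator` is the full URI behind `dcterms:creator`; the `Triple` class and `to_ntriples` helper are illustrative names, not part of any library. One assumption worth flagging: `dcterms:creator` points from the resource to its creator, so the article is written here as the subject and the VIAF URI as the object.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, property (predicate), value/object."""
    subject: str
    predicate: str
    obj: str


DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"

triple = Triple(
    subject="http://digital.library.wisc.edu/1793/22088",  # the article
    predicate=DCTERMS_CREATOR,                              # the relationship
    obj="http://viaf.org/viaf/21599115/",                   # the author
)


def to_ntriples(t: Triple) -> str:
    """Write a URI-only triple in N-Triples style: <s> <p> <o> ."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


print(to_ntriples(triple))
```

Nothing in the statement is a text string any more; labels such as the author's name can be fetched by following the URIs.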
  • 26. ... wait, Dublin Core has URIs? Yep.
  • 27. MODS, too. Hey, look, URIs! (this is new in MODS version 3.4)
  • 29. (you should be able to read these diagrams now) Diagram: Stephen J. Miller, “Teaching RDA after the National Implementation Decisions”
  • 30. (even these) Diagram: Stephen J. Miller, “Teaching RDA after the National Implementation Decisions”
  • 31. But... but...
      • What if the same thing has two URIs?
      • Foreseen problem! There are ways for linked data to express URI equivalences... though there are huge arguments about when two URIs are really-truly equivalent.
      • My sense is that this decision is contextual. (AKA: “will Amazon.com use FRBR?”) What’s equivalent for your purposes may not be for mine. And that’s okay!
      • Where do we get URIs from?
      • This will be part of the new cataloging infrastructure a-borning, but the answer works out to “a lot of the same places we already get authority information and catalog records from,” e.g. VIAF.
      • But we’re no longer LIMITED to just those! Key point. Think about ORCID!
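One way to picture "equivalence is contextual" is a Python sketch in which each consumer decides which equivalence statements to accept. In linked data such statements are often expressed with a property like `owl:sameAs`; the `example.org` and `example.net` URIs, the `SAME_AS` list, and the helper names are all hypothetical. The grouping is a simple union-find, chosen for brevity rather than because any RDF tool works this way.

```python
VIAF = "http://viaf.org/viaf/99258155/"
OTHER = "http://example.org/composers/tchaikovsky"  # hypothetical second URI
THIRD = "http://example.net/id/42"                  # hypothetical third URI

# Equivalence claims published by various parties (made up for illustration).
SAME_AS = [(VIAF, OTHER), (OTHER, THIRD)]


def canonical(parent: dict, uri: str) -> str:
    """Follow parent links until reaching the representative of a group."""
    parent.setdefault(uri, uri)
    while parent[uri] != uri:
        uri = parent[uri]
    return uri


def same_thing(a: str, b: str, accepted) -> bool:
    """True if a and b are connected by the equivalence claims we accept."""
    parent = {}
    for left, right in accepted:
        parent[canonical(parent, left)] = canonical(parent, right)
    return canonical(parent, a) == canonical(parent, b)


# A consumer that accepts every claim treats all three URIs as one thing.
print(same_thing(VIAF, THIRD, SAME_AS))
# A pickier consumer that accepts only the first claim does not.
print(same_thing(VIAF, THIRD, SAME_AS[:1]))
```

The same published data yields different answers for different consumers, which is the "what's equivalent for your purposes may not be for mine" point.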
  • 34. But... but...
      • Where’s the record? And standards for the record?
      • The record is what you make it! There’ll be metric tons of data about Tchaikovsky linking to (and thus reachable through) his URL. (Somebody’ll make a list of his pet dogs’ names. Guaranteed. People are funny about dogs.) What’s useful to you, you use. What isn’t, you ignore. That’s how the open world works.
      • If we need to impose rules on the data we’ll be putting out there (and we probably do!), there are ways to do that. We just can’t expect to impose those ways on anybody else. (Though we can put our rules out there for others to follow, and we probably should!)
  • 35. Trust: an unsolved problem
      • Review: what happened with <meta> tags on the web?
      • Right. What’s to stop the same thing happening in a linked-data environment?
      • What’s to stop me from saying I’m Tchaikovsky?
      • The SemWeb people handwaved this for a long time.
      • For our purposes? We’ll pick and choose the vocabularies and domains we trust, I expect, just as we already do.
  • 36. RDF in XML
      • RDF has its own namespace, but no schema (it’s an open-ended universe!).
      • Root element: <rdf:RDF>
      • Vocabulary in any other XML namespace can be shoehorned into RDF triples.
      • But don’t fool yourself: RDF triples and graphs and standard XML vocabulary hierarchies do NOT map cleanly or automatically to each other.
      • So MARC/AACR2 is FAR from the only metadata expression that’s looking at a retooling!
      • Typical triple expression in XML: <rdf:Description rdf:about="{subject}"> <predicate rdf:resource="{object}" /> </rdf:Description> (the predicate is an element from some vocabulary’s namespace; a URI object goes in its rdf:resource attribute, a literal object goes inside the element as text)
      • XML is NOT the only syntax for RDF.
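As a rough illustration of the XML shape, this Python sketch builds one triple as RDF/XML with the standard library's `xml.etree.ElementTree`. The RDF and Dublin Core terms namespace URIs are the real ones; the triple is the article/author example from the earlier slides, with the article assumed to be the subject. It is a sketch of the pattern, not a full RDF/XML serializer.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Ask ElementTree to use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dcterms", DCTERMS_NS)

subject = "http://digital.library.wisc.edu/1793/22088"
obj = "http://viaf.org/viaf/21599115/"

# <rdf:RDF> is the root; each rdf:Description names a subject via rdf:about.
root = ET.Element(f"{{{RDF_NS}}}RDF")
description = ET.SubElement(
    root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
)
# The predicate is a child element; a URI object sits in rdf:resource.
ET.SubElement(
    description, f"{{{DCTERMS_NS}}}creator", {f"{{{RDF_NS}}}resource": obj}
)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```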
  • 37. Retooling tools: GRDDL, SKOS, and OWL
      • GRDDL: Gleaning Resource Descriptions from Dialects of Languages
      • W3C standard for providing a transformation of an existing XML vocabulary into an RDF expression.
      • Once there’s a GRDDL transform, users of the vocabulary need change (almost) nothing! Vocabulary instance + GRDDL transform = RDF!
      • SKOS: Simple Knowledge Organization System
      • RDF data model (plus URIs, of course) for commonly-used controlled-vocabulary structures such as thesauri and subject-heading lists.
      • OWL: Web Ontology Language (yes, I know)
      • SEMWEB NERDS ONLY. Ontologies are serious business.
  • 38. So what’s this “RDA Vocabularies” work that Diane and Karen et al. are doing? Assigning URIs to stuff in RDA, so that systems expecting URI-linked data get it. Seriously. That’s what all the fuss is about.
  • 39. RDFizing RDA
      • What does RDA actually talk about?
      • FRBR model: Group 1, 2, and 3 entities
      • (though Group 1 is still kind of squidgy, really, and some application developers are questioning its usefulness)
      • DCMI model (because life can NEVER be simple)
      • Relationships among entities
      • What do we want to say about them?
      • Are there existing ways to say these things that are good enough for our purposes? Can we reuse them, or at least map to them?
      • When there aren’t, how do we say what we need to in ways that are most useful for the rest of the world?
      • Assigning URIs to it all
  • 40. Model friction
      • FRBR: entity-relationship model
      • ... like relational databases, which is nice
      • not entirely RDFish, which is not quite so nice and has caused head-scratching
      • But head-scratching is normal in this space! Modeling is hard!
      • FRBR does give us some abstractions to model and assign URIs to.
      • And IFLA was supposed to do that... but they haven’t.
      • So the RDA folks have provisionally done it: FRBRoo.
      • When IFLA gets back in the game, formal equivalences will be defined and published between FRBRoo and whatever IFLA comes up with.
      • FRBR isn’t perfect. (Gasp. I know, right?)
      • So sticking strictly to FRBR as we model (relationships particularly) causes problems for music and multimedia catalogers, among others.
  • 41. RDA properties
      • Expressed (URLized) without reference to FRBR.
      • This is also the variant the linked-data web will generally see and use.
      • Which makes a certain amount of sense, because it’s quite possible to understand a lot of bibliographic data intuitively without reference to FRBR.
      • And we’ll never get the whole world to agree on FRBR; we can’t even agree ourselves!
      • Given “subproperties” which are the same thing, only FRBRized (and with their own URLs).
      • So the linked-data web sees a URL for “Book format.”
      • But we, because we are librarians and our systems understand us, understand that “Book format” is intrinsically tied up with a Manifestation.
      • This also covers us when an RDA property may apply to more than one FRBR entity, e.g. Extent: it’s the same property, but two subproperties!
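The property/subproperty arrangement can be sketched in Python as a simple inference step: data recorded with a FRBR-bound subproperty also counts as data for the general property. Every URI below is a hypothetical `example.org` stand-in (the real RDA element URIs are not reproduced here), and the extent value and the choice of two FRBR-bound variants are invented for illustration.

```python
EX = "http://example.org/rda/"  # hypothetical namespace, not the real registry

# subproperty -> general property (the idea behind rdfs:subPropertyOf)
SUBPROPERTY_OF = {
    EX + "extentManifestation": EX + "extent",
    EX + "extentItem": EX + "extent",
}

# What a library system might store: the FRBR-aware subproperty.
library_triples = {
    ("http://example.org/book/1", EX + "extentManifestation", "xii, 300 p."),
}


def generalize(triples):
    """Add the triples implied by walking up the subproperty chain."""
    inferred = set(triples)
    for subject, prop, value in triples:
        while prop in SUBPROPERTY_OF:
            prop = SUBPROPERTY_OF[prop]
            inferred.add((subject, prop, value))
    return inferred


web_view = generalize(library_triples)
for row in sorted(web_view):
    print(row)
```

Library systems keep the precise FRBR-bound statement, while a consumer that only knows the general property still finds the data.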
  • 42. Diagram: Hillmann et al., “RDA Vocabularies: Process, Outcome, Use” D-Lib Magazine. http://www.dlib.org/dlib/january10/hillmann/01hillmann.html
  • 43. The ugliest case: Diagram: Hillmann et al., “RDA Vocabularies: Process, Outcome, Use” D-Lib Magazine. http://www.dlib.org/dlib/january10/hillmann/01hillmann.html
  • 44. ... wait, where did Dublin Core go?
      • Dublin Core, as we all know, is annoyingly vague.
      • Sadly, there’s an awful lot of DC data that we’ll have to map into this model.
      • Ironic but true: librarians invented DC for the larger web, and then became nearly the only people to use it extensively.
      • “Superproperties”: DC terms that map to several RDA properties. (E.g. “creator”)
      • Probably the worst way to solve the problem... except for all the other ways.
  • 45. Disentangling aggregated statements
      • The last refuge of the text string!
      • E.g. publication statements, which aggregate place, publisher name, and date of publication.
      • What if you only WANT one of those three bits of information? ARGH.
      • RDA doesn’t fix this. So RDA Vocabs is trying to.
      • First, URLize each piece separately. Cool. Done. No problem.
      • Then define a “Syntax Encoding Scheme” for the aggregate. Yuck.
      • I have to tell you, this is a heinously ugly “fix.” Given legacy data, though, hard to imagine better.
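A Python sketch of why the pieces are worth keeping separate: rendering an aggregate string from parts is trivial, while recovering parts from a string depends on fragile pattern-matching. The punctuation pattern ("place : publisher, date") is an assumed ISBD-style convention, the sample values are made up, and this is not the actual Syntax Encoding Scheme defined by the RDA Vocabularies work.

```python
import re


def render(place: str, publisher: str, date: str) -> str:
    """Produce the aggregated display string from separately recorded parts."""
    return f"{place} : {publisher}, {date}"


# Going the other way means guessing at punctuation conventions.
PATTERN = re.compile(
    r"^(?P<place>[^:]+?)\s*:\s*(?P<publisher>.+?),\s*(?P<date>\d{4})\.?$"
)


def parse(statement: str):
    """Return the three parts as a dict, or None if the string does not fit."""
    match = PATTERN.match(statement)
    return match.groupdict() if match else None


# Hypothetical publication data.
parts = {"place": "Madison", "publisher": "Example University Press", "date": "2008"}
statement = render(**parts)

print(statement)
print(parse(statement))
# A legacy-style string with brackets and a copyright date defeats the pattern.
print(parse("[S.l. : s.n.], c1999"))
```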
  • 46. There’s more modeling pilpul. A lot of it. I’ll spare you. Like I said, it’s plumbing.
  • 47. What is actually happening?
      • We’re figuring out what we’re talking about.
      • We’re figuring out what we want to say about it.
      • We’re assigning URIs to all those things (abstractions included!) so that we can exchange information with the rest of the web.
  • 48. Summary by Diane Hillmann (http://managemetadata.org/blog/2011/09/08/fine-wine-and-old-fish/)
      • Data should be able to be encoded in a variety of ways, to suit a variety of functions, uses, and systems.
      • Data should be managed at a granular, statement level, but also be available in a variety of record ‘formats’ (with records being understood as primarily an on-the-fly method of aggregating data for a variety of downstream users).
      • Although current data is expressed mostly as text strings, data improvement strategies will be designed to change most of them to URIs as soon as practicable.
      • Data definitions and specifications will be easily available on the web, allowing mapping to be simpler and easier to tweak.
  • 49. Library workflows
      • Given what you now know about RDF and linked data, and your experiences with cataloging, how do you think the practice of cataloging will change in an RDF-based environment?
  • 50. SPARQL
      • With XML data, you generally just dump it on the web and let people figure out what (if anything) to do with it.
      • This means a lot of translator-writing and bandwidth cost.
      • (There’s an XML query language called XQuery, but nobody uses it.)
      • You can do this with RDF too (and some do), but it’s not really ideal.
      • SPARQL: query language for RDF.
      • Looks a LOT like SQL, intentionally so. The hardest thing to get to grips with is namespace declarations, and that’s not really all that hard.
      • “SPARQL endpoint”: URL for a given set of RDF data that you can send queries to and get answers from.
      • How does this change your answer about library workflows?
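To give a feel for the SQL resemblance, here is a short SPARQL query held in a Python string, next to a plain-Python function that answers the same question over a tiny in-memory list of triples. The query is a sketch of standard SPARQL syntax (a `PREFIX` namespace declaration, then `SELECT ... WHERE`); it is not sent to any endpoint. The second work URI and the `works_by` helper are hypothetical.

```python
DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"
AUTHOR = "http://viaf.org/viaf/21599115/"

# What you might send to a SPARQL endpoint: "which works have this creator?"
QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?work
WHERE { ?work dcterms:creator <http://viaf.org/viaf/21599115/> . }
"""

# A tiny stand-in for the endpoint's data (second and third rows are made up).
triples = [
    ("http://digital.library.wisc.edu/1793/22088", DCTERMS_CREATOR, AUTHOR),
    ("http://example.org/work/2", DCTERMS_CREATOR, "http://example.org/person/9"),
    ("http://example.org/work/3", DCTERMS_CREATOR, AUTHOR),
]


def works_by(data, creator_uri):
    """Match the pattern (?work, dcterms:creator, creator_uri) against data."""
    return [s for (s, p, o) in data if p == DCTERMS_CREATOR and o == creator_uri]


print(QUERY.strip())
print(works_by(triples, AUTHOR))
```

The `?work` variable in the query plays the role of the list comprehension's `s`: the one position in the triple pattern left open.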
  • 51. Why it’s worth doing
  • 52. Your data ages like [photo: red wine bottle] Photo: Matthew, “red wine bottle 1” http://www.flickr.com/photos/falcon1961/3408961521/ CC-BY
  • 53. Your software applications age like [photo: fish] Photo: amanda mandy, “peixe pelo todo.” http://www.flickr.com/photos/polaina/3128038858/ CC-BY
  • 54. ... did that help?
  • 55. I worked from...
      • Hillmann, Diane et al. “RDA Vocabularies: Process, Outcome, Use.” D-Lib Magazine 16:1/2 (Jan/Feb 2010). http://www.dlib.org/dlib/january10/hillmann/01hillmann.html