Linked Data 101:
Getting Caught in the Semantic Web
Morgan Briles
NOTHING exists in a vacuum
http://davidlankes.org
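[Diagram: "Has Creator" links The Atlas of New Librarianship to R. David Lankes; the graph then widens to Syracuse University, Jill Hurst-Wahl, and New York]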
What is Linked Data?
A set of best practices for publishing structured data on the web
1. Use URIs to name things.
2. Use HTTP URIs so that things can be referred to and looked up ("dereferenced") by people and user agents.
3. When someone looks up a URI, provide useful information, using open Web standards such as RDF and SPARQL.
4. Include links to other related things using their URIs when publishing on the Web.
https://www.w3.org/TR/ld-glossary/#linked-data-principles
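Principles 2 and 3 can be sketched in Python: a client looks up an HTTP URI and uses the Accept header to say which RDF format it would like back. This is a minimal sketch using only the standard library. The helper name build_rdf_request is hypothetical, and whether a given server honours the Accept header is an assumption, not something the slides state. The request is only built here, not sent.

```python
from urllib.request import Request


def build_rdf_request(uri: str, rdf_format: str = "text/turtle") -> Request:
    """Build (but do not send) an HTTP GET request that asks for RDF.

    build_rdf_request is a hypothetical helper for illustration only.
    """
    return Request(uri, headers={"Accept": rdf_format})


# The VIAF URI is the one used on the triples slide.
request = build_rdf_request("http://viaf.org/viaf/78411319")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))

# To actually dereference the URI you would call
# urllib.request.urlopen(request).read(); that needs network access.
```

Building the request separately from sending it keeps the example runnable offline while still showing where the "useful information in a standard format" part of principle 3 is negotiated.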
Organized in a Graph
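As a rough illustration of what "organized in a graph" means, the relationships from the running example can be stored as labelled edges and followed from node to node. This is only a sketch: the edge labels are informal English phrases taken from the speaker notes, and the function names build_graph and reachable are made up for this example.

```python
from collections import defaultdict

# (subject, relationship, object) edges from the talk's running example.
EDGES = [
    ("The Atlas of New Librarianship", "has creator", "R. David Lankes"),
    ("R. David Lankes", "works at", "Syracuse University"),
    ("Jill Hurst-Wahl", "works at", "Syracuse University"),
    ("R. David Lankes", "knows", "Jill Hurst-Wahl"),
    ("Syracuse University", "is in", "New York"),
]


def build_graph(edges):
    """Index the edges by subject so outgoing links are easy to follow."""
    graph = defaultdict(list)
    for subject, relationship, obj in edges:
        graph[subject].append((relationship, obj))
    return graph


def reachable(graph, start):
    """Return every node that can be reached from start by following links."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


graph = build_graph(EDGES)
print(sorted(reachable(graph, "The Atlas of New Librarianship")))
```

Starting from the book, the traversal reaches the author, his colleague, the university, and the state, which is the "Wikipedia tangent" idea in miniature. No node has to sit under a single parent, which is why a hierarchy is a poor fit.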
RDF statements AKA Triples
Subject -> Predicate (Has Creator) -> Object
RDF statements AKA Triples
Subject: http://www.worldcat.org/oclc/894993874
Predicate (Has Creator): http://purl.org/dc/elements/1.1/creator
Object: http://viaf.org/viaf/78411319
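To make the subject/predicate/object structure concrete, here is a small sketch that holds the three URIs from this slide in a Python tuple and writes them out as one N-Triples line. The Triple class and the to_ntriples helper are illustrative names, not part of any RDF library, and the helper only handles the case where all three parts are URIs (no literals).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; illustrative type, not from an RDF library."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Serialize a triple whose three parts are all URIs as one N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


atlas_creator = Triple(
    subject="http://www.worldcat.org/oclc/894993874",     # the Atlas (OCLC number)
    predicate="http://purl.org/dc/elements/1.1/creator",  # Dublin Core "creator"
    obj="http://viaf.org/viaf/78411319",                  # the author (VIAF number)
)

print(to_ntriples(atlas_creator))
```

In practice a library such as rdflib would handle parsing, literals, and escaping; the point here is only that a triple is three identifiers in a fixed order.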
Linked Data Vocabularies... there are others
FOAF (Friend of a Friend): Used for people and relationships. Ex. knows, Person, Group, member, firstName
DC Terms (Dublin Core Metadata Schema): Describing creative works. Ex. creator, publisher, subject
Schema (Schema.org): Markup for webpages, developed by search engines, covering commonly used nouns to improve web searching. Ex. Thing, Place, Book
SKOS (Simple Knowledge Organization System): Specifies relationships between entities. Ex. broader, prefLabel, Concept, definition
RDFS (Resource Description Framework Schema): Describes classes and properties, i.e. what type of thing something is. Ex. Class, resource, property, type, domain
Examples of Triples
Jill Hurst-Wahl | foaf:knows | R. David Lankes
Jill Hurst-Wahl | rdf:type ("is a") | foaf:Person
Cat | skos:broader | mammal
The Atlas of New Librarianship | rdf:type | schema:Book
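Terms such as foaf:knows are shorthand: the part before the colon stands for a vocabulary's namespace URI, and the part after it is appended to that namespace. The sketch below expands the prefixed terms used on this slide. The namespace URIs are the published ones for each vocabulary; the expand function itself is a hypothetical helper written for this example.

```python
# Published namespace URIs for the vocabularies used on the slide.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "schema": "http://schema.org/",
}


def expand(prefixed_name: str) -> str:
    """Turn a prefixed name like 'foaf:knows' into a full URI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local_name


for term in ["foaf:knows", "rdf:type", "foaf:Person", "skos:broader", "schema:Book"]:
    print(term, "->", expand(term))
```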
Serializations of Resource Description Framework
RDF/XML: Earliest RDF syntax, basically writing triples in XML; messy
N-Triples: Simplest, one triple per line
Turtle: A more human-readable relative of N-Triples
Microdata and RDFa: Useful for embedding information into HTML webpages; helps with SEO
JSON-LD: Based on JSON (JavaScript Object Notation)
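To show how the same statement looks in two of these formats, the sketch below writes the "Atlas has creator Lankes" triple as Turtle and as JSON-LD using only the Python standard library. The strings are assembled by hand for illustration; the prefix label dc and the function names as_turtle and as_jsonld are choices made for this example, and a real project would normally let an RDF library do the serializing.

```python
import json

BOOK = "http://www.worldcat.org/oclc/894993874"
AUTHOR = "http://viaf.org/viaf/78411319"
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace


def as_turtle() -> str:
    """Turtle: declare a prefix once, then write the triple compactly."""
    return f"@prefix dc: <{DC}> .\n<{BOOK}> dc:creator <{AUTHOR}> ."


def as_jsonld() -> str:
    """JSON-LD: the same triple as a JSON object with an @context."""
    document = {
        "@context": {"dc": DC},
        "@id": BOOK,                     # the subject
        "dc:creator": {"@id": AUTHOR},   # predicate -> object (a URI, not a string)
    }
    return json.dumps(document, indent=2)


print(as_turtle())
print(as_jsonld())
```

Both outputs carry the same single statement; only the surface syntax differs.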
RDF Query Language is...
SPARQL!!!
https://flic.kr/p/5CEK6Y
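A SPARQL query is essentially a triple with blanks in it: variables (written with a leading ?) stand for the parts you want filled in. The toy matcher below imitates that idea over the example triples from the earlier slide. It is not a SPARQL engine, the match function is invented for this sketch, and plain strings stand in for URIs.

```python
TRIPLES = [
    ("Jill Hurst-Wahl", "foaf:knows", "R. David Lankes"),
    ("Jill Hurst-Wahl", "rdf:type", "foaf:Person"),
    ("Cat", "skos:broader", "mammal"),
    ("The Atlas of New Librarianship", "rdf:type", "schema:Book"),
]


def match(pattern, triples):
    """Yield one {variable: value} binding per triple that fits the pattern.

    Terms starting with "?" are variables; anything else must match exactly.
    """
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings


# Rough equivalent of: SELECT ?thing ?class WHERE { ?thing rdf:type ?class }
for row in match(("?thing", "rdf:type", "?class"), TRIPLES):
    print(row)
```

Real SPARQL adds prefixes, joins across several patterns, filters, and more, but the core move is the same pattern matching against triples.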
Applications and Datasets
DBpedia
Geonames
Europeana
Data.gov.uk
Bio2RDF
Six Degrees of Francis Bacon
Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
What does this mean for libraries?
The end of MARC... eventually
Getting library catalogs on the web
LOC’s BIBFRAME initiative
If Linked Data is so great why isn’t everything Linked Data?
A lot of data and datasets are not available on the web
The hesitance to make data open
Privacy concerns, less anonymity on the web
Computers don’t have the same capability to understand meaning like people
Want to learn more? Check these sites out!
https://www.w3.org/standards/semanticweb/data
https://www.loc.gov/bibframe/
http://lod-cloud.net/versions/2014-08-30/lod-cloud.svg
http://lov.okfn.org/dataset/lov/
http://www.sixdegreesoffrancisbacon.com/
http://schema.org/docs/gs.htm
http://www.linkeddatatools.com/index.php
https://www.w3.org/DesignIssues/LinkedData.html
THANKS FOR COMING!
Questions?
Comments?
Email: mwbriles@syr.edu


Editor's Notes

  • #2 Hi guys, I’m Morgan Briles, a 2nd year LIS student, and I’m going to be talking to you about my favorite topic in librarianship: linked data. Linked Data and the Semantic Web are two terms that you often hear together. The goal of the semantic web is to make data that means something to humans machine-readable. The semantic web is a product of applying Linked Data principles to the World Wide Web. This is how I understand it; some people might define these two terms slightly differently. Also, in some circles the term “semantic web” rings of unpragmatic idealism, so it has fallen out of favor for some. Just FYI.
  • #3 A thing pretty much always has one or more relationships with other things in the world. This includes our favorite book: The Atlas of New Librarianship is a thing in the world, and because it’s a thing in the world it is related to other things in the world, like R. David Lankes. The Atlas of New Librarianship has a creator, R. David Lankes. What if we zoom out a little bit?
  • #4 Dave Lankes is also a thing in the world, and he has relationships to other things in the world. Dave Lankes works at Syracuse University (for now at least). Jill Hurst-Wahl also works for Syracuse University. Dave Lankes knows Jill Hurst-Wahl. Syracuse University is in New York. Jill Hurst-Wahl is a creator of “The Information and Knowledge Professional’s Career Handbook”. As you can see, there is a whole web of relationships that can be defined from the LIS program at the iSchool. Have you ever been on a Wikipedia tangent, where you’re reading an article and you’re curious about something, and there’s a hyperlink, and you click it to read that article, which then leads you to something else? Well, that’s essentially what linked data and the semantic web are all about.
  • #5 This is a set of principles laid out by Tim Berners-Lee, who, as you hopefully remember, invented the World Wide Web. Here are his four rules for making linked open data. The goal of linked data and the semantic web is for machines to be able to understand things that people can. URI stands for Uniform Resource Identifier; the kind we use every day is a URL, and ideally it is also something permanent, like a permalink. In practice, almost every URI you will run into is a URL. Then: make it live online, so that when people try to access the URI they can discover what it means. And ideally the description at a URI should link to other URIs, so that people can follow the trail and, in an ideal world, follow relationships across the web.
  • #6 This kind of data is organized in a graph. I’m not talking about the x-y axes that we know from math, but a conceptual graph of things and the connections between them. This is different from the two other data organization structures we may be familiar with: hierarchical and relational. A hierarchical structure doesn’t work here: in our example, Jill can’t simply be placed under Syracuse University, because in another relationship, like the one with her book, she is the primary agent. Relational models with tables also fall short, since there is no primary key. So it works just like the graph that we built over the last couple of slides. If you can understand the diagram I just showed you, then you can understand linked data.
  • #7 Linked data is written using a data model called the Resource Description Framework, or RDF for short, and each statement takes the form of what’s called a triple. Each triple has a subject, a predicate, and an object. So the subject is The Atlas of New Librarianship, the predicate is “has creator”, and the object is R. David Lankes. It can be helpful to think of an RDF statement/triple as a sentence with a subject, verb, and object: the Atlas is the subject, “has creator” is the verb, and Dave Lankes is the object of the sentence.
  • #8 We humans can read and understand this relationship, but a machine cannot; it just sees strings. So in order to make this relationship understandable to a computer, we need to define all of these parts for it. We do this with URIs. To make it linked data, we need to provide a URI, meaning a unique, permanent identifier, for each of the three components. Luckily for us, there are a lot of online resources that can serve as URIs for things in the world, including controlled vocabularies that already exist. For example, a URI that we can use for the Atlas is its OCLC number. If you aren’t familiar with WorldCat, or OCLC, it is basically the largest library catalog in the world, combining thousands of individual library catalogs. The Virtual International Authority File, or VIAF, is an international authority file that has unique identifiers for authors, works, etc., so for Dave we’ll use his unique VIAF number. The Atlas is a book, and Dave Lankes is a person: easy, unique nouns. But what about a URI for the concept “has creator”? There are actually many vocabularies that are defined and exist with URIs for the purpose of linked data and metadata creation. The Dublin Core metadata schema has a concept of “has creator”, so we will use the URI for that concept. The object of a triple can also be a plain value without a URI, which is necessary for things like dates; those are called literals.
  • #9 FOAF is Friend of a Friend, terms about people. Schema.org was developed by the major search engines. DC Terms is Dublin Core.
  • #10 So here are some other examples of triples and how you would write these relationships. rdf:type in English can basically be read as “is a…” depending on
  • #12 The structure is similar to SQL, if you know it, but I would not say that knowing SQL means you get SPARQL just by looking at it. It’s kind of like SQL written in a triple format.
  • #13 DBpedia is data extracted from Wikipedia in a linked data format. GeoNames is a geographical database of place names compiled from many sources. Europeana is analogous to the Digital Public Library of America: a large, publicly available digital collection of art, pictures, sounds, etc. from the continent. Data.gov.uk is data that the British government collects and makes publicly available; most new datasets are being published in RDF and JSON in addition to formats like CSV and HTML. Bio2RDF is linked data for the life sciences. Six Degrees of Francis Bacon is probably my favorite. It is a map of Tudor/Stuart England social networks, mapping relationships between people, starting with Sir Francis Bacon. There was a presentation on this last fall where the creators spoke about it, and it was super cool. (Do cool demo.)
  • #14 Here are published linked-open data sets and vocabularies as of 2014
  • #15 What does all this mean for libraries? First of all, the end of MARC... eventually. People have been crying wolf about the death of MARC for a long time, but library schools are still teaching it and catalogers are still using it. If you’ve googled anything in the last 4 years, you may have noticed that on the right-hand side there is a panel of graph search results, with images and key information. This comes from an initiative by the major search engines (Yahoo, Google, Bing), which created schema.org in 2011 to give web developers a way to embed information about their website (for example a library’s name, address, city, hours of operation, etc.) into the HTML as microdata. This helps search engines and web crawlers pick up this information, which is used to create the graph that you see. The Library of Congress, to go along with RDA and FRBR, is in the process of creating something similar in concept to schema.org that better fits library needs: BIBFRAME. The eventual goal of BIBFRAME is to catalog using linked data principles and replace MARC. Though this cloud is very small, and there is no way that you can read everything, if you look at it closely you can see that a lot of the datasets are European. Libraries in the UK, Scandinavia, and Germany have been leading the way in integrating RDF into their catalogs. http://lj.libraryjournal.com/2015/02/technology/ending-the-invisible-library-linked-data/#_
  • #16 Rome wasn’t built in a day, y’all. There are bajillions of datasets that aren’t online, and for Linked Data to work, the data must be published online. Similar to other issues that have come up now that information can be shared on the internet, there is hesitance about publishing data that can be used freely by all. http://www.semanticfocus.com/blog/entry/title/5-problems-of-the-semantic-web/ If you’re someone who is actively trying not to have your information indexed on the web, linked data can raise some concerns for you. https://www.w3.org/DesignIssues/LinkedData.html http://blog.soton.ac.uk/webteam/2011/07/17/linked-data-vs-open-data-vs-rdf-data/ http://milicicvuk.com/blog/2011/07/26/problems-of-linked-data-14-identity/
  • #17 I’m not a total professional on this, but I can try to answer your questions and/or direct you to people who can. I