An Introduction to Linked Data for Librarians
Cliff Landis
Special Libraries Association - Georgia Chapter
2018-06-28
The Metadata Problem: Computers are stupid
(they don’t understand meanings or relationships)
• What we see • What the computer sees
<head></head>
<body>
<background>
<image>
<headline>Text Text Text</headline>
<tool></tool>
<text>Text Text Text</text>
</body>
https://www.auctr.edu/
It all starts with proper goat anatomy...
http://knowyourmeme.com/photos/956722-proper-anatomy
All metadata can be described as “triples” of SUBJECT - PREDICATE - OBJECT
SUBJECT | PREDICATE | OBJECT
Goat | speciesName | Capra aegagrus hircus
Goat | conservationStatus | Domesticated
Goat | numberOfBreeds | 300+
Goat | distribution | Global
Baby goat | movementStatus | Handsome_pose!
Baby goat | hasCutenessLevel | ZOMG!!!!1!
https://pixabay.com/en/goat-kid-farm-cute-baby-adorable-2403566/ https://www.wikidata.org/wiki/Q2934
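A minimal sketch of the same idea in Python, assuming each triple is stored as a plain (subject, predicate, object) tuple. The helper name `describe` is hypothetical and only for illustration; real linked data would use URIs rather than bare strings.

```python
# Each triple is a (subject, predicate, object) tuple, copied from the slide.
TRIPLES = [
    ("Goat", "speciesName", "Capra aegagrus hircus"),
    ("Goat", "conservationStatus", "Domesticated"),
    ("Goat", "numberOfBreeds", "300+"),
    ("Goat", "distribution", "Global"),
    ("Baby goat", "movementStatus", "Handsome_pose!"),
    ("Baby goat", "hasCutenessLevel", "ZOMG!!!!1!"),
]


def describe(subject, triples):
    """Return {predicate: object} for every triple about `subject` (hypothetical helper)."""
    return {p: o for s, p, o in triples if s == subject}


if __name__ == "__main__":
    for predicate, obj in describe("Goat", TRIPLES).items():
        print(f"Goat -- {predicate} --> {obj}")
```

Because every statement has the same three-part shape, one small function can answer "what do we know about X?" for any subject.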
Challenge: OED has 19 definitions of “Plant”
Let’s make the Web a little smarter!
A young tree, shrub, vegetable,
or flower newly planted, or
intended for planting; a set, a
cutting, a seedling. Now chiefly
Eng. regional (midl. and south.)
and Irish English (north.): a young
cabbage plant. (from the OED)
=
Step 1: Disambiguation
Oh, “Plant!” You mean…
• http://www.wikidata.org/wiki/Q756
• http://dbpedia.org/ontology/Plant
• http://id.loc.gov/authorities/subjects/sh85102839
• https://www.jstor.org/topic/plants/
• http://vocab.getty.edu/aat/300132360
[Diagram: “Plant” linked to DBpedia, Wikidata, LCSH, JSTOR Topics, and AAT]
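One way to picture disambiguation, as a rough sketch: treat the five URIs listed on this slide as identifiers for one concept, so that two records pointing at any of them can be recognized as being about the same "plant". The names `PLANT_IDENTIFIERS` and `same_concept` are hypothetical, and treating these five URIs as equivalent is an assumption made for illustration.

```python
# URIs copied from the slide; grouping them as one concept is an assumption.
PLANT_IDENTIFIERS = {
    "Wikidata": "http://www.wikidata.org/wiki/Q756",
    "DBpedia": "http://dbpedia.org/ontology/Plant",
    "LCSH": "http://id.loc.gov/authorities/subjects/sh85102839",
    "JSTOR Topics": "https://www.jstor.org/topic/plants/",
    "AAT": "http://vocab.getty.edu/aat/300132360",
}


def same_concept(uri_a, uri_b, identifiers=PLANT_IDENTIFIERS):
    """True if both URIs are known identifiers for the concept (hypothetical helper)."""
    known = set(identifiers.values())
    return uri_a in known and uri_b in known


if __name__ == "__main__":
    print(same_concept(PLANT_IDENTIFIERS["Wikidata"], PLANT_IDENTIFIERS["LCSH"]))
```

A string such as "plant" is ambiguous; a URI is not, which is why the comparison is done on identifiers rather than labels.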
Step 2: Linking relationships
SUBJECT | PREDICATE | OBJECT
Plant | Taxon Name | Plantae
Plant | Taxon Rank | Kingdom
Plant | BNCF Number | 1037
Kingdom | Part of | Subdomain
Kingdom | Instance of | Taxonomic Rank
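The linking step can be sketched as a tiny graph walk, assuming the slide's labels are read as the five statements in `LINKS` below (the pairing of "Kingdom" with "Part of" and "Instance of" is an interpretation of the diagram). The function `follow` is a hypothetical helper: it starts at one subject and keeps going wherever an object is itself the subject of further statements.

```python
# Statements read off the slide's diagram; the Kingdom rows are an interpretation.
LINKS = [
    ("Plant", "Taxon Name", "Plantae"),
    ("Plant", "Taxon Rank", "Kingdom"),
    ("Plant", "BNCF Number", "1037"),
    ("Kingdom", "Part of", "Subdomain"),
    ("Kingdom", "Instance of", "Taxonomic Rank"),
]


def follow(start, links, depth=2):
    """Collect statements reachable from `start` within `depth` hops (hypothetical helper)."""
    found, frontier = [], [start]
    for _ in range(depth):
        next_frontier = []
        for s, p, o in links:
            if s in frontier:
                found.append((s, p, o))
                next_frontier.append(o)  # an object may be the subject of other triples
        frontier = next_frontier
    return found


if __name__ == "__main__":
    for s, p, o in follow("Plant", LINKS):
        print(f"{s} -- {p} --> {o}")
```

Starting from "Plant", the second hop reaches the statements about "Kingdom", which is the practical meaning of data being linked.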
Aiming for “5 star” Linked Open Data
● Available on the Web (whatever format) with an open license (CC0, PD, ODC, RightsStatements.org)
● As above, but as machine-readable structured data (Excel spreadsheets)
● As above, but in a non-proprietary format (CSV instead of Excel)
● As above, but use URIs to denote things, so that people can point at your stuff (persistent URIs, RDF, SPARQL)
● As above, but linking your data out to other data to provide context (Linked Open Data)
http://5stardata.info/
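A hedged sketch of the move from the third star (CSV) to the fourth (URIs): each CSV row becomes triples whose subject and predicate are URIs. The `http://example.org/` namespace, the column names, and the function `csv_to_triples` are all hypothetical; a real project would mint persistent URIs under a domain it controls and reuse existing vocabularies for predicates.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not a real service

# Hypothetical one-row CSV export; the homepage URL comes from the slides.
SAMPLE_CSV = """name,homepage
Robert W. Woodruff Library,https://www.auctr.edu/
"""


def csv_to_triples(text, base=BASE):
    """Turn each CSV row into (subject URI, predicate URI, value) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        # Mint a subject URI from the name column (assumed to be unique).
        subject = base + "library/" + quote(row["name"].replace(" ", "_"))
        for column, value in row.items():
            if column != "name":
                triples.append((subject, base + "property/" + column, value))
    return triples


if __name__ == "__main__":
    for triple in csv_to_triples(SAMPLE_CSV):
        print(triple)
```

The fifth star would follow by replacing the hypothetical predicate URI with a shared one and by linking the subject to an existing identifier for the same library.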
Linked Open Data (2008-02-28)
https://lod-cloud.net/versions/2008-02-28/lod-cloud.png
Linked Open Data (2018-04-30)
"Linking Open Data cloud diagram 2018, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/"
1,184 datasets with 15,993 links
Library of Congress Subject Headings
http://lod-cloud.net/clouds/lod-cloud.svg
http://lod-cloud.net/clouds/lod-cloud.svg
DBpedia
http://colab.cim3.net/file/work/SICoP/2007-09-20/MDavis09202007.pdf
Neat! But what does that have to do
with libraries?
https://www.auctr.edu/
https://www.google.com/search?q=robert+w+woodruff+library
https://duckduckgo.com/?q=robert+w+woodruff+library
Facts can be described with metadata
AUC Robert W. Woodruff Library has a website at https://www.auctr.edu/
SUBJECT | PREDICATE | OBJECT
AUC Robert W. Woodruff Library | has a website | https://www.auctr.edu/
http://dbpedia.org/page/Robert_W._Woodruff_Library,_Atlanta_University_Center | http://xmlns.com/foaf/spec/#term_homepage | https://www.auctr.edu/
(labels: Robert W. Woodruff Library, Atlanta University Center | Homepage)
Triples, Triples Everywhere!!!
SUBJECT | PREDICATE | OBJECT
http://dbpedia.org/page/Robert_W._Woodruff_Library,_Atlanta_University_Center | foaf:homepage | https://www.auctr.edu
http://dbpedia.org/page/Robert_W._Woodruff_Library,_Atlanta_University_Center | foaf:isPrimaryTopicOf | https://en.wikipedia.org/wiki/Robert_W._Woodruff_Library,_Atlanta_University_Center
http://dbpedia.org/page/Robert_W._Woodruff_Library,_Atlanta_University_Center | dbp:established | 1982 (xsd:integer)
http://dbpedia.org/page/Robert_W._Woodruff_Library,_Atlanta_University_Center | georss:point | 33.75138888888889 -84.41333333333333
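The predicates on this slide use prefixed names such as foaf:homepage, which are shorthand for full URIs. A minimal sketch of the expansion, assuming the usual namespace for FOAF (http://xmlns.com/foaf/0.1/) and for DBpedia properties (http://dbpedia.org/property/); the function `expand` is a hypothetical helper, and the georss prefix is left out here.

```python
# Prefix-to-namespace table; only the two prefixes used below are included.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dbp": "http://dbpedia.org/property/",
}


def expand(name, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full URI; return anything else unchanged."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name  # already a full URI, or an unknown prefix


if __name__ == "__main__":
    for name in ("foaf:homepage", "foaf:isPrimaryTopicOf", "dbp:established"):
        print(name, "=>", expand(name))
```

The prefix is only a writing convenience: two datasets that spell the predicate differently still agree once both are expanded to the same URI.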
Formats:
• RDF/XML
• JSON-LD
• Turtle
• N-Triples
• RDFa
Query:
• SPARQL
• API
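To make the list of formats concrete, here is a rough sketch that writes the homepage triple from the earlier slide in two of them, N-Triples and JSON-LD, using only the Python standard library. The subject and object URIs are copied from the slides; the full predicate URI assumes foaf:homepage expands to http://xmlns.com/foaf/0.1/homepage. The function names are hypothetical, and a production workflow would more likely use an RDF library than hand-built strings.

```python
import json

# Subject and object copied from the slides; predicate URI is the assumed FOAF expansion.
SUBJECT = "http://dbpedia.org/page/Robert_W._Woodruff_Library,_Atlanta_University_Center"
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"
HOMEPAGE = "https://www.auctr.edu/"


def to_ntriples(s, p, o):
    """One N-Triples line: three URIs in angle brackets, ending with ' .'"""
    return f"<{s}> <{p}> <{o}> ."


def to_jsonld(s, p, o):
    """The same statement as a JSON-LD node object keyed by the full predicate URI."""
    return json.dumps({"@id": s, p: {"@id": o}}, indent=2)


if __name__ == "__main__":
    print(to_ntriples(SUBJECT, FOAF_HOMEPAGE, HOMEPAGE))
    print(to_jsonld(SUBJECT, FOAF_HOMEPAGE, HOMEPAGE))
```

Both outputs carry the same subject, predicate, and object; the formats differ only in syntax.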
Leveraging our unique resources
Pratt: Linked Jazz
http://3.bp.blogspot.com/-yZSvxCMvZos/Uxjt3mVK87I/AAAAAAAAAFc/eIObi1hzeDU/s1600/Linked+Jazz.jpg
https://journal.code4lib.org/articles/8670
Laurentian U. catalog: Wikidata infocards
http://journal.code4lib.org/articles/13424
UTSC: Dragoman Renaissance Research Platform
http://dragomans.digitalscholarship.utsc.utoronto.ca/sites/default/files/inline-images/004.png
AUC RWWL: Omeka S (alpha)
https://omeka.org/s/
https://omeka.org/s/
Controlled Vocabularies as LOD URIs
https://omeka.org/s/modules/ValueSuggest/
https://omeka.org/s/docs/developer/key_concepts/api/
Neat! But seriously, what does that
have to do with my library?
Don’t worry….We’re getting there….
https://www.auctr.edu/
Improve findability & outreach with Wikipedia
https://en.wikipedia.org/wiki/Robert_W._Woodruff_Library,_Atlanta_University_Center
...But follow Wikipedia COI Best Practices!
● Infobox
● Abstract
● Wikidata Item*
● References
http://live.dbpedia.org/page/Robert_W._Woodruff_Library,_Atlanta_University_Center
https://www.google.com/search?q=robert+w+woodruff+library
Don’t just link out...remember to link in
http://snaccooperative.org/view/83377325
http://www.worldcat.org/oclc/51865434
We’re sitting on a data goldmine
(but it has to be mined!)
• We can leverage our libraries’ unique content to provide linked open data for others to remix and reuse
• We can provide data we generate (statistics, analytics, door count, etc.) as linked open data for us to mine, and for others to mine nationally and globally
• We can use tools to generate information visualizations of our own data and that of others
...but I’m also a realist
• Linked open data is open. Once you publish data, you can’t control how it will be used -- like walking in open spaces, you can’t control who will incidentally photograph you
• Early adoption at large scales is often slow, difficult, and expensive (but it’s finally starting to get easier and cheaper!)
• Managing linked open data is like any other metadata work -- it requires work to set up, and regular maintenance to keep up
• As in all areas of the information economy, problems with quality control and authority control (e.g. “fake news”) are migrating from old media to new media -- requiring constant vigilance to ensure you’re using trustworthy sources
Okay, okay, so now what?
Where to go from here
https://www.auctr.edu/
Semantic Web for beginners
WATCH
• Linked Open Data – What is it? – An introduction to LOD for memory organization workers
• Tim Berners-Lee: The Next Web – Creator of the Web talks about its next stage of evolution
• Linked Open Jazz – Using oral histories to explore relationships between artists
BROWSE
• Wikidata.org – browse to get a feel for the subject-predicate-object relationships
• DBpedia.org – browse to get a feel for the use of LOD metadata standards
• lod-cloud.net – browse LOD datasets available for exploration and use
• SNAC Cooperative – browse how historical figures are connected through separate archival collections
• LD4PE – browse Linked Data for Professional Education to find learning resources
https://www.flickr.com/photos/jenosaur/4051305996/
