Metadata is back!
Bernhard Haslhofer - Cornell University
JCDL 2011 - Semantic Web Technologies for Libraries and Readers Workshop
Thursday, June 16th 2011
schema.org Book Example
<img src="catcher-in-the-rye-book-cover.jpg" />
The Catcher in the Rye - Mass Market Paperback
by <a href="/author/jd_salinger.html">J.D. Salinger</a>
Publisher: Little, Brown, and Company - May 1, 1991
Semantic Web - Early Vision
"Mom needs to see a specialist and then has to
have a series of physical therapy sessions.
Biweekly or something. I'm going to have my
agent set up the appointments."
“The Semantic Web will bring structure to the
meaningful content of Web pages, creating an
environment where software agents roaming
from page to page can readily carry out
sophisticated tasks for users”
“For the semantic web to function, computers
must have access to structured collections of
information and sets of inference rules that
they can use to conduct automated reasoning.”
Semantic Web Technologies
User Interface & Applications
Data Model: RDF
RDFa & Microformats
• Mechanisms to embed structured metadata in Web
• Define and/or reuse (X)HTML attributes to augment
information in Websites with machine-readable
• There is lots of information on the Web
• ... valuable information that can be (re-)used
• information is usually expressed in the form of HTML
• the underlying raw data are locked in closed data silos (mostly
Why Linked Data?
• The Web is successful because it provides
• Uniform encoding (HTML)
• Uniform addressing (URI)
• Uniform transportation (HTTP)
for the exchange of documents.
• Why not apply the same mechanism to the underlying
What is Linked Data?
• A pragmatic method to build a Web of Data
• Architectural style based on SW standards
• Intelligent agents not primary focus
• Distinguish between non-information and information
• Sample non-information resource
• Sample information resource
• http://dbpedia.org/page/The_Catcher_in_the_Rye - HTML
• http://dbpedia.org/data/The_Catcher_in_the_Rye - RDF
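The split between a non-information resource and its HTML and RDF representations can be sketched as a small content-negotiation routine. This is only an illustration: the `/resource/` URI pattern is an assumption modeled on DBpedia's public URI scheme, and `resolve` is a hypothetical helper that simulates the redirect locally rather than calling any real service.

```python
# Sketch of 303-based content negotiation for a non-information resource.
# Assumption: the thing itself is named by a /resource/ URI, as in DBpedia's
# public URI scheme. resolve() is a hypothetical helper, not a DBpedia API.

BASE = "http://dbpedia.org"


def resolve(uri: str, accept: str) -> tuple[int, str]:
    """Return (status, location) for a request to a non-information resource."""
    prefix = BASE + "/resource/"
    if not uri.startswith(prefix):
        raise ValueError("not a non-information resource URI: " + uri)
    name = uri[len(prefix):]
    # RDF-aware clients are sent to the data document, all others to the page.
    if "application/rdf+xml" in accept:
        return 303, f"{BASE}/data/{name}"
    return 303, f"{BASE}/page/{name}"


book = BASE + "/resource/The_Catcher_in_the_Rye"
print(resolve(book, "text/html"))
print(resolve(book, "application/rdf+xml"))
```

The book itself cannot be sent over HTTP, so the server answers with 303 See Other and points the client at a document that describes it.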
Microdata
• A very young HTML5 proposal that extends
Microformats and addresses its shortcomings
• Items are created within an itemscope
• Every item is assigned an arbitrary number of
• Uses global identifiers for typing and naming items
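To make the bullets above concrete, here is a rough sketch that annotates the book example from the earlier slide with `itemscope`, `itemtype` and `itemprop`, then reads the item back with Python's standard `html.parser`. The choice of properties (`name`, `author`, `publisher`, `datePublished`, `image`) is an assumption for illustration, and `MicrodataParser` is a hypothetical, deliberately simplified reader: it handles one flat item and does not implement the full microdata extraction rules.

```python
from html.parser import HTMLParser

# Book example annotated with microdata. The itemprop assignments below are
# assumptions for illustration, not taken from the original slides.
PAGE = """
<div itemscope itemtype="http://schema.org/Book">
  <img itemprop="image" src="catcher-in-the-rye-book-cover.jpg" />
  <span itemprop="name">The Catcher in the Rye</span> - Mass Market Paperback
  by <a href="/author/jd_salinger.html"><span itemprop="author">J.D. Salinger</span></a>
  Publisher: <span itemprop="publisher">Little, Brown, and Company</span> -
  <time itemprop="datePublished" datetime="1991-05-01">May 1, 1991</time>
</div>
"""

# Elements whose property value comes from an attribute instead of their text.
ATTR_VALUES = {"img": "src", "a": "href", "time": "datetime"}


class MicrodataParser(HTMLParser):
    """Simplified reader for a single, non-nested microdata item."""

    def __init__(self):
        super().__init__()
        self.item = {"type": None, "properties": {}}
        self._prop = None  # text-valued property currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            # The itemtype URL is the global identifier for the item's type.
            self.item["type"] = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if tag in ATTR_VALUES and ATTR_VALUES[tag] in attrs:
            self.item["properties"][prop] = attrs[ATTR_VALUES[tag]]
        else:
            self._prop, self._text = prop, []

    def handle_data(self, data):
        if self._prop is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Limitation: any end tag closes the open property (no nested markup).
        if self._prop is not None:
            self.item["properties"][self._prop] = "".join(self._text).strip()
            self._prop = None


parser = MicrodataParser()
parser.feed(PAGE)
print(parser.item)
```

The visible text of the page is unchanged; only attributes are added, which is what lets a crawler recover the typed item from ordinary HTML.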
Deal with schema.org
• Ignore it?
• Adopt it?
• Align existing library models with schema.org?
• Schema.org provides an extension mechanism for
Data Quality / Resource Sync
• The Web is not static
• Resources and their representations might change or
disappear over time
• Make sure that
• applications can synchronize resources and learn about
• go back in time
Use Web Data in Apps
• Aggregate Web resources into special collections
• DBpedia provides resource descriptions translated
into 90+ languages!!!
• Use URIs instead of labels for tagging
• Combine and mash up data
• Analyze data ...
Metadata is back
• Metadata was introduced in the 19th century to deal
with the information overload
• Cataloguing rules and workflows evolved over time
• The Web seemed to work pretty well without
metadata (information retrieval, natural language processing)
• Now we have strong indicators that structured
metadata on the Web will play an important role in
• Shouldn’t libraries / librarians be part of that?
• Coyle, K.: Library Data in a Modern Context. In: Understanding the
Semantic Web: Bibliographic Data and Metadata. Library Technology
Reports. January 2010
• http://blog.mediaspaces.info/ (Linked Data in Libraries State-of-the-Art)
Metadata Building Blocks
Title Author Genre
Title The Catcher in the Rye
Metadata Author Salinger, J.D.
Genre Fiction (Digital / Non-Digital)
Google Rich Snippet Types
• Businesses and organizations
flat namespace | XML namespaces
support HTML4, XHTML 1.1, and | support for XHTML 1.1
use latent HTML attributes | introduces new metadata attributes
vocabulary defined by one | open to any RDF-based vocabulary
303 See Other
<?xml version="1.0" encoding="utf-8"?>