RDFa From Theory to Practice
 


Talk given at Institutional Web Management Workshop 2010, University of Sheffield

  • Screenscraping, Google algorithms, but still not ideal
  • Context of openness – MPs' expenses etc.
  • The Central Office of Information (COI) is the Government's centre of excellence for marketing and communications.
  • Unlike the US data.gov site, which simply provides access to raw data (Excel spreadsheets, PDF files, and more), the UK is adhering closely to Berners-Lee’s Linked Data rules and making data available in formats such as RDF where feasible.
  • Next few slides are from demos at the data.gov.uk launch, 21st January 2010
  • Principles underpinning the technology
  • Step back a bit to HTML. The HTML web of documents doesn’t encourage re-use or reduce redundancy. There are network effects, but they could be much better.
  • Note this is a considerable simplification of the detail, in danger of misleading. Linked data exploits semantically meaningful tagging to encourage re-use, reduce redundancy etc.
  • Uses predicate logic. Goes back to Aristotle. Conceptualises things, and the relationships between things
  • SparqPlug (Coetzee, Heath and Motta, 2008) is a service that enables the extraction of Linked Data from legacy HTML documents on the Web that do not contain RDF data. The service operates by serialising the HTML DOM as RDF and allowing users to define SPARQL queries that transform elements of this into an RDF graph of their choice.
  • D2R - Using a declarative mapping language, the data publisher defines a mapping between the relational schema of the database and the target RDF vocabulary. Based on the mapping, D2R server publishes a Linked Data view over the database and allows clients to query the database via the SPARQL protocol.
  • Just as traditional Web browsers allow users to navigate between HTML pages by following hypertext links, Linked Data browsers allow users to navigate between data sources by following links expressed as RDF triples. Linked Data search engines provide keyword-based search services oriented towards human users, and follow a similar interaction paradigm to existing market leaders such as Google and Yahoo.
  • Dots indicate provenance. The colour of the dots doesn’t seem to be of significance.
  • SWSE and Falcons provide a more detailed interface to the user that exploits the underlying structure of the data. Both provide a summary of the entity the user selects from the results list, alongside additional structured data crawled from the Web and links to related entities. Falcons provides users with the option of searching for objects, concepts and documents, each of which leads to a slightly different presentation of results.
  • Sindice (Oren et al, 2008) and Watson () provide APIs through which Linked Data applications can discover RDF documents on the Web that reference a certain URI or contain certain keywords. The rationale for such services is that each new Linked Data application should not need to implement its own infrastructure for crawling and indexing all parts of the Web of Data of which it might wish to make use. Instead, applications can query these indexes to receive pointers to potentially relevant documents which can then be retrieved and processed by the application itself.
  • Was covered at the CETIS conference in 2009. I’d be interested to get any ideas on this.

Presentation Transcript

  • RDFa From Theory to Practice - Part 1 - A gentle introduction to Linked Data and the Semantic Web? Adrian Stevenson. 12th July 2010, Institutional Web Management Workshop 2010, University of Sheffield, UK
    • semantics is … devoted to the study of meaning … on the syntactic levels of words, phrases, sentences
    • http://en.wikipedia.org/wiki/Semantic
    • “The Semantic Web is a web of data, in some ways like a global database” [1]
    • “first step is putting data on the Web in a form that machines can naturally understand... This creates what I call a Semantic Web - a web of data that can be processed directly or indirectly by machines” [2]
    • 1. http://www.w3.org/DesignIssues/Semantic.html
    • 2. Tim Berners-Lee, Weaving the Web. Harper, San Francisco. 1999.
    • “The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web.”
    • “the Semantic Web is the goal or end result… Linked Data provides the means to reach that goal”
    • From ‘Linked Data: The Story So Far’ - Heath, Bizer and Berners-Lee 2009
  • The Web We’re Used To
    • Made by humans for humans
    • Primarily documents
    • Machines not very welcome
    • Data silos
  • Web of Linked Data
    • In 1998 the idea from Tim Berners-Lee of ‘linked data’ took shape
    • Designed for machines first
    • It primarily links data about ‘things’, not documents
    • … but it is for humans in the end
    • But haven’t we been putting data on the web for years?
      • In CSV, relational databases, XML etc?
    • Well yes, but these approaches are not so easy to integrate
    • Web 2.0 mashups work against a fixed set of data sources
    • Linked Data applications operate on top of an unbound, global data space.
  • So what’s happening now?
  •  
    • “Sir Tim Berners-Lee, the inventor of the world wide web, will help the British government to make its data more easily available online … I have asked Sir Tim Berners-Lee … to help us drive the opening up of access to Government data in the web” Prime Minister Gordon Brown, 10th June 2009
    • “What you find if you deal with people in government departments is that they hug their database, hold it really close.” Tim Berners-Lee, 10th June 2009
    • An Institute of Web Science proposed
    • Why?
      • Openness – MPs expenses, etc.
      • Saving money
  • http://www.guardian.co.uk/technology/2010/may/25/berners-lee-institute-web-science-statement
  • http://www.ecs.soton.ac.uk/about/news/3223
  • Data.gov.uk officially launched 21st January 2010
  • Data.gov.uk – search for ‘traffic’
  • Central Office of Information - http://coi.gov.uk/
  • BBC Music BETA http://www.bbc.co.uk/music/developers
    • Provides access to raw data (Excel spreadsheets, PDF files, and more)
    • UK is adhering more closely to Berners-Lee’s Linked Data rules
  • http://www.readwriteweb.com/archives/cnet_partners_with_thomson_reuters_on_linked_data.php
  • http://open.blogs.nytimes.com/2009/06/26/nyt-to-release-thesaurus-and-enter-linked-data-cloud/
  • Graphs house prices over time - combines house price data with information from Yahoo! Placemaker, Nestoria and OpenStreetMap
  • Postcode Paper - bus timetables, doctors surgeries, allotments http://blog.newspaperclub.co.uk/2009/10/16/data-gov-uk-newspaper/
  • Owls Near You - http://owlsnearyou.com/
  • 12 month project funded by JISC 2/10 jiscExpo call http://blogs.ukoln.ac.uk/locah/ http://www.twitter.com/projectlocah tag: #locah
  • http://richard.cyganiak.de/2007/10/lod/
  • A little bit of the techy stuff
  • Linked Data is …
    • A way of publishing data on the web that:
      • Encourages reuse
      • Reduces redundancy
      • Maximises inter-connectedness
      • Enables network effects
    • So how is this achieved?
  • Presentational tagging – HTML
    • <h1>Agilitas Physiotherapy Centre</h1> <p>Welcome to the Agilitas Physiotherapy Centre home page. Do you feel pain? Have you had an injury? Let our staff Lisa Davenport, our secretary Kelly Townsend, and Steve Matthews take care of your body and soul.</p> <h2>Consultation hours</h2> Mon 11am - 7pm<br/> Tue 11am - 7pm<br/> Wed 3pm - 7pm<br/> Thu 11am - 7pm<br/> Fri 11am - 3pm
    • <p>But note that we do not offer consultation during the weeks of the <a href=". . .">State Of Origin</a> games.</p>
  • Semantic tagging
    • <company>
    • <treatmentOffered>Physiotherapy</treatmentOffered>
    • <companyName>Agilitas Physiotherapy Centre</companyName>
    • <staff>
    • <therapist>Lisa Davenport</therapist> <therapist>Steve Matthews</therapist>
    • <secretary>Kelly Townsend</secretary>
    • </staff>
    • </company>
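To illustrate why semantic tags are easier for machines than presentational ones, here is a minimal sketch that reads the markup above with Python's standard XML parser. The helper name `staff_by_role` is made up for this example, and the element names are simply those on the slide, not a published vocabulary.

```python
import xml.etree.ElementTree as ET

# The slide's markup, reproduced as a single well-formed XML document.
COMPANY_XML = """
<company>
  <treatmentOffered>Physiotherapy</treatmentOffered>
  <companyName>Agilitas Physiotherapy Centre</companyName>
  <staff>
    <therapist>Lisa Davenport</therapist>
    <therapist>Steve Matthews</therapist>
    <secretary>Kelly Townsend</secretary>
  </staff>
</company>
"""


def staff_by_role(xml_text):
    """Return a dict mapping each role (element name under <staff>) to a list of names."""
    root = ET.fromstring(xml_text)
    roles = {}
    for person in root.find("staff"):
        roles.setdefault(person.tag, []).append(person.text)
    return roles


roles = staff_by_role(COMPANY_XML)
print(roles["therapist"])  # ['Lisa Davenport', 'Steve Matthews']
```

With the presentational HTML version, the same question ("who are the therapists?") would need screen scraping of free text.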
  • Tim BL’s Linked Data Design Issues
    • Use URIs as names for things
    • Use HTTP URIs so that people can look up those names.
    • When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
    • Include links to other URIs so that they can discover more things.
    • From http://www.w3.org/DesignIssues/LinkedData.html
  • URIs and HTTP
    • “A Uniform Resource Identifier (URI) provides a simple and extensible means for identifying a resource” – RFC 3986
      • A URL is a type of URI
      • HTTP URIs can be ‘de-referenced’
    • HTTP URIs are used for “real world” things
      • http://adrianstevenson.com/id/me
      • http://dbpedia.org/page/Tim_Berners-Lee
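As a rough sketch of what 'de-referencing' an HTTP URI can look like in code, the example below builds a GET request that asks for RDF via the Accept header. The function names are hypothetical, and whether a given server honours the `application/rdf+xml` media type is an assumption. The request is only constructed here; `dereference` would need network access.

```python
import urllib.request


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP GET request asking for an RDF representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def dereference(uri):
    """Send the request and return the response body as bytes (needs network access)."""
    with urllib.request.urlopen(build_rdf_request(uri), timeout=10) as response:
        return response.read()


# URI taken from the slide; only the request object is inspected, nothing is sent.
request = build_rdf_request("http://dbpedia.org/page/Tim_Berners-Lee")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```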
  • RDF
    • Resource Description Framework
      • “ a language for representing information about resources in the World Wide Web”
      • “ RDF can also be used to represent information about things that can be identified on the Web, even when they cannot be directly retrieved on the Web”
    • Describes relations based on triples
      • Subject-predicate-object
    • http://www.w3.org/TR/REC-rdf-syntax/
    • Heroes (Subject) has a creator (Predicate) whose name is David Bowie (Object)
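The "Heroes" triple can be sketched in code as a simple three-part record. This is only an illustration: the album URI is hypothetical, the predicate is assumed to be the Dublin Core `creator` term, and `to_ntriples` is a made-up helper that writes the triple as one N-Triples line.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI naming the relationship
    obj: str        # here a plain literal value


# The album URI is hypothetical; the predicate is the Dublin Core "creator" term.
heroes = Triple(
    subject="http://example.org/album/heroes",
    predicate="http://purl.org/dc/terms/creator",
    obj="David Bowie",
)


def to_ntriples(triple):
    """Serialise one triple with a literal object as an N-Triples line."""
    escaped = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{triple.subject}> <{triple.predicate}> "{escaped}" .'


print(to_ntriples(heroes))
```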
  • RDFa
    • ‘Resource Description Framework in attributes’
    • Adds attribute level extensions to XHTML
    • Enables embedding RDF triples within XHTML
    • Google and Yahoo process RDFa
  • RDFa example
    • <html xmlns:dc="http://purl.org/dc/terms/">
    • <head>
    • <title>RDFa: From Theory to Practice</title>
    • </head>
    • <body>
    • <h1>RDFa: From Theory to Practice</h1>
    • Author: <em property="dc:creator" content="Adrian Stevenson">Adrian Stevenson</em>
    • Created: <em property="dc:created" content="2010-07-12">July 12th, 2010</em>
    • License: <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">CC Attribution-ShareAlike</a>
    • </body>
    • </html>
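To show how the attributes in the example turn into triples, here is a deliberately tiny sketch built on Python's standard `html.parser`. It is not a conformant RDFa processor: it only handles `property` with `content`, and `rel` with `href`, and it assumes every statement is about the page itself. `PAGE_URI` and the class name are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page address, used as the subject of every triple (an assumption).
PAGE_URI = "http://example.org/rdfa-talk"
# Namespace used by RDFa in XHTML for reserved rel values such as "license".
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"

DOC = """<html xmlns:dc="http://purl.org/dc/terms/">
<head><title>RDFa: From Theory to Practice</title></head>
<body>
<h1>RDFa: From Theory to Practice</h1>
Author: <em property="dc:creator" content="Adrian Stevenson">Adrian Stevenson</em>
Created: <em property="dc:created" content="2010-07-12">July 12th, 2010</em>
License: <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">CC Attribution-ShareAlike</a>
</body>
</html>"""


class TinyRDFaExtractor(HTMLParser):
    """Collects (subject, predicate, object) tuples from a small subset of RDFa."""

    def __init__(self, subject):
        super().__init__()
        self.subject = subject
        self.prefixes = {}
        self.triples = []

    def expand(self, term):
        """Expand a prefixed name (dc:creator) or a bare reserved word (license)."""
        prefix, sep, local = term.partition(":")
        if not sep:
            return XHTML_VOCAB + term
        if prefix in self.prefixes:
            return self.prefixes[prefix] + local
        return term  # unknown prefix: leave the term untouched

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Record namespace prefixes declared with xmlns:prefix="...".
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        if "property" in attrs and "content" in attrs:
            self.triples.append(
                (self.subject, self.expand(attrs["property"]), attrs["content"]))
        if "rel" in attrs and "href" in attrs:
            self.triples.append(
                (self.subject, self.expand(attrs["rel"]), attrs["href"]))


extractor = TinyRDFaExtractor(PAGE_URI)
extractor.feed(DOC)
for triple in extractor.triples:
    print(triple)
```

The human-readable text between the tags is untouched; the machine-readable values travel in the attributes, which is the point of RDFa.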
  • Linked Data in Use
  • Publishing Linked Data
    • RDFizers – convert data formats into RDF
    • D2R Server – creates linked data from relational databases
    • SparqPlug – Extracts linked data from HTML
    • … many others
  • D2R server publishes Linked Data view of database and allows clients to query the database via SPARQL
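As a sketch of what "query the database via SPARQL" can mean for a client, the snippet below encodes a query as the `query` parameter of an HTTP GET URL, which is how the SPARQL protocol passes queries over GET. The endpoint address is hypothetical, and the query is a generic "first ten triples" example rather than anything tied to a particular database mapping.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint address; a real D2R Server deployment will have its own.
ENDPOINT = "http://localhost:2020/sparql"

QUERY = """SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10"""


def sparql_get_url(endpoint, query):
    """Encode a SPARQL query as the `query` parameter of a GET URL."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```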
  • Linked Data Applications
    • Linked Data Browsers – navigate between data sources
      • Disco
      • Tabulator
      • Marbles
    • Linked Data Search Engines
      • For humans – Falcons, SWSE
      • For apps – Swoogle, Sindice
    • Tracks provenance of data
    • Merges data about the same thing from different sources
    http://marbles.sourceforge.net/
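The two ideas on this slide, merging data about the same thing and tracking where each statement came from, can be sketched with plain Python structures. This is not how Marbles itself is implemented; the source URLs, the shortened predicate names and the triples are all made up for illustration.

```python
from collections import defaultdict

# Made-up triples from two hypothetical sources describing the same URI.
SOURCES = {
    "http://example.org/source-a": [
        ("http://example.org/id/heroes", "title", "Heroes"),
        ("http://example.org/id/heroes", "creator", "David Bowie"),
    ],
    "http://example.org/source-b": [
        ("http://example.org/id/heroes", "creator", "David Bowie"),
        ("http://example.org/id/heroes", "released", "1977"),
    ],
}


def merge_with_provenance(sources):
    """Merge triples from all sources, remembering which sources asserted each one."""
    merged = defaultdict(set)
    for source, triples in sources.items():
        for triple in triples:
            merged[triple].add(source)
    return merged


merged = merge_with_provenance(SOURCES)
for (subject, predicate, obj), seen_in in sorted(merged.items()):
    print(predicate, obj, sorted(seen_in))
```

Because both sources use the same URI for the album, their statements fall together without any matching logic, and the set of sources plays the role of the coloured provenance dots.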
    • User can explore the underlying data structures
    • Can search for objects, concepts or documents
    http://iws.seu.edu.cn/services/falcons/
    • Provides interface (API) that other linked data apps can use
    • Rationale: new linked data apps shouldn’t need to implement their own infrastructure for crawling and indexing web of data
    http://sindice.com/
  • http://sindice.com/search?q=jazz&qt=term
  • Some issues
    • To RDF or not to RDF
    • Usability
    • Sustainability
    • Provenance
    • Licensing
    • Reliability
  • Sustainability
    • Ed Summers at the Library of Congress created http://lcsh.info
    • Linked Data interface for LOC subject headings
    • People started using it
  • Library of Congress Subject Headings
  • Data Licensing
    • Uses Amazon Web Services but contravenes their terms and conditions
    http://www4.wiwiss.fu-berlin.de/bizer/bookmashup/
  • Provenance
    • OK if data ‘watermarked’
    • But can often be a problem
    • VoID (Vocabulary of Interlinked Datasets) can help
  • The Business Case
    • Can we convince IT Managers, VC etc. it’s worth it?
      • Realistic expectations
      • “..the people sort of in charge of the kind of data thing knew so little about their data structures”
      • “I’ve had a whole bunch of meetings to get one dataset, been fobbed off, and literally just never get anywhere” Tom Steinberg, Director of MySociety (from Nodalities issue 8)
  • The Business Case
    • What’s the payoff for O’Reilly, BBC etc. of using Linked Data?
    • Why didn’t it work the first time?
      • What’s different now?
      • Need to work out what Linked Data does that other things don’t
      • Prove a simple tangible benefit
  • Universities and Colleges in the Giant Global Graph
    • Session at CETIS Conference 2009
    • Case for Linked Data / Semantic Web discussed
    • Some cases:
      • Freedom of Information
      • Improves data quality
      • Joining the party
    http://wiki.cetis.ac.uk/Universities_and_Colleges_in_the_Giant_Global_Graph
  • http://wiki.cetis.ac.uk/Image:Conf2009_GGG_Group1B.jpg
  • Conclusion
    • Interesting developments and sense of momentum
    • Central Gov’t still seem committed
    • JISC is funding 10 Linked Data projects starting around July 2010
    • … but still much to do if the semantic web and linked data are to really take hold
  • Questions?
    • http://blogs.ukoln.ac.uk/adrianstevenson
    • http://www.twitter.com/adrianstevenson
  • CC Attribution
    • Some sections of this presentation adapted from:
      • An Introduction to Linked Data , by Tom Heath
      • The Semantic Web – An Introduction by Owen Stephens
      • Using Linked Data as a Learning Resource Recommendation System by Chris Clarke
    • This presentation is available under a Creative Commons Attribution-NonCommercial-ShareAlike licence