Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data

Over the past 4 years, the Semantic Web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into
a very promising candidate for addressing one of the biggest challenges
of computer science: the exploitation of the Web as a platform for data
and information integration. To translate this initial success into a
world-scale reality, a number of research challenges need to be
addressed: the performance gap between relational and RDF data
management has to be closed, coherence and quality of data published on
the Web have to be improved, provenance and trust on the Linked Data Web
must be established and generally the entrance barrier for data
publishers and users has to be lowered. This tutorial will discuss
approaches for tackling these challenges. As an example of a successful
Linked Data project we will present DBpedia, which leverages Wikipedia
by extracting structured information and by making this information
freely accessible on the Web. The tutorial will also outline some recent advances in DBpedia, such as the mappings Wiki, DBpedia Live as well as
the recently launched DBpedia benchmark.

Published in: Technology, Education
Notes
  • Initially, the Web consisted of many websites containing only unstructured, textual content. Web 2.0 extended this traditional Web with a few extremely large websites specializing in specific content types. Examples are Wikipedia for encyclopedic articles, Del.icio.us for Web links, Flickr for pictures, YouTube for videos, Digg for news, … These websites provide searching, querying, sharing and authoring specifically adapted to their content type.
  • Popular content types such as pictures, movies, calendars, encyclopedic articles, news, recipes etc. are already sufficiently well supported on the Web. However, there is a long tail of special-interest content (profiles of expertise, historic data and events, bio-medical knowledge, intra-corporate knowledge etc.) which has very little or no current support (for filtering, aggregation, searching, querying, collaborative editing) on the Web.
  • http://www.flickr.com/photos/residae/2560241604/#/
  • http://www.flickr.com/photos/alasis/3541341601/sizes/l/in/photostream/
  • Transcript

    • 1. DBpedia and the Emerging Web of Linked Data, Sören Auer
    • 2. Dr. Sören Auer
      • 2000 Mathematics and Computer Science studies in Hagen, Dresden and Yekaterinburg (Екатеринбург)
      • Managing director of adVIS GmbH – SME focused on Web-Application and Content Management technology
      • IT consultant for various companies (T-Mobile AG, RDL Corp., Science Computing AG)
      • 2006 doctorate in Information Systems / Computer Science at Universität Leipzig
      • 2006-2008 post-doctoral researcher at the DB Group at University of Pennsylvania (USA)
      • Head of AKSW research group – DBpedia, OntoWiki, LinkedGeoData, Triplify
      • Research interests: Information Systems, Database and Web Technologies, Semantic Web and Knowledge Engineering, Adaptive Methodologies, HCI, E-Science, Digital Libraries
      • Coordinator of the EU FP7 IP Project “LOD2 – Creating Knowledge out of Interlinked Data”
      • Work as expert for W3C, EU FP6/FP7/CIP, University City Keystone Innovation Zone, Swiss National Science Foundation
    • 11. Agenda
      The Vision & Big Picture
      Linked Data 101
      The Linked Data Life-cycle
    • 12. Why the Semantic Web won’t work (soon)
      1. Reasoning does not scale on the Web
      • IR / one-dimensional indexing scales (Google)
      • Next step: conjunctive querying (OWL-QL?, dynamic scale-out / clustering)
      • Web-scalable DL reasoning is out of sight (maybe fragments or fuzzy reasoning have some chance)
      2. If it scaled, it would not be affordable
      • “What is the only former Yugoslav republic in the European Union?”
      • 2880 POWER7 cores, 16 terabytes of memory, 4 terabytes of clustered storage (IBM Watson) still cannot answer this question
    • 16. Why do we need the Data Web?
      Problem: Try to search for these things on the current Web:
      • Apartments near German-Russian bilingual childcare in Berlin.
      • ERP service providers with offices in Vienna and London.
      • Researchers working on multimedia topics in Eastern Europe.
      Information is available on the Web, but opaque to current search.
      Solution: complement text on Web pages with structured linked open data and intelligently combine/integrate such structured information from different sources.
      [Diagram: a search engine consumes HTML and RDF published by Web servers that are backed by databases, e.g. berlin.de (has everything about childcare in Berlin) and Immobilienscout.de (knows all about real estate offers in Germany).]
    • 19. From the Document Web to the Semantic Data Web
      Semantic Web (Vision 1998, starting ???)
      Data Web (since 2006)
      • URI dereferenceability
      • Web Data integration
      • RDF serializations
      Social Web (since 2003)
      • Folksonomies/Tagging
      • Reputation, sharing
      • Groups, relationships
      Web (since 1992)
      • HTTP
      • HTML/CSS/JavaScript
    • Web 1.0, Web 2.0, Web 3.0
      Web 1.0: many Web sites containing unstructured, textual content
      Web 2.0: few large Web sites specialized on specific content types (video, pictures, encyclopedic articles)
      Web 3.0: many Web sites containing and semantically syndicating arbitrarily structured content
    • 27. The Long Tail of Information Domains
      The Long Tail by Chris Anderson (Wired, Oct. ’04) adapted to information domains
      [Figure: popularity curve over content types. Head, currently supported structured content types: pictures, recipes, news, video, calendar. Tail, not or insufficiently supported content types, to be covered by SemWeb-supported structured content: requirements engineering, talent management, special-interest communities, itinerary of King George, gene sequences.]
    • 28. Linked Data in a Nutshell
      1. Uses the RDF data model
      [Graph: SBC organizes SBBD2011; SBBD2011 starts 3.10.2011; SBBD2011 takesPlaceIn Florianopolis]
      2. Is serialised in triples:
      SBC organizes SBBD2011
      SBBD2011 starts “2011-10-03”^^xsd:date
      SBBD2011 takesPlaceIn Florianopolis
      3. Uses content negotiation
    • 29. The emerging Web of Data
      [Figure: life-cycle stages (create, interlink, fuse, classify, enrich, repair) annotated with tools and years between 2007 and 2009: SILK, DXX Engine, poolparty, SemMF, OntoWiki, Sigma, WiQA, ORE, Virtuoso, DL-Learner, MonetDB, Sindice.]
    • 30. Conceptual Level: Data Access and Integration
      Data Integration
      • Data Warehousing: based on extract, transform, load (ETL); Global-As-View (GAV)
      • Data Web: URIs as entity identifiers; HTTP as data access protocol; Local-As-View (LAV)
      • Enterprise Information Integration: sets of heterogeneous data sources appear as a single, homogeneous data source
      • Research: mediators, ontology-based, P2P, Web service-based
      Data Access
      • Object-relational mappings (ORM): NeXT’s EOF / WebObjects, ADO.NET Entity Framework, Hibernate
      • Procedural APIs
      • Query Languages
      • Linked Data: dereferenceable URIs, RDF serialization formats
      Data Models
      • RDBMS: organize data in relations, rows, cells (Oracle, DB2, MS-SQL)
      • Column-oriented DBMS: collocate column values rather than row values (Vertica, C-Store, MonetDB)
      • Triple/Quad Stores: RDF data model (Virtuoso, Oracle, Sesame)
      • Entity-attribute-value (EAV): HELP medical record system, TrialDB
      • Others: XML, hierarchical, tree, graph-oriented DBMS
    • Agenda
      The Vision & Big Picture
      Linked Data 101 (based on Michael Hausenblas’ slides)
      The Linked Data Life-cycle
    • 43. Orientation
    • 44. Linked Data 101
      Linked Data provides a standardised API for:
      Data and metadata discovery
      Data integration
      Distributed query
    • 45. Linked Data principles
      Use URIs to identify the “things” in your data
      Use http:// URIs so people (and machines) can look them up on the web
      When a URI is looked up, return a description of the thing (in RDF format)
      Include links to related things
      http://www.w3.org/DesignIssues/LinkedData.html
    • 46. Linked Data principles
      They are principles, not implementation advice
      Not humans or machines but humans and machines!
      Content negotiation (e.g. HTML and RDF/XML)
      HTML+ RDFa
      Metcalfe’s Law: http://en.wikipedia.org/wiki/Metcalfe%27s_law
    • 47. Linked Data example
    • 48. HTTP URIs
      A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. [RFC3986]
      Syntax
      URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
      Example
        foo://example.com:8042/over/there?name=ferret#nose
        \_/   \______________/\_________/ \_________/ \__/
         |           |            |            |        |
      scheme     authority       path        query   fragment
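The decomposition above can be reproduced with Python's standard library. This is a minimal sketch; the helper name `uri_components` is made up for illustration, and `urlsplit` calls the authority component `netloc`.

```python
from urllib.parse import urlsplit


def uri_components(uri):
    """Split a URI into the five generic components named in RFC 3986."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,  # urllib's name for the authority
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


components = uri_components("foo://example.com:8042/over/there?name=ferret#nose")
for name, value in components.items():
    print(f"{name:10} {value}")
```

The `?` and `#` delimiters are not part of the query and fragment values that are returned.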
    • 49. HTTP URIs
      URI references
      An RDF URI reference is a Unicode string that does not contain any control characters (#x00 - #x1F, #x7F - #x9F) and that would produce a valid URI character sequence representing an absolute URI when subjected to UTF-8 encoding along with %-escaping of non-US-ASCII octets.
      Qualified Names (QNames)
      XML’s way to allow namespaced elements/attributes: QName = Prefix ‘:’ LocalPart
      Compact URIs (CURIEs)
      Generic, abbreviated syntax for expressing URIs
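Prefixed names are expanded by concatenating the namespace URI bound to the prefix with the local part. The sketch below illustrates that idea only. The prefix table reuses namespaces that appear later in these slides, and the helper names `expand` and `compact` are hypothetical, not part of any specification.

```python
# Prefix bindings taken from the Turtle examples later in the slides.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dbp": "http://dbpedia.org/resource/",
    "dbpp": "http://dbpedia.org/property/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def expand(name, prefixes=PREFIXES):
    """Turn 'prefix:local' into a full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


def compact(uri, prefixes=PREFIXES):
    """Abbreviate a URI with the longest matching namespace, if any."""
    candidates = [(p, ns) for p, ns in prefixes.items() if uri.startswith(ns)]
    if not candidates:
        return uri  # no binding applies; leave the URI unchanged
    prefix, namespace = max(candidates, key=lambda item: len(item[1]))
    return f"{prefix}:{uri[len(namespace):]}"


print(expand("dbp:Leipzig"))
print(compact("http://www.w3.org/2003/01/geo/wgs84_pos#lat"))
```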
    • 50. HTTP
      The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems.
      It is a generic, stateless, protocol which can be used for many tasks beyond its use for hypertext, such as name servers and distributed object management systems, through extension of its request methods, error codes and headers. A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.
    • 51. HTTP
      HTTP messages consist of requests from client to server and responses from server to client
      Set of methods is predefined
      GET
      POST
      PUT
      DELETE
      HEAD
      (OPTIONS)
    • 52. HTTP
      Status codes
      Informational 1xx, provisional response, (100 Continue)
      Successful 2xx, request successfully received, understood, and accepted (201 Created)
      Redirection 3xx, further action needs to be taken by user agent to fulfill the request (301 Moved Permanently)
      Client Error 4xx, client erred (405 Method Not Allowed)
      Server Error 5xx, server encountered an unexpected condition (501 Not Implemented)
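The first digit of a status code determines its class. A small sketch using the reason phrases that ship with Python's `http` module (the `describe` helper and the `CLASSES` table are illustrative names):

```python
from http import HTTPStatus

# Status code classes, keyed by the first digit of the code.
CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return 'code phrase (class)' for a registered HTTP status code."""
    status_class = CLASSES[code // 100]
    phrase = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    return f"{code} {phrase} ({status_class})"


for code in (100, 201, 301, 405, 501):
    print(describe(code))
```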
    • 53. HTTP
      REQUEST
      GET /html/rfc2616 HTTP/1.1
      Host: tools.ietf.org
      User-Agent: Mozilla/5.0
      Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
      RESPONSE
      HTTP/1.x 200 OK
      Date: Thu, 05 Mar 2009 08:17:33 GMT
      Server: Apache/2.2.11
      Content-Location: rfc2616.html
      Last-Modified: Tue, 20 Jan 2009 09:16:04 GMT
      Content-Type: text/html; charset=UTF-8
    • 54. HTTP
      Content Negotiation (CN): selecting the representation for a given response when multiple representations are available
      Three types of CN: server-driven, agent-driven and transparent
      Example:
      curl -I -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Galway
      HTTP/1.1 303 See Other
      Content-Type: application/rdf+xml
      Location: http://dbpedia.org/data/Galway.rdf
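To make server-driven negotiation concrete, the sketch below picks one of the media types a server can produce by reading the q-values of an Accept header. It is a deliberate simplification and not how DBpedia implements it: it takes the highest q-value among all matching ranges instead of applying the precedence of the most specific range, and the function names are hypothetical.

```python
def parse_accept(header):
    """Parse an Accept header into (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((fields[0], q))
    return ranges


def matches(media_range, media_type):
    """True if a range such as 'text/*' or '*/*' covers the media type."""
    if media_range == "*/*":
        return True
    range_type, _, range_sub = media_range.partition("/")
    type_, _, sub = media_type.partition("/")
    return range_type == type_ and range_sub in ("*", sub)


def negotiate(accept_header, available):
    """Return the available media type with the highest q, or None."""
    ranges = parse_accept(accept_header)
    best, best_q = None, 0.0
    for media_type in available:
        q = max((q for r, q in ranges if matches(r, media_type)), default=0.0)
        if q > best_q:
            best, best_q = media_type, q
    return best


OFFERED = ["text/html", "application/rdf+xml"]
print(negotiate("application/rdf+xml", OFFERED))
print(negotiate("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", OFFERED))
```

With the Accept header from the curl example the RDF/XML representation is chosen, while the browser-style header from the earlier request selects HTML.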
    • 55. HTTP
      Caching (see the Cache-Control header field) is essential for scalability: http://webofdata.wordpress.com/2009/11/23/linked-open-data-http-caching/
      HTTPbis IETF WG chaired by Mark Nottingham, mainly about: patches, clarifications, deprecate non-used features, documentation of security properties
    • 56. REST - HTTP
      Representational State Transfer (REST)
      Data element: modern Web examples
      • resource: the intended conceptual target of a hypertext reference
      • resource identifier: URL, URN
      • representation: HTML document, JPEG image
      • representation metadata: media type, last-modified time
      • resource metadata: source link, alternates, vary
      • control data: if-modified-since, cache-control
      http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
      http://webofdata.wordpress.com/2009/10/09/linked-data-for-restafarians/
    • 57. Web's Standard Retrieval Algorithm
      parse URI and find HTTP protocol
      look up DNS name to determine the associated IP address
      open a TCP stream to port 80 at the IP address determined above
      format an HTTP GET request for the resource and send it to the server
      read response from the server
      from the status code (200) determine that a representation of the resource is available
      inspect the returned Content-Type
      pass the entity-body to its HTML rendering engine
      http://www.w3.org/2001/tag/doc/selfDescribingDocuments
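The first steps of this algorithm can be sketched without touching the network: parse the URI, work out host and port, and format the GET request. The function name `build_get_request` and the `Connection: close` header are choices made for this illustration. The DNS lookup, the TCP connection and the reading of the response are left out.

```python
from urllib.parse import urlsplit


def build_get_request(uri):
    """Return (host, port, request_text) for a plain http:// URI."""
    parts = urlsplit(uri)
    if parts.scheme != "http":
        raise ValueError("only the http scheme is handled in this sketch")
    host = parts.hostname
    port = parts.port or 80  # default HTTP port when none is given
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    request = (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request


host, port, request = build_get_request("http://dbpedia.org/resource/Galway")
print(host, port)
print(request)
```

A client would then resolve `host`, open a TCP stream to `port`, send `request`, and inspect the status code and Content-Type of the response.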
    • 58. RDF
      A data model - directed, labeled graph
      Triple: (subject predicate object)
      subject … URIref or bNode
      predicate … URIref
      object … URIref or bNode or literal
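The position rules for a triple can be written down as a small type check. This is an illustrative sketch rather than an RDF library: the classes `URIRef`, `BNode` and `Literal` and the function `make_triple` are defined here only for the example, and the `http://example.org/` property URI is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


def make_triple(subject, predicate, obj):
    """Build a (subject, predicate, object) tuple, enforcing the position rules."""
    if not isinstance(subject, (URIRef, BNode)):
        raise TypeError("subject must be a URIref or a bNode")
    if not isinstance(predicate, URIRef):
        raise TypeError("predicate must be a URIref")
    if not isinstance(obj, (URIRef, BNode, Literal)):
        raise TypeError("object must be a URIref, a bNode or a literal")
    return (subject, predicate, obj)


triple = make_triple(
    URIRef("http://dbpedia.org/resource/Burkhard_Jung"),
    URIRef("http://example.org/isMayorOf"),  # placeholder property URI
    URIRef("http://dbpedia.org/resource/Leipzig"),
)
print(triple)
```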
    • 59. RDF Triple
      • Inspired by linguistic categories
      • Allowed usage:
        Subject: URI or blank node
        Predicate: URI (predicates are also called properties)
        Object: URI, blank node or literal
      [Diagram: Burkhard Jung (subject) isMayorOf (predicate) Leipzig (object)]
    • 61. Example RDF Graph
      Leipzig hasAreaCode 0341
      Leipzig longitude 12.3833
      Leipzig latitude 51.3333
      Leipzig locatedIn Saxony
      Leipzig hasMayor Burkhard Jung
      Burkhard Jung isMayorOf Leipzig
      Burkhard Jung born 1958-03-07
      Burkhard Jung isMemberOf Social Democratic Party
      Saxony locatedIn Germany
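A graph like this one is just a set of triples, and simple questions become pattern matches over that set. In the sketch below the edge directions are an assumption read off the node and edge labels of the slide, plain strings stand in for URIs, and `match` is a hypothetical helper.

```python
# The example graph as a set of (subject, predicate, object) tuples.
GRAPH = {
    ("Leipzig", "hasAreaCode", "0341"),
    ("Leipzig", "longitude", "12.3833"),
    ("Leipzig", "latitude", "51.3333"),
    ("Leipzig", "locatedIn", "Saxony"),
    ("Leipzig", "hasMayor", "Burkhard Jung"),
    ("Burkhard Jung", "isMayorOf", "Leipzig"),
    ("Burkhard Jung", "born", "1958-03-07"),
    ("Burkhard Jung", "isMemberOf", "Social Democratic Party"),
    ("Saxony", "locatedIn", "Germany"),
}


def match(graph, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


print(match(GRAPH, s="Leipzig", p="locatedIn"))
print(match(GRAPH, p="locatedIn"))
```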
    • 62. Literals
      • Representation of data values
      • Serialization as strings
      • Interpretation based on the datatype
      • Literals without datatype are treated as strings
      [Diagram: Leipzig has the literal values longitude 12.3833 and latitude 51.3333; Leipzig hasMayor Burkhard Jung, who isMayorOf Leipzig and was born 1958-03-07.]
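The rule "interpretation based on the datatype" can be sketched as a lookup from datatype URI to a conversion function. The converter table below covers three XML Schema datatypes only, and the name `literal_value` is made up for the example.

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"

# Conversion functions for a few XML Schema datatypes (far from complete).
CONVERTERS = {
    XSD + "float": float,
    XSD + "integer": int,
    XSD + "date": date.fromisoformat,
}


def literal_value(lexical, datatype=None):
    """Interpret a literal's string form according to its datatype."""
    convert = CONVERTERS.get(datatype)
    # Literals without a (known) datatype stay plain strings.
    return convert(lexical) if convert else lexical


print(literal_value("51.3333", XSD + "float"))
print(literal_value("1958-03-07", XSD + "date"))
print(literal_value("0341"))
```

The area code shows why the datatype matters: without one, "0341" stays a string and keeps its leading zero.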
    • 66. RDF Serialization
      N3: "Notation 3" - extensive formalism
      N-Triples: subset of N3
      Turtle: extension of N-Triples (shortcuts)
      Source: http://www.w3.org/DesignIssues/Notation3.html
    • 67. Turtle Syntax
      • URIs in angle brackets
      • Literals in quotes
      • Triples separated by dot
      • Whitespace is ignored
    • 71. Turtle Syntax: Shortcuts
      <http://dbpedia.org/resource/Leipzig> <http://dbpedia.org/property/hasMayor> <http://dbpedia.org/resource/Burkhard_Jung> ;
        <http://www.w3.org/2000/01/rdf-schema#label> "Leipzig"@de ;
        <http://www.w3.org/2003/01/geo/wgs84_pos#lat> "51.333332"^^<http://www.w3.org/2001/XMLSchema#float> ;
        <http://www.w3.org/2003/01/geo/wgs84_pos#lon> "12.383333"^^<http://www.w3.org/2001/XMLSchema#float> .
      Shortcuts for namespace prefixes:
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix dbp: <http://dbpedia.org/resource/> .
      @prefix dbpp: <http://dbpedia.org/property/> .
      @prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
      @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
      dbp:Leipzig dbpp:hasMayor dbp:Burkhard_Jung .
      dbp:Leipzig rdfs:label "Leipzig"@de .
      dbp:Leipzig geo:lat "51.333332"^^xsd:float .
      dbp:Leipzig geo:lon "12.383333"^^xsd:float .
    • 72. Turtle Syntax: Shortcuts
      Group triples with the same subject using “;” instead of “.”:
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix dbp: <http://dbpedia.org/resource/> .
      @prefix dbpp: <http://dbpedia.org/property/> .
      @prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
      @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
      dbp:Leipzig dbpp:hasMayor dbp:Burkhard_Jung ;
        rdfs:label "Leipzig"@de ;
        geo:lat "51.333332"^^xsd:float ;
        geo:lon "12.383333"^^xsd:float .
      Likewise, group triples with the same subject and predicate using “,”:
      @prefix dbp: <http://dbpedia.org/resource/> .
      @prefix dbpp: <http://dbpedia.org/property/> .
      dbp:Leipzig dbpp:locatedIn dbp:Saxony, dbp:Germany ;
        dbpp:hasMayor dbp:Burkhard_Jung .
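Both shortcuts amount to grouping: first by subject (";"), then by predicate (","). A minimal sketch of a writer that applies them is shown below. It assumes the terms are already formatted as Turtle tokens (prefixed names or quoted literals) and that `dbpp:locatedIn` is the intended property; `to_turtle` is a hypothetical helper, not a full Turtle serializer.

```python
def to_turtle(triples):
    """Serialize triples, grouping by subject (';') and predicate (',')."""
    grouped = {}
    for s, p, o in triples:
        # dicts keep insertion order, so output follows the input order
        grouped.setdefault(s, {}).setdefault(p, []).append(o)
    blocks = []
    for s, predicates in grouped.items():
        lines = [f"{p} {', '.join(objects)}" for p, objects in predicates.items()]
        blocks.append(f"{s} " + " ;\n    ".join(lines) + " .")
    return "\n".join(blocks)


TRIPLES = [
    ("dbp:Leipzig", "dbpp:locatedIn", "dbp:Saxony"),
    ("dbp:Leipzig", "dbpp:locatedIn", "dbp:Germany"),
    ("dbp:Leipzig", "dbpp:hasMayor", "dbp:Burkhard_Jung"),
]
print(to_turtle(TRIPLES))
```

The `@prefix` declarations are not emitted here and would have to precede the output to make it a complete Turtle document.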
    • 73. XML Syntax of RDF
      • Turtle is intuitively readable and machine-processable
      • but: there is better tool support and there are more programming libraries for XML
      <?xml version="1.0"?>
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
               xmlns:dbpp="http://dbpedia.org/property/"
               xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
        <rdf:Description rdf:about="http://dbpedia.org/resource/Leipzig">
          <dbpp:hasMayor
              rdf:resource="http://dbpedia.org/resource/Burkhard_Jung" />
          <rdfs:label xml:lang="de">Leipzig</rdfs:label>
          <geo:lat rdf:datatype="http://www.w3.org/2001/XMLSchema#float">51.3333</geo:lat>
          <geo:lon rdf:datatype="http://www.w3.org/2001/XMLSchema#float">12.3833</geo:lon>
        </rdf:Description>
      </rdf:RDF>
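Because RDF/XML is ordinary XML, a generic XML library is enough to pull the triples out of a simple document like this one. The sketch below only handles flat `rdf:Description` elements with `rdf:about`, which is a small fraction of RDF/XML; `triples_from_rdfxml` and the shortened `SAMPLE` document are illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dbpp="http://dbpedia.org/property/">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Leipzig">
    <dbpp:hasMayor rdf:resource="http://dbpedia.org/resource/Burkhard_Jung"/>
    <rdfs:label xml:lang="de">Leipzig</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def triples_from_rdfxml(text):
    """Extract (subject, predicate, object) strings from flat RDF/XML."""
    root = ET.fromstring(text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like '{namespace}local'
            namespace, _, local = prop.tag[1:].partition("}")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "")
            triples.append((subject, namespace + local, obj))
    return triples


for triple in triples_from_rdfxml(SAMPLE):
    print(triple)
```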
    • 75. RDF/JSON
      • JSON = JavaScript Object Notation
      • Compact format for data exchange between applications
      • JSON documents are valid JavaScript
      • Programming-language independent, since parsers exist for all popular programming languages
      • Less overhead when parsing and serialising than XML
      { "S" : { "P" : [ O ] } }
      • Subject: URI, BNode
      • Predicate: URI
      • Object:
        Type: "uri", "literal" or "bnode"
        Value: data value
        Lang: language tag
        Datatype: URI of the datatype
    • 82. JSON Example
      {
      "http://dbpedia.org/resource/Leipzig" : {
      "http://dbpedia.org/property/hasMayor":
      [ { "type":"uri", "value":"http://dbpedia.org/resource/Burkhard_Jung" } ],
      "http://www.w3.org/2000/01/rdf-schema#label":
      [ { "type":"literal", "value":"Leipzig", "lang":"en" } ] ,
      "http://www.w3.org/2003/01/geo/wgs84_pos#lat":
      [ { "type":"literal", "value":"51.3333",
      "datatype":"http://www.w3.org/2001/XMLSchema#float" } ] ,
      "http://www.w3.org/2003/01/geo/wgs84_pos#lon":
      [ { "type":"literal", "value":"12.3833",
      "datatype":"http://www.w3.org/2001/XMLSchema#float" } ]
      }
      }
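      As a rough illustration of the { "S" : { "P" : [ O ] } } layout, the sketch below walks an RDF/JSON document with Python's standard json module and emits N-Triples-style lines. The function names term_to_nt and rdf_json_to_ntriples are made up for this example, and escaping of special characters inside literals is deliberately left out.

```python
import json

def term_to_nt(obj):
    """Render one RDF/JSON object description as an N-Triples term."""
    kind, value = obj["type"], obj["value"]
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":
        return value  # blank node labels already look like _:b0
    # Literal: optional language tag or datatype (no escaping in this sketch)
    if "lang" in obj:
        return f'"{value}"@{obj["lang"]}'
    if "datatype" in obj:
        return f'"{value}"^^<{obj["datatype"]}>'
    return f'"{value}"'

def rdf_json_to_ntriples(doc):
    """Flatten {S: {P: [O, ...]}} into a list of N-Triples lines."""
    lines = []
    for subject, predicates in doc.items():
        s = subject if subject.startswith("_:") else f"<{subject}>"
        for predicate, objects in predicates.items():
            for obj in objects:
                lines.append(f"{s} <{predicate}> {term_to_nt(obj)} .")
    return lines

example = json.loads("""
{
  "http://dbpedia.org/resource/Leipzig": {
    "http://dbpedia.org/property/hasMayor":
      [ { "type": "uri", "value": "http://dbpedia.org/resource/Burkhard_Jung" } ],
    "http://www.w3.org/2003/01/geo/wgs84_pos#lat":
      [ { "type": "literal", "value": "51.3333",
          "datatype": "http://www.w3.org/2001/XMLSchema#float" } ]
  }
}
""")

for line in rdf_json_to_ntriples(example):
    print(line)
```

      Because every object carries its own "type" key, a consumer can tell URIs, blank nodes and literals apart without any schema knowledge.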
    • 83. RDFa Syntax
      • RDFa = Resource Description Framework in attributes
      • Embedding RDF in XHTML
      • UTF-8 and UTF-16, since it extends XML-based XHTML
      • More overhead than other serialisations, due to the embedding in HTML
      • Less readable
      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
      "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
      <html version="XHTML+RDFa 1.0" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
      xmlns:dbpp="http://dbpedia.org/property/"
      xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
      <head><title>Leipzig</title></head>
      <body about="http://dbpedia.org/resource/Leipzig">
      <h1 property="rdfs:label" xml:lang="de">Leipzig</h1>
      <p>Leipzig is a city in Germany. Leipzig's mayor is
      <a href="http://dbpedia.org/resource/Burkhard_Jung" rel="dbpp:hasMayor">Burkhard Jung</a>. It is located
      at latitude <span property="geo:lat" datatype="xsd:float">51.3333</span>
      and longitude <span property="geo:lon" datatype="xsd:float">12.3833</span>.</p>
      </body>
      </html>
    • 88. Vocabularies
      Schema layer of RDF
      Defines terms (classes and properties)
      Typically RDFS or OWL family
      Common vocabularies:
      Dublin Core, SKOS
      FOAF, SIOC, vCard
      DOAP
      Core Organization Ontology
      VoID
      http://www.slideshare.net/prototypo/introduction-to-linked-data-rdf-vocabularies
    • 89. Vocabularies: Friend-of-a-Friend (FOAF)
      Defines classes and properties for representing information about people and their relationships
      Soeren rdf:type foaf:Person .
      Soeren foaf:currentProject http://OntoWiki.net .
      Soeren foaf:homepage http://aksw.org/Soeren .
      Soeren foaf:knows http://sembase.at/Tassilo .
      Soeren foaf:sha1 09ac456515dee .
    • 90. Vocabularies: Semantically-Interlinked Online Communities (SIOC)
      Represents content from blogs, wikis, forums, mailing lists, chats etc.
    • 91. Vocabularies: Simple Knowledge Organization System (SKOS)
      Supports the use of thesauri, classification schemes, subject heading systems and taxonomies
    • 92. Instance data
      Instances are associated with one or several classes:
      Boddingtons rdf:type Ale .
      Grafentrunk rdf:type Bock .
      Hoegaarden rdf:type White .
      Jever rdf:type Pilsner .
    • 93. The Linked Open Data cloud
    • 94. Linked Open Data cloud
      [Figure: Linked Open Data cloud diagrams labelled 2007, 2008, 2009 and 2010]
    • 95. Linked Open Data cloud
      Media
      User-generated
      Government
      Publications
      Cross-domain
      Geo
      Life sciences
      http://lod-cloud.net/
    • 96. LOD cloud stats
      triples distribution
      links distribution
      http://lod-cloud.net/state/
    • 97. TimBL’s 5-star plan for open data
      ★ Make your data available on the Web under an open license
      ★★ Make it available as structured data (Excel sheet instead of image scan of a table)
      ★★★ Use a non-proprietary format (CSV file instead of an Excel sheet)
      ★★★★ Use Linked Data formats (URIs to identify things, RDF to represent data)
      ★★★★★ Link your data to other people’s data to provide context
      More: http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
    • 98. Why go for the 5th star?
      Central Contractor Registration (CCR)
      Geonames
      http://webofdata.wordpress.com/2011/05/22/why-we-link/
    • 99. Effort distribution
      Fix Overall Data Integration Effort
    • 100. Datasets
      A dataset is a set of RDF triples that are published,
      maintained or aggregated by a single provider
    • 101. Linksets
      An RDF link is an RDF triple whose subject and object are described in different datasets
      A linkset is a collection of such RDF links between two datasets
    • 102. Describing Datasets - VoID
      General dataset metadata
      Access metadata
      Structural metadata
      Describing linksets
      Deployment and discovery of voiD files
      http://www.w3.org/TR/void/
    • 103. General dataset metadata
      Dataset homepage
      Publisher
      Title and description
      Categorisation
      Licensing
      Technical features
    • 104. General dataset metadata
      :DBpedia a void:Dataset ;
      dcterms:title "DBpedia" ;
      dcterms:description "RDF data extracted from Wikipedia" ;
      dcterms:contributor :FU_Berlin ;
      dcterms:contributor :Uni_Leipzig ;
      dcterms:contributor :Openlink ;
      dcterms:source <http://dbpedia.org/resource/Wikipedia> ;
      void:feature <http://www.w3.org/ns/formats/RDF_XML> ;
      dcterms:modified "2008-11-17"^^xsd:date .
      :Geonames a void:Dataset ;
      dcterms:subject <http://dbpedia.org/resource/Location> .
      :GeoSpecies a void:Dataset ;
      dcterms:license <http://creativecommons.org/licenses/by-sa/3.0/us/> .
    • 105. Access metadata
      SPARQL endpoints
      RDF data dumps
      Root resources
      URI lookup endpoints
      OpenSearch description documents
    • 106. Access metadata
      :exampleDS a void:Dataset ;
      void:sparqlEndpoint <http://example.org/sparql> ;
      void:dataDump <http://example.org/dump1.rdf> ;
      void:uriLookupEndpoint <http://api.example.org/search?qt=term> .
    • 107. Structural metadata
      Provides high-level information about the schema and internal structure of a dataset and can be helpful when exploring or querying datasets:
      Example resources
      Patterns for resource URIs
      Vocabularies
      Dataset partitions
      Statistics
    • 108. Structural metadata
      :DBpedia a void:Dataset;
      void:exampleResource <http://dbpedia.org/resource/Berlin> .
      :LiveJournal a void:Dataset;
      void:vocabulary <http://xmlns.com/foaf/0.1/> .
      :DBpedia a void:Dataset;
      void:classPartition [
      void:class foaf:Person;
      void:entities 312000;
      ];
      void:propertyPartition [
      void:property foaf:name;
      void:triples 312000;
      ];
      .
    • 109. Describing linksets
    • 110. Describing linksets
      :DBpedia a void:Dataset ;
      void:subset :DBpedia2Geonames .
      :Geonames a void:Dataset .
      :DBpedia2Geonames a void:Linkset ;
      void:target :DBpedia ;
      void:target :Geonames ;
      void:linkPredicate owl:sameAs .
    • 111. Deployment and discovery
      Choosing URIs for datasets
      Publishing a VoID file alongside a dataset
      Turtle
      RDFa
      SPARQL Service Description Vocabulary: http://www.w3.org/TR/sparql11-service-description/
      Discovery via a well-known URI (based on RFC 5785, registered with IANA): http://www.example.com/.well-known/void
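      A minimal sketch of the discovery step: given any URL on a host, derive the well-known location where a VoID description would be looked up. The helper name well_known_void_url is hypothetical; only the /.well-known/void path comes from the slide.

```python
from urllib.parse import urlsplit, urlunsplit

def well_known_void_url(any_url_on_host):
    """Return the /.well-known/void URL for the host serving the given URL."""
    parts = urlsplit(any_url_on_host)
    # Keep scheme and host, replace the path, drop query string and fragment
    return urlunsplit((parts.scheme, parts.netloc, "/.well-known/void", "", ""))

print(well_known_void_url("http://www.example.com/datasets/cities?page=2"))
```

      A client would then issue an HTTP GET on the resulting URL and parse the returned VoID description, if the publisher provides one.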
    • 112. Consumption - Essentials
      Linked Data provides for a global data-space with a uniform API (due to RDF as the data model)
      Access methods
      Dereference URIs via HTTP GET (RDF/XML, RDFa, etc.)
      SPARQL (‘the SQL of RDF’)
      Data dumps (RDF/XML, etc.)
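      Dereferencing a URI usually relies on HTTP content negotiation: the client states in the Accept header which RDF serialisation it wants. The sketch below only builds such a request with Python's standard library and does not send it; the helper name rdf_request and the default media type are choices made for this example.

```python
from urllib.request import Request

def rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP GET request asking for an RDF serialisation."""
    return Request(uri, headers={"Accept": media_type}, method="GET")

req = rdf_request("http://dbpedia.org/resource/Leipzig")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# Sending it would look like this (requires network access):
# from urllib.request import urlopen
# body = urlopen(req).read()
```

      Passing "text/turtle" as media_type asks for Turtle instead, provided the server offers that serialisation.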
    • 113. Consumption - Technologies
      Linked Data access mechanisms widely supported
      all major platforms and languages (HTTP interface & RDF parsing), such as Java, Python, PHP, C/C++/.NET, etc.
      Command line tools (curl, rapper, etc.)
      Online tools
      http://redbot.org/ (HTTP/low-level)
      http://sindice.com/developers/inspector (RDF/data-level)
      Structured query: SPARQL (more later)
    • 114. Consumption - Technologies
      Distributed setup → need for a central point of access (indexer, aggregator)
      Sindice, an index of the Web of Data
      http://sindice.com/
      Sig.ma, Web of Data aggregator & browser
      http://sig.ma/
      Relationship discovery
      http://relfinder.semanticweb.org/
    • 115. Technologies – FYN
      http://dbpedia.org/resource/Galway
    • 116. Technologies – Sig.ma
      Sig.ma is a Web of Data platform enabling entity visualisation and consolidation both for humans and machines (API)
      http://sig.ma/search?q=Galway
    • 117. Technologies – sameas.org
      Sameas.org is a service to find co-references on the Web of Data
      http://sameas.org/html?q=Galway
    • 118. Linked Data Benefits: Uniformity
      • All Linked Data datasets share a uniform data model, the RDF statement data model
      • Information is represented in facts expressed as (subject, predicate, object) triples
      • Components: globally unique IRI/URI entity identifiers & typed data values (literals) as objects
    • 121. Linked Data Benefits: Dereferenceability
      • URIs are not just used for identifying entities, but also (as URLs) for locating and retrieving resources that describe these entities on the Web
    • 122. Linked Data Benefits: Coherence
      Berlin
      Germany
      isCapitalOf
      Knowledgebase 2
      isMemberOf
      Knowledgebase 1
      European Union
      • Triples containing URIs from different namespaces as subject and object establish a link between (the entity identified by) the subject and (the entity identified by) the object (typed RDF links)
      Linked Data Benefits: Integrability
      • The RDF data model is based on a single mechanism for representing information (triples), so a syntactic and simple semantic integration of different Linked Data sets is very easy to attain
      • Higher-level semantic integration can be achieved by employing schema and instance matching techniques and expressing found matches again as additional triple facts
    • 124. Linked Data Benefits: Timeliness
      • Publishing and updating Linked Data is relatively simple, which facilitates timely availability
      • Once a Linked Data source is updated, it is straightforward to access and use the updated data source (time-consuming and error-prone extraction, transformation and loading is not required)
    • 126. The Vision & Big Picture
      Linked Data 101
      The Linked Data Life-cycle
      Agenda
    • 127. What works now? What has to be done?
      • Web: a global, distributed platform for data, information and knowledge integration
      • Exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF
      Achievements
      Extension of the Web with a data commons (25B facts)
      Vibrant, global RTD community
      Industrial uptake begins (e.g. BBC, Thomson Reuters, Eli Lilly)
      Emerging governmental adoption in sight
      Establishing Linked Data as a deployment path for the Semantic Web.
      • Challenges
      Coherence: Relatively few, expensively maintained links
      Quality: partly low quality data and inconsistencies
      Performance: Still substantial penalties compared to relational
      Data consumption: large-scale processing, schema mapping and data fusion still in their infancy
      Usability: Establishing direct end-user tools and network effect
      [Figure: snapshots dated July 2007, April 2008, September 2008 and July 2009]
    • 129. Linked Data Lifecycle
    • 130. Extraction
    • 131. From unstructured sources
      • NLP, text mining, annotation
      From semi-structured sources
      • DBpedia, LinkedGeoData, SCOVO/DataCube
      From structured sources
      • RDB2RDF
      Extraction
    • 132. Transforming Wikipedia into a Knowledge Base
      Extract structured information from Wikipedia & make this information available on the Web as LOD:
      • ask sophisticated queries against Wikipedia (e.g. universities in Brandenburg, mayors of elevated towns, soccer players),
      • link other data sets on the Web to Wikipedia data
      • represents a community consensus
      Recently launched DBpedia Live transforms Wikipedia into a structured knowledge base
    • 135. Structure in Wikipedia
      Title
      Abstract
      Infoboxes
      Geo-coordinates
      Categories
      Images
      Links
      other language versions
      other Wikipedia pages
      To the Web
      Redirects
      Disambiguations
    • 136. Infobox templates
      Wikitext-Syntax
      {{Infobox Korean settlement
      | title = Busan Metropolitan City
      | img = Busan.jpg
      | imgcaption = A view of the [[Geumjeong]] district in Busan
      | hangul = 부산 광역시
      ...
      | area_km2 = 763.46
      | pop = 3635389
      | popyear = 2006
      | mayor = Hur Nam-sik
      | divs = 15 wards (Gu), 1 county (Gun)
      | region = [[Yeongnam]]
      | dialect = [[Gyeongsang]]
      }}
      http://dbpedia.org/resource/Busan
      dbp:Busan dbpp:title "Busan Metropolitan City" .
      dbp:Busan dbpp:hangul "부산 광역시"@ko .
      dbp:Busan dbpp:area_km2 "763.46"^^xsd:float .
      dbp:Busan dbpp:pop "3635389"^^xsd:int .
      dbp:Busan dbpp:region dbp:Yeongnam .
      dbp:Busan dbpp:dialect dbp:Gyeongsang .
      ...
      RDF representation
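      The step from infobox wikitext to triples can be sketched as follows. This is a deliberately simplified illustration, not the DBpedia extraction framework: the function name infobox_to_triples and the three value rules (a single wiki link becomes a resource, integers and decimals become typed literals, everything else a plain literal) are assumptions made for this example.

```python
import re

def infobox_to_triples(page_title, wikitext):
    """Turn '| key = value' infobox lines into (subject, predicate, object) triples."""
    subject = f"dbp:{page_title}"
    triples = []
    for line in wikitext.splitlines():
        match = re.match(r"\s*\|\s*(\w+)\s*=\s*(.+?)\s*$", line)
        if not match:
            continue  # template header, closing braces, elided lines
        key, value = match.groups()
        link = re.fullmatch(r"\[\[([^\]|]+)\]\]", value)
        if link:
            # A value consisting of exactly one wiki link becomes a resource
            obj = "dbp:" + link.group(1).replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):
            obj = f'"{value}"^^xsd:float'
        else:
            obj = f'"{value}"'
        triples.append((subject, f"dbpp:{key}", obj))
    return triples

sample = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
}}"""

for s, p, o in infobox_to_triples("Busan", sample):
    print(s, p, o, ".")
```

      Values that mix text and links, units, or lists need dedicated parsing, which this sketch leaves out.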
    • 137. A vast multi-lingual, multi-domain knowledge base
      DBpedia extraction results in:
      descriptions of ca. 3.4 million things (1.5 million classified in a consistent ontology, including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases)
      labels and abstracts for these 3.2 million things in up to 92 different languages; 1,460,000 links to images and 5,543,000 links to external web pages; 4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories
      altogether over 1 billion pieces of information (i.e. RDF triples): 257M from English edition, 766M from other language editions
      DBpedia Live (http://live.dbpedia.org/sparql/) & Mappings Wiki (http://mappings.dbpedia.org) integrate the community into a refinement cycle
      Upcoming: DBpedia inline
    • 138. 2011/05/12
      CONSEGI - Sören Auer: DBpedia
      DBpedia Architecture
      [Figure: DBpedia architecture diagram. Labelled components: Wikipedia dumps and database; live Wikipedia updates via OAI-PMH (update stream, article queue); page collections; Extraction Manager and extraction jobs; extractors (Label, Image, Category, Redirect, Disambiguation, Pagelink, Geo, Abstract, Generic Infobox, Mapping-based Infobox); parsers (DateTime, Units, Geo, String-List, Numbers) and ontology mappings; destinations (N-Triple serializer and N-Triple dumps, SPARQL-Update destination); Virtuoso triple store exposing Linked Data and a SPARQL endpoint to the Web (SPARQL clients, HTML browser, RDF browser, DBpedia apps)]
    • 139. Hierarchies
      DBpedia Ontology Schema:
      manually created for DBpedia (infoboxes)
      275 classes + 1335 properties; 20 million triples
      YAGO:
      large hierarchy linking Wikipedia leaf categories to WordNet
      250,000 classes
      UMBEL (Upper Mapping and Binding Exchange Layer):
      20000 classes derived from OpenCyc
      Wikipedia Categories:
      Not a class hierarchy (e.g. cycles), represented using SKOS
      415,000+ categories
    • 140. DBpedia SPARQL Endpoint
      http://dbpedia.org/sparql
      hosted on an OpenLink Virtuoso server
      can answer SPARQL queries like
      Give me all sitcoms that are set in NYC?
      All tennis players from Moscow?
      All films by Quentin Tarantino?
      All German musicians that were born in Berlin in the 19th century?
      All soccer players with shirt number 11, playing for a club having a stadium with over 40,000 seats, and born in a country with over 10 million inhabitants?
    • 141. DBpedia SPARQL Endpoint
      SELECT ?name ?birth ?description ?person WHERE {
      ?person dbp:birthPlace dbp:Berlin .
      ?person skos:subject dbp:Cat:German_musicians .
      ?person dbp:birth ?birth .
      ?person foaf:name ?name .
      ?person rdfs:comment ?description .
      FILTER (LANG(?description) = 'en') .
      } ORDER BY ?name
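      To send a query like the one above to an endpoint, the SPARQL protocol allows passing it in the query parameter of an HTTP GET request. The sketch below only builds that URL with the standard library and does not contact the server. The helper name sparql_get_url is hypothetical, and the shortened query with LIMIT 10 is an illustrative variant that, like the slide, assumes the endpoint predefines the dbp: and foaf: prefixes.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def sparql_get_url(endpoint, query):
    """Encode a query for the SPARQL protocol's GET binding (?query=...)."""
    return endpoint + "?" + urlencode({"query": query})

query = """SELECT ?name WHERE {
  ?person dbp:birthPlace dbp:Berlin .
  ?person foaf:name ?name .
} ORDER BY ?name LIMIT 10"""

url = sparql_get_url("http://dbpedia.org/sparql", query)
print(url)

# Decoding the URL again yields the original query text
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == query)
```

      The response format is typically chosen through the Accept header, for example application/sparql-results+json.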
    • 142. DBpedia Applications
      DBpedia Mobile: location aware mobile client for DBpedia
      Uses current location and DBpedia to display map
      Can navigate into other knowledge bases
      DBpedia Query Builder: user front end for building queries
      DBpedia Relationship Finder finds relation between two objects in DBpedia
    • 143. DBpedia Applications
    • 144. DBpedia Applications: RelFinder
      http://www.visualdataweb.org/relfinder.php
    • 145. DBpedia Applications: Zemanta
    • 146. DBpedia Applications: Faceted Browser
    • 147. DBpedia Applications (3rd party)
      Muddy Boots (BBC): Annotate actors in BBC News with DBpedia identifiers
      Open Calais (Reuters): named entity recognition; entities are connected via owl:sameAs to DBpedia, Freebase, Geonames
      Faviki: Social Bookmarking Tool uses DBpedia in backend to group tags etc. and multi-language support
      Topbraid Composer: ontology editor, which links entities to DBpedia based on their labels
    • 148. LinkedGeoData
      Conversion, interlinking and publishing of OpenStreetMap.org* data sets as RDF.
      * ”Wikipedia for geographic data”
    • 149. Motivation
      Ease information integration tasks that require spatial knowledge, such as
      • Offerings of bakeries next door
      • Map of distributed branches of a company
      • Historical sights along a bicycle track
      Therefore use RDF/OWL in order to overcome structural and semantic heterogeneity.
      • Requires a vocabulary, which we try to establish.
      LOD cloud contains data sets with spatial features
      • e.g. Geonames, DBpedia, US census, EuroStat
      • But: they are restricted to popular or large entities like countries, famous places etc.
      Therefore they lack buildings, roads, mailboxes, etc.
    • 153. OpenStreetMap - Datamodel
      Basic entities are:
      Nodes: latitude, longitude
      Ways: sequence of nodes
      Relations: associations between any number of nodes, ways and relations.
      Each entity may be described with tags (= key-value pairs)
    • 154. Example: Leipzig's zoo
    • 155. Data/Mapping Example
    • 156. Data/Mapping Example
    • 157. Mapping Types
      Three Mapping Types
      Text
      (5, name, Leipzig) -> lgd:node5 rdfs:label "Leipzig"
      (5, name:de, Leipzig) -> lgd:node5 rdfs:label "Leipzig"@de
      Datatypes
      (6, seats, 4) -> lgd:node6 lgdo:seats "4"^^xsd:integer
      Classes/Object Properties
      (7, place, city) -> lgdn:7 a lgdo:City
      (7, religion, pastafarian) -> lgdn:7 lgdo:religion lgdo:Pastafarian
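      The three mapping types can be sketched as a small rule table applied to (node id, key, value) tags. The rule tables and the function map_tag below are hypothetical and only cover the examples on this slide; the lgdn: prefix is used for all subjects for simplicity.

```python
# Hypothetical rule tables covering only the slide's examples
TEXT_KEYS = {"name"}                            # text -> rdfs:label
DATATYPE_KEYS = {"seats": "xsd:integer"}        # datatype -> typed literal
CLASS_TAGS = {("place", "city"): "lgdo:City"}   # class -> rdf:type
OBJECT_KEYS = {"religion"}                      # object property -> resource

def map_tag(node_id, key, value):
    """Map one OSM tag of a node to a (subject, predicate, object) triple."""
    subject = f"lgdn:{node_id}"
    base, _, lang = key.partition(":")          # "name:de" -> ("name", "de")
    if base in TEXT_KEYS:
        literal = f'"{value}"' + (f"@{lang}" if lang else "")
        return (subject, "rdfs:label", literal)
    if key in DATATYPE_KEYS:
        return (subject, f"lgdo:{key}", f'"{value}"^^{DATATYPE_KEYS[key]}')
    if (key, value) in CLASS_TAGS:
        return (subject, "a", CLASS_TAGS[(key, value)])
    if key in OBJECT_KEYS:
        return (subject, f"lgdo:{key}", f"lgdo:{value.capitalize()}")
    return None  # no rule for this tag

for tag in [(5, "name:de", "Leipzig"), (6, "seats", "4"),
            (7, "place", "city"), (7, "religion", "pastafarian")]:
    print(tag, "->", map_tag(*tag))
```

      Tags without a matching rule are simply skipped here; a real mapping would need a policy for them.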
    • 158. Access
      REST interface (based on PostGIS DB, full OSM dataset loaded, > 1 billion triples)
      Supports limited queries (e.g. circular/rectangular area, filtering by labels)
      SPARQL endpoints (based on Virtuoso DB, subset of OSM dataset, ~222M triples)
      Static (http://linkedgeodata.org/sparql)
      Live (http://live.linkedgeodata.org/sparql)
      Downloads (http://downloads.linkedgeodata.org)
      Monthly updates on the above datasets envisioned
    • 159. LinkedGeoData Live
      OpenStreetMap provides full dumps and minutely changesets for download
      Changesets are numbered, e.g. ”001/234/567.osc.gz”
      We also convert the changesets to sets of added and removed triples (relative to our store) and publish them
      001/234/567.added.nt.gz
      001/234/567.removed.nt.gz
      Advantage: Other users could easily sync their RDF store with LinkedGeoData
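      How a consumer might apply one such pair of files to a local copy can be sketched with a set of N-Triples lines standing in for the RDF store. The function names are hypothetical, and comparing triples as strings assumes that both sides serialise them identically.

```python
def read_triples(lines):
    """Collect non-empty, non-comment N-Triples lines as a set of strings."""
    stripped = (line.strip() for line in lines)
    return {line for line in stripped if line and not line.startswith("#")}

def apply_changeset(store, added_lines, removed_lines):
    """Apply one changeset in place: drop removed triples, then add new ones."""
    store -= read_triples(removed_lines)
    store |= read_triples(added_lines)
    return store

LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"
store = {f'<http://example.org/node1> {LABEL} "Zoo" .'}
removed = [f'<http://example.org/node1> {LABEL} "Zoo" .\n']
added = [f'<http://example.org/node1> {LABEL} "Zoo Leipzig" .\n']

apply_changeset(store, added, removed)
print(sorted(store))

# With real files, the line iterables could come from e.g.
# gzip.open("001/234/567.added.nt.gz", "rt", encoding="utf-8")
```

      Changesets have to be applied in sequence order, since each one is relative to the state left by the previous one.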
    • 160. DBpedia Mapping – Step By Step
      Given a DBpedia point, query LGD points within type specific maximum distance
      Basic idea (performed with Silk):
      Compute spatial score
      Compute name similarity (rdfs:label)
    • 161. DBpedia Mapping – Step By Step
      Given a DBpedia point, query LGD points within type specific maximum distance
      Basic idea (performed with Silk):
      Compute spatial score
      Compute name similarity (rdfs:label)
      Combine both scores
      Depending on final score, either automatically accept/reject links or mark for manual verification.
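      The scoring steps above can be sketched as follows. This is not Silk's actual configuration: the haversine distance, the difflib string similarity, the equal weighting and the thresholds 0.9 and 0.5 are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def link_decision(a, b, max_km, accept=0.9, reject=0.5):
    """Return ('accept' | 'verify' | 'reject', score) for a candidate pair."""
    distance = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"])
    if distance > max_km:
        return "reject", 0.0                 # outside the type-specific radius
    spatial = 1.0 - distance / max_km        # 1.0 at the same spot, 0.0 at the edge
    name = SequenceMatcher(None, a["label"].lower(), b["label"].lower()).ratio()
    score = (spatial + name) / 2             # combine both scores
    if score >= accept:
        return "accept", score
    if score < reject:
        return "reject", score
    return "verify", score                   # left for manual verification

dbpedia_point = {"label": "Leipzig", "lat": 51.3333, "lon": 12.3833}
lgd_same = {"label": "Leipzig", "lat": 51.3333, "lon": 12.3833}
lgd_far = {"label": "Leipzig", "lat": 52.52, "lon": 13.405}

print(link_decision(dbpedia_point, lgd_same, max_km=20))
print(link_decision(dbpedia_point, lgd_far, max_km=20))
```

      The max_km argument plays the role of the type-specific maximum distance, so a city would get a larger radius than, say, a railway station.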
    • 162. Statistics (2011-Feb-23)
      222,539,712 triples
      6,666,865 ways
      5,882,306 nodes
      Among them
      352,673 PlaceOfWorship
      60,573 RailwayStation
      59,468 Recycling
      50,955 Town
      30,099 Toilet
      7,222 City
    • 163. Conclusion
      OpenStreetMap
      immensely successful project for collaboratively creating free spatial data
      Community uses key value structures, which provide a rich source of information
      Key strength: broad coverage
      LGD Contributions
      Established mapping to DBpedia
      Geonames mapping partially done (37 different entity types: cities, churches, ...)
      Facet-based LGD Browser provides an interface for OSM/LGD, which highlights its structural aspects
      Live sync
      Goal: Make LGD as useful (successful) as DBpedia for the geospatial domain
    • 164. Extraction of Relational Data
      Many different approaches (D2R, Virtuoso RDF Views, Triplify, …)
      No agreement on a formal semantics of RDB2RDF mapping
      • LOD readiness, SPARQL-SQL translation
      W3C RDB2RDF WG
    • 165. Extraction Challenges
      From unstructured sources
      • Deploy existing NLP approaches (OpenCalais, Ontos API)
      • Develop standardized, LOD-enabled interfaces between NLP tools (NLP2RDF)
      From semi-structured sources
      • Efficient bi-directional synchronization
      From structured sources
      • Declarative syntax and semantics of data model transformations (W3C WG RDB2RDF)
      Orthogonal challenges
      • Using LOD as background knowledge
      • Provenance
    • 168. Storage and Querying
    • 169. RDF Data Management
      Still slower than relational data management by a factor of 5-50 (BSBM, DBpedia Benchmark)
      Performance increases steadily
      Comprehensive, well-supported open-source and commercial implementations are available:
      • OpenLink’s Virtuoso (open source + commercial)
      • Big OWLIM (commercial), Swift OWLIM (open source)
      • 4store (open source)
      • Talis (hosted)
      • Bigdata (distributed)
      • AllegroGraph (commercial)
      • Mulgara (open source)
    • 176. DBpedia Benchmark
      • Uses DBpedia as data and a selection of 25 frequently executed queries
      • Can generate fractions and multiples of DBpedia's size
      • Does not resemble relational data
      Performance differences observed with other benchmarks are amplified
      [Figure: geometric mean]
    • 179. Storage and Querying Challenges
      • Reduce the performance gap between relational and RDF data management
      • SPARQL query extensions
      • Spatial/semantic/temporal data management
      • More advanced query result caching
      • View maintenance / adaptive reorganization based on common access patterns
      • More realistic benchmarks
    • 185. Authoring
    • 186. Two Kinds of Semantic Wikis
      1. Semantic (Text) Wikis
      • Authoring of semantically annotated texts
      2. Semantic Data Wikis
      • Direct authoring of structured information (i.e. RDF, RDF-Schema, OWL)
    • 187. OntoWiki – a semantic data wiki
      Versatile domain-independent tool
      Serves as Linked Data / SPARQL endpoint on the Data Web
      Open-source project hosted at Google Code
      Not just a Wiki UI, but a whole framework for the development of Semantic Web applications
      Developed in PHP based on the Zend framework
      Very active developer and user community
      More than 500 downloads monthly
      Large number of use cases
    • 188. OntoWiki
      Dynamic views on knowledge bases
    • 189. OntoWiki
      RDF triples on resource details page
    • 190. OntoWiki
      Dynamic suggestions from the Data Web
    • 191. Catalogus Professorum Lipsiensis
    • 192. OntoWiki: Caucasian Spiders
    • 193. RDFauthor in OntoWiki
    • 194. Semantic Portal with OntoWiki: Vakantieland
    • 195. RDFaCE - RDFa Content Editor
    • 196. RDFaCE Architecture
    • 197. Integrating various NLP APIs
    • 198. Linking
      © CC-BY-NC-ND by ~Dezz~ (residae on flickr)
    • 199. LOD Linking
      Automatic
      Semi-automatic
      Manual
      • Sindice integration into UIs
      • Semantic Pingback
    • 202. LIMES 0.3: Basic Idea
      Uses the characteristics of metric spaces
      Especially consequences of the triangle inequality
      d(x, y) ≤ d(x, z) + d(z, y)
      d(x, z) - d(z, y) ≤ d(x, y) ≤ d(x, z) + d(z, y)
      Basic idea
      Use pessimistic approximations of distances instead of computing them
      Only compute distances when needed
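      The bounds that follow from the triangle inequality can be checked numerically. The sketch below uses Euclidean distance as a stand-in metric; the helper name distance_bounds is hypothetical, and it takes the absolute difference as lower bound, which is valid because a metric is symmetric.

```python
from math import dist  # Euclidean distance, Python 3.8+

def distance_bounds(d_xz, d_zy):
    """Bounds on d(x, y) given only the distances of x and y to a reference point z."""
    return abs(d_xz - d_zy), d_xz + d_zy

x, y, z = (0.0, 0.0), (3.0, 4.0), (6.0, 8.0)
lower, upper = distance_bounds(dist(x, z), dist(z, y))
print(lower, dist(x, y), upper)
```

      If the lower bound already exceeds the distance threshold of a linking task, the pair (x, y) can be discarded without ever computing d(x, y).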
    • 203. Overview
      Knowledge sources
    • 204. Computation of Exemplars
      Assumption: number of exemplars is given
      Goal: Segment target dataset
    • 205. Computation of Exemplars
    • 206. Computation of Exemplars
    • 207. Computation of Exemplars
    • 208. Computation of Exemplars
    • 209. Computation of Exemplars
      NB: Distances from exemplars to all other points are known
    • 210. Filtering
      [Figure: points x, y and z]
      1. Measure distance from each x to each exemplar
    • 211. Filtering
      [Figure: points x, y and z]
      2. Apply d(x, y) - d(y, z) > t ⇒ d(x, z) > t
    • 212. Similarity Computation
      [Figure: points x, y and z]
      d(x, y) - d(y, z) ≤ t ⇒ compute d(x, z)
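      The filtering and similarity computation steps can be sketched as below, with x a source point, y an exemplar and z a target point assigned to y. This is an illustrative reading of the slides, not the LIMES implementation: the exemplars are passed in rather than computed, Euclidean distance stands in for the metric, and the function name limes_like_match is made up.

```python
from math import dist  # Euclidean distance as a stand-in metric

def limes_like_match(sources, targets, exemplars, threshold, d=dist):
    """Return (matches, number of exact source-target distances computed)."""
    # Assign every target point to its nearest exemplar and remember d(y, z)
    clusters = {y: [] for y in exemplars}
    for z in targets:
        y = min(exemplars, key=lambda e: d(e, z))
        clusters[y].append((z, d(y, z)))

    matches, computed = [], 0
    for x in sources:
        for y, members in clusters.items():
            d_xy = d(x, y)                     # step 1: distance to the exemplar
            for z, d_yz in members:
                if d_xy - d_yz > threshold:    # step 2: lower bound rules z out
                    continue
                computed += 1                  # otherwise compute the real distance
                if d(x, z) <= threshold:
                    matches.append((x, z))
    return matches, computed

sources = [(0, 0)]
targets = [(0, 1), (10, 10), (10, 11)]
exemplars = [(0, 1), (10, 10)]
matches, computed = limes_like_match(sources, targets, exemplars, threshold=2)
print(matches, computed)
```

      In this toy input only one of the three source-target distances is computed exactly; the other two targets are excluded by the bound, at the cost of one extra distance per exemplar.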
    • 213. Serialization
      Results are returned as RDF
      For example, mapping DBpedia and DrugBank
      @prefix drugbank: <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/> .
      @prefix dbpedia: <http://dbpedia.org/ontology/> .
      @prefix owl: <http://www.w3.org/2002/07/owl#> .
      dbpedia:Cefaclor owl:sameAs drugbank:DB00833 .
      dbpedia:Clortermine owl:sameAs drugbank:DB01527 .
      dbpedia:Prednicarbate owl:sameAs drugbank:DB01130 .
      dbpedia:Linezolid owl:sameAs drugbank:DB00601 .
      dbpedia:Valaciclovir owl:sameAs drugbank:DB00577 .
      ….
    • 214. Experiments
      Q1: What is the best number of exemplars?
      Q2: What is the relation between the similarity threshold θ and the total number of comparisons?
      Q3: Does the assignment of S and T matter?
      Q4: How does LIMES compare to SILK?
    • 215. Q1 and Q2
      Experiments on synthetic data
      Knowledge bases of sizes 2000, 3000, 5000, 7500 and 10000
      Varied number of exemplars
      Varied thresholds
      Experiments were repeated 5 times
      Average results are presented
    • 216. Q1 and Q2
    • 217. Q1 and Q2
      Q1
      Best number of exemplars depends on θ
      For θ > 0.9, the best number lies around |T|^(1/2)
      Q2
      As expected, the number of comparisons diminishes with growing θ
    • 218. Q3 (order of S and T)
      Experiments on synthetic data
      Knowledge bases of sizes 1000, 2000, 3000, …, 10000
      Number of exemplars was |T|^(1/2)
      Experiments were repeated 5 times
      Average results are presented
    • 219. Q3
    • 220. Q3
      • Green = S first is more time-efficient
      • Overall less than 5% difference
    • Q4 (comparison with SILK)
      3 experiments on real data
      Drugs
      Diseases
      SimCities
      Number of exemplars was |T|^(1/2)
      Comparison of runtime with SILK
      Experiments were repeated thrice
      Best runtimes are presented
    • 222. Q4
    • 223. Q4
      We outperform SILK 2 by 1.5 orders of magnitude
      The larger the data sources, the higher our speedup (64 for SimCities)
    • 224. LIMES Future Work
      Extension of approach to semi-metrics
      Adaptation of exemplar computation to skew in data
      Automatic computation of the accurate number of exemplars
      Parallelization using Hadoop shows high speedup gain
      Combination with active learning
    • 225. Creating a network effect around Linking Data: Semantic Pingback
      Update and notification services for LOD
      Downward compatible with Pingback (blogosphere)
      http://aksw.org/Projects/SemanticPingBack
    • 226. Visualizing Pingbacks in OntoWiki
    • 227. Interlinking Challenges
      Only 5% of the information on the Data Web is actually linked
      • Make sense of work in the de-duplication/record linkage literature
      • Consider the open world nature of Linked Data
      • Use LOD background knowledge
      • Zero-configuration linking
      • Explore active learning approaches, which integrate users in a feedback loop
      • Maintain a 24/7 linking service: Linked Open Data Around-The-Clock project (LATC-project.eu)
    • 233. Enrichment
    • 234. Linked Data is mainly instance data and !!!
      The ORE (Ontology Repair and Enrichment) tool helps improve an OWL ontology by fixing inconsistencies & making suggestions for adding further axioms.
      • Ontology Debugging: uses OWL reasoning to detect inconsistencies and unsatisfiable classes and to identify the most likely sources of the problems. The user can create a repair plan while maintaining full control.
      • Ontology Enrichment: uses the DL-Learner framework to suggest definitions & super classes for existing classes in the KB. Works if instance data is available, for harmonising schema and data.
      http://aksw.org/Projects/ORE
      Enrichment & Repair
    • 236. Quality
      Analysis
    • 237. Linked Data Quality Analysis
      Quality on the Data Web varies a lot
      • Hand-crafted or expensively curated knowledge bases (e.g. DBLP, UMLS) vs. knowledge bases extracted from text or Web 2.0 sources (DBpedia)
      Research Challenge
      • Establish measures for assessing the authority, provenance and reliability of Data Web resources
    • 238. Evolution
      © CC-BY-SA by alasis on flickr
    • 239. EvoPat – Pattern based KB Evolution
      • Unified method for both data evolution and ontology refactoring.
      • Modularized, declarative definition of evolution patterns is relatively simple compared to an imperative description of evolution
      • Allows domain experts and knowledge engineers to amend the ontology structure and modify data with just a few clicks
      • Combined with RDF representation of evolution patterns and their exposure on the Linked Data Web, EvoPat facilitates the development of an evolution pattern ecosystem
      • Patterns can be shared and reused on the Data Web.
      • Declarative definition of bad smells and corresponding evolution patterns promotes the (semi-)automatic improvement of information quality.
    • 245. Evolution Patterns
    • 246.
    • 247. Exploration
    • 248. An ecosystem of LOD visualizations
      • Statistical visualization
      • Faceted browsing
      • Spatial faceted browsing
      • Entity-/faceted-based browsing
      • Domain-specific visualizations
      LOD Exploration Widgets
      • Dataset analysis (size, vocabularies, property histograms etc.)
      • 249. Selection of suitable visualization widgets
      Choreography layer
      LOD Datasets
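      The two choreography steps, dataset analysis and widget selection, can be sketched roughly as follows. This is an illustration under assumptions, not the implementation behind the slide: the function names and the selection rules are hypothetical, and only the WGS84 and Data Cube property URIs are real vocabulary terms.

```python
from collections import Counter

# Real vocabulary terms (W3C WGS84 geo vocabulary, RDF Data Cube vocabulary).
GEO_LAT = "http://www.w3.org/2003/01/geo/wgs84_pos#lat"
GEO_LONG = "http://www.w3.org/2003/01/geo/wgs84_pos#long"
QB_DATASET = "http://purl.org/linked-data/cube#dataSet"


def property_histogram(triples):
    """Count how often each predicate occurs in an iterable of triples."""
    return Counter(p for (_s, p, _o) in triples)


def select_widgets(histogram):
    """Hypothetical rules mapping the vocabularies in use to widget types."""
    widgets = ["entity-based browsing"]  # generic fallback, always applicable
    if histogram[GEO_LAT] and histogram[GEO_LONG]:
        widgets.append("spatial faceted browsing")
    if histogram[QB_DATASET]:
        widgets.append("statistical visualization")
    return widgets


# Made-up sample data for the sketch.
sample = [
    ("http://example.org/cafe1", GEO_LAT, "51.34"),
    ("http://example.org/cafe1", GEO_LONG, "12.37"),
    ("http://example.org/cafe1", "http://www.w3.org/2000/01/rdf-schema#label", "Cafe"),
]
histogram = property_histogram(sample)
widgets = select_widgets(histogram)
```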
    • 250. TODO: Put ULEI slides
      Faceted spatial-semantic browsing component
    • 251. Pure JavaScript, requires only a SPARQL endpoint for data access, with Cross-Origin Resource Sharing (CORS) enabled.
      Operates on local spatial regions, does not depend on global meta-data about the data
      Source code:
      • https://github.com/AKSW/SpatialSemanticBrowsingWidgets
      Online Demo - LinkedGeoData Browser:
      • http://browser.linkedgeodata.org
      Next steps
      • Polygon/curve markers, domain-specific visualization templates, integration of other sources, mobile interface
      Publication:
      • Claus Stadler, Jens Lehmann, Konrad Höffner, Sören Auer: LinkedGeoData: A Core for a Web of Spatial Open Data. To appear in Semantic Web Journal - Special Issue on Linked Spatiotemporal Data and Geo-Ontologies.
      Faceted spatial-semantic browsing - Availability
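      Querying a local spatial region through nothing but a SPARQL endpoint can be sketched as below. The widget itself is JavaScript; this Python sketch only shows the shape of such a request and is not taken from the widget's source. The endpoint URL and function names are placeholders, and it assumes the data uses the W3C WGS84 `geo:lat`/`geo:long` properties.

```python
from urllib.parse import urlencode


def bbox_query(south, west, north, east, limit=100):
    """Build a SPARQL SELECT for resources located inside a bounding box."""
    return f"""PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?s ?lat ?long WHERE {{
  ?s geo:lat ?lat ; geo:long ?long .
  FILTER(?lat >= {south} && ?lat <= {north} && ?long >= {west} && ?long <= {east})
}} LIMIT {limit}"""


def request_url(endpoint, query):
    """Encode the query as a SPARQL protocol GET request (query parameter)."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder endpoint; a real deployment would point at its own endpoint.
query = bbox_query(south=51.3, west=12.3, north=51.4, east=12.5)
url = request_url("http://example.org/sparql", query)
```

      A browser-based client can only issue this request across origins if the endpoint sends CORS headers, which is why the slide lists CORS as the one requirement besides the endpoint itself.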
    • 252. Generic entity-based exploration with OntoWiki
      http://fintrans.publicdata.eu
    • 253. Domain-specific visualization: http://energy.publicdata.eu
    • 254. Visualization of statistical data (Data Cube vocabulary)
      http://scoreboard.lod2.eu
    • 255.
    • 256. 05.10.2011
      Sören Auer - The emerging Web of Linked Data
    • 257.
    • 258.
    • 259. Visual Query Builder
    • 260. Relationship Finder in CPL
    • 261. Distributed Social Semantic Networking
    • 262. Distributed Social Semantic Networking
      Social Networks are walled gardens
      • Take users' data out of their hands,
      • 263. predefined privacy & data security regulations
      • 264. infrastructure of a single provider (lock-in)
      • 265. Facebook (600M+ users) = Web inside the Web
      • 266. Interoperability is limited to proprietary APIs
      Social networks should be open and evolving
      • allow users to control what to enter & keep control over their data
      • 267. users should be able to host the data on infrastructure, which is under their direct control, the same way as they host their own website (TBL)
      We need a truly Distributed Social Semantic Network (DSSN)
      • Initial approaches appeared with GNU social and more recently Diaspora
      • 268. a DSSN should be based on semantic resource descriptions and de-referenceability so as to ensure versatility, reusability and openness in order to accommodate unforeseen usage scenarios
      • 269. a number of standards and best-practices for social, Semantic Web applications such as FOAF, WebID and Semantic Pingback emerged.
    • Resources announce services and feeds, feeds announce services – in particular a push service.
      Applications initiate ping requests to spin the Linked Data network
      Applications subscribe to feeds on push services and receive instant notifications on updates.
      Update services are able to modify resources and feeds (e.g. on request of an application)
      Personal and global search services index social network resources and are used by applications
      Access to resources & services can be delegated to applications by a WebID, i.e. an application can act in the name of the WebID owner
      The majority of all access operations is executed through standard web requests.
      DSSN Architecture
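      The subscribe-and-notify part of this architecture can be illustrated with a minimal in-memory sketch. It assumes nothing about the actual DSSN protocols: the `PushService` class and the feed URI are hypothetical, and a real push service would deliver notifications over web requests rather than direct function calls.

```python
from collections import defaultdict


class PushService:
    """Hypothetical in-memory push service: feed URI -> subscriber callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, feed, callback):
        """An application registers interest in updates to a feed."""
        self._subscribers[feed].append(callback)

    def publish(self, feed, update):
        """Notify every subscriber of the feed; return how many were notified."""
        callbacks = self._subscribers[feed]
        for callback in callbacks:
            callback(feed, update)
        return len(callbacks)


# Made-up feed URI and update, for illustration only.
received = []
hub = PushService()
hub.subscribe(
    "http://example.org/alice/feed",
    lambda feed, update: received.append((feed, update)),
)
delivered = hub.publish("http://example.org/alice/feed", "new status")
```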
    • 270.
      • Open-source, MVC architecture
      • 271. Platform-independent, based on HTML5, CSS, JavaScript
      • 272. jQuery, jQuery Mobile, jQuery UI
      • 273. rdfQuery – simple triple store in JavaScript
      • 274. PhoneGap (Apache Device ready) native apps for iOS, Android, BlackBerry OS, webOS, Symbian, Bada
      • 275. http://aksw.org/Projects/MobileSocialSemanticWeb
      DSSN Mobile Client
    • 276.
    • 277. DSSN Mobile Browsing
    • 278. DSSN Mobile Editing
    • 279. LOD Lifecycle
      supported by Debian-based
      LOD2 Stack
      (released next week)
    • 280. First release of the LOD2 Stack: stack.lod2.eu & demo.lod2.eu/lod2demo
    • 281.
    • 282. AKSW Team
    • 283. Thanks for your attention!
      Sören Auer
      http://www.uni-leipzig.de/~auer/ | http://aksw.org | http://lod2.org
      auer@uni-leipzig.de
