Linked Data Tutorial
This tutorial explains the Data Web vision, some preliminary standards and technologies, and some tools and technological building blocks developed by the AKSW research group at Universität Leipzig.

Linked Data Tutorial Presentation Transcript

  • 1. From Document Web to a Web of Linked Data. Dr. Sören Auer, AKSW, Institut für Informatik
  • 2. Overview
    • The Linked Data Web Vision
    • Data Web Technologies
    • Publishing relational data on the Web
    • DBpedia – transforming Wikipedia into a knowledge base
    • OntoWiki – a Linked Data Wiki
    • Open Street Maps – linked open geo data
  • 3. From the Document Web to the Linked Open Data Web (and beyond)
    • Web (since 1992)
      • HTTP
      • HTML/CSS/JavaScript
    • Semantic Web (Vision 1998, starting ???)
      • Reasoning
      • Logic, Rules
      • Trust
    • Social Web (since 2003)
      • Folksonomies/Tagging
      • Reputation, sharing
      • Groups, relationships
    • Data Web (since 2006)
      • URI dereferenceability
      • CBD (Concise Bounded Description)
      • RDF serializations
  • 4. Conceptual Level Data Access and Integration
    • Data Models
      • RDBMS: organize data in relations, rows, cells (Oracle, DB2, MS-SQL)
      • Triple/Quad Stores: RDF data model (Virtuoso, Oracle, Sesame)
      • Column-oriented DBMS: collocate column values rather than row values (Vertica, C-Store, MonetDB)
      • Entity-attribute-value (EAV): HELP medical record system, TrialDB
      • Others: XML, hierarchical, tree, graph-oriented DBMS
    • Data Access
      • Procedural APIs: ODBC, JDBC
      • Query Languages: Datalog, SQL, SPARQL, XPath/XQuery
      • Object-relational mappings (ORM): NeXT’s EOF / WebObjects, ADO.NET Entity Framework, Hibernate
      • Linked Data: dereferenceable URIs, RDF serialization formats
    • Data Integration
      • Enterprise Information Integration: sets of heterogeneous data sources appear as a single, homogeneous data source
      • Data Warehousing: based on extract, transform, load (ETL); Global-As-View (GAV)
      • Data Web: URIs as entity identifiers, HTTP as data access protocol, Local-As-View (LAV)
      • Research: mediators, ontology-based, P2P, Web service-based
  • 5. Web 1.0, Web 2.0, Web 3.0
    • Web 1.0: Many Web sites containing unstructured, textual content
    • Web 2.0: Few large Web sites specialized on specific content types (pictures, video, encyclopedic articles)
    • Web 3.0: Many Web sites containing and semantically syndicating arbitrarily structured content
  • 6. The Long Tail of Information Domains
    • [Figure: information domains ordered by popularity. The Long Tail by Chris Anderson (Wired, Oct. ’04), adapted to information domains.]
    • Currently supported structured content types: pictures, news, video, recipes, calendar
    • Not or insufficiently supported content types (SemWeb supported structured content): gene sequences, itinerary of King George, talent management, requirements engineering, special interest communities
  • 7. Why Do We Need Another Web?
    • Try to search for these things on the current Web:
    • Apartments near German-French bilingual childcare in Leipzig.
    • ERP service providers with offices in Vienna and Berlin.
    • Researchers working on DB related topics in south-east Asia.
    • The information needed to answer such search queries is available on the Web, but opaque to current Web search.
    • The (Semantic) Data Web complements text on Web pages with structured data and makes it possible to intelligently combine and integrate such structured information from different sources.
    • [Figure: Leipzig.de (has everything about childcare in L.E.) and Immobilienscout.de (knows all about real estate offers in Germany) each run a DB behind a Web server that publishes HTML and RDF; a search engine consumes both.]
  • 8. Overview
    • The Linked Data Web Vision
    • Data Web Technologies
    • Publishing relational data on the Web
    • DBpedia – transforming Wikipedia into a knowledge base
    • OntoWiki – a Linked Data Wiki
    • Virtuoso – Knowledge Store
    • Open Street Maps – free and open geo data
  • 9. RDF - Resource Description Framework
    • Distinguishes two fundamental base types:
    • Resources
      • Complex abstract or concrete entities
      • Uniquely identified by a URI:
        • http://DBpedia.org/resource/Vienna
    • Literals
      • Concrete data values
      • Optionally typed (e.g. xsd:string, xsd:dateTime etc.) or language-tagged (e.g. en, de):
        • "2008-05-31T09:30:00"^^xsd:dateTime
        • "Wien"@de
  • 10. RDF Statement / Triple Paradigm
    • RDF/XML:
    • <?xml version="1.0"?>
    • <rdf:RDF
    • xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/metadata/dublin_core#">
    • <rdf:Description rdf:about="http://OntoWiki.net">
    • <dc:Creator>Sören Auer</dc:Creator>
    • </rdf:Description>
    • </rdf:RDF>
    • Triple: Subject (Resource) http://OntoWiki.net, Predicate (Resource) dc:Creator, Object (Resource/Literal) "Sören Auer"
    • RDF/N3: <http://OntoWiki.net> <http://purl.org/metadata/dublin_core#Creator> "Sören Auer" .
  • 11. RDF Document / Model / Graph
      • Simple Knowledge Base
      • Combines multiple RDF Statements
    • [Figure: graph with three statements]
      • http://OntoWiki.net dc:Creator http://aksw.org/staff/Soeren
      • http://aksw.org/staff/Soeren foaf:Name "Sören Auer"
      • http://aksw.org/staff/Soeren foaf:Email [email_address]
  • 12. RDF Serialization
    • <?xml version="1.0"?>
    • <rdf:RDF
    • xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/metadata/dublin_core#">
    • <rdf:Description rdf:about="http://OntoWiki.net">
    • <dc:Creator>
    • <rdf:Description rdf:about="http://aksw.org/staff/Soeren">
    • <dc:Name>Sören Auer</dc:Name>
    • <dc:Email>auer@informatik.uni-leipzig.de</dc:Email>
    • </rdf:Description>
    • </dc:Creator>
    • </rdf:Description>
    • </rdf:RDF>
    • Resulting triples:
      • http://OntoWiki.net http://purl.org/metadata/dublin_core#Creator http://aksw.org/staff/Soeren
      • http://aksw.org/staff/Soeren http://purl.org/metadata/dublin_core#Name "Sören Auer"
      • http://aksw.org/staff/Soeren http://purl.org/metadata/dublin_core#Email [email_address]
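The triple model from the preceding slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `uri`, `literal` and `to_ntriples` are hypothetical, and only the URIs and the literal value are taken from the slides. The output follows the line-based N-Triples layout (one statement per line, terminated by a dot).

```python
# Hypothetical helpers; only the URIs and literal values come from the slides.
DC = "http://purl.org/metadata/dublin_core#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def uri(value):
    """Wrap an absolute URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value, datatype=None, lang=None):
    """Render a plain, language-tagged, or datatyped literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    text = '"' + escaped + '"'
    if lang:
        return text + "@" + lang
    if datatype:
        return text + "^^" + uri(datatype)
    return text


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


graph = [
    (uri("http://OntoWiki.net"), uri(DC + "Creator"), uri("http://aksw.org/staff/Soeren")),
    (uri("http://aksw.org/staff/Soeren"), uri(DC + "Name"), literal("Sören Auer")),
]

if __name__ == "__main__":
    print(to_ntriples(graph))
```

A graph here is just a list of statements, so merging two knowledge bases amounts to concatenating (or taking the set union of) their triples.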
  • 13. RDF Schema
    • Restrict combinations of resources / literals
    • Structuring of vocabularies
    • Instantiation / classification
    • Provisioning of special resources:
    • Classes (concepts, frames) http://www.w3.org/2000/01/rdf-schema#Class
    • Attributes (properties, slots, roles) http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
    • Instances (objects) http://www.w3.org/1999/02/22-rdf-syntax-ns#type
    • [Figure: the statement http://OntoWiki.net dc:creator 16.11.2007, marked with a question mark]
  • 14. RDF-S Class & Property Hierarchies
    • Beer rdf:type rdfs:Class
    • BottomFermentedBeer rdfs:subClassOf Beer
    • Bock rdfs:subClassOf BottomFermentedBeer
    • Lager rdfs:subClassOf BottomFermentedBeer
    • Pilsner rdfs:subClassOf BottomFermentedBeer
    • hasContent rdf:type rdf:Property
    • hasAlcoholicContent rdfs:subPropertyOf hasContent
    • hasOriginalWortContent rdfs:subPropertyOf hasContent
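What a class hierarchy buys you is inference: an instance of Bock is also a BottomFermentedBeer and a Beer. The following Python sketch walks the rdfs:subClassOf chain to show this. It assumes, for simplicity, that every class has at most one direct superclass (RDF Schema itself allows several), and the names `SUBCLASS_OF`, `INSTANCE_OF`, `superclasses` and `types_of` are hypothetical.

```python
# Class hierarchy from the slide: subclass -> direct superclass.
# Simplifying assumption: at most one direct superclass per class.
SUBCLASS_OF = {
    "BottomFermentedBeer": "Beer",
    "Bock": "BottomFermentedBeer",
    "Lager": "BottomFermentedBeer",
    "Pilsner": "BottomFermentedBeer",
}

# Two rdf:type statements used as sample instances.
INSTANCE_OF = {"Grafentrunk": "Bock", "Jever": "Pilsner"}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return all direct and inherited superclasses of cls, nearest first."""
    result = []
    seen = {cls}
    while cls in subclass_of:
        cls = subclass_of[cls]
        if cls in seen:  # guard against accidental cycles
            break
        seen.add(cls)
        result.append(cls)
    return result


def types_of(instance):
    """Return the asserted class of an instance followed by the inferred ones."""
    asserted = INSTANCE_OF[instance]
    return [asserted] + superclasses(asserted)


if __name__ == "__main__":
    print(types_of("Grafentrunk"))
```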
  • 15. RDF-S Properties
    • … are defined and used independently from classes
    • Domain: Association with one or multiple classes
    • Range: defines values the property can assume
      • Instances of a certain class
      • literals typed with a certain XML schema data type
    • hasAlcoholicContent rdf:type owl:DatatypeProperty
    • hasAlcoholicContent rdf:type owl:FunctionalProperty
    • hasAlcoholicContent rdfs:domain Beer
    • hasAlcoholicContent rdfs:range xsd:float
    • hasAlcoholicContent rdfs:subPropertyOf hasContent
    • brews rdf:type owl:ObjectProperty
    • brews rdfs:domain Brewery
    • brews rdfs:range Beer
  • 16. RDF-S Instances
    • Are associated with one (or multiple) class(es):
      • Boddingtons rdf:type Ale
      • Grafentrunk rdf:type Bock
      • Hoegaarden rdf:type White
      • Jever rdf:type Pilsner
  • 17. Semantic Web Layer Cake
  • 18. Linked Data - Paradigm
    • Use URIs as names for things
    • Use HTTP URIs so that people can look up those names.
    • When someone looks up a URI, provide useful information.
    • Include links to other URIs, so that they can discover more things.
  • 19. Linked Data – Publishing RDF
    • Dereferenceable RDF URIs, e.g.: http://dbpedia.org/resource/Busan
    • Different HTTP responses depending on the HTTP Accept header
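How a server might choose between an HTML page and an RDF document based on the Accept header can be sketched as follows. This is a simplified illustration, not the behaviour of any particular server: the function names and the `SUPPORTED` list are assumptions, and the sketch ignores partial wildcards such as `text/*` and the specificity rules of full HTTP content negotiation.

```python
# Representations this hypothetical server can produce, in order of preference.
SUPPORTED = ["text/html", "application/rdf+xml"]


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(header, supported=SUPPORTED):
    """Return the media type to serve, or None if nothing acceptable matches."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return supported[0]
    return None


if __name__ == "__main__":
    print(negotiate("application/rdf+xml"))
    print(negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
```

A Linked Data client would typically send `Accept: application/rdf+xml` when looking up a URI, while a browser sends a header that prefers `text/html`.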
  • 20. Benefits of using the RDF Data Model in the Linked Data Context
    • Clients can look up every URI in an RDF graph over the Web to retrieve additional information.
    • Information from different sources merges naturally.
    • The data model enables you to set RDF links between data from different sources.
    • The data model allows you to represent information that is expressed using different schemata in a single model.
    • Combined with schema languages such as RDF-S or OWL, the data model allows you to use as much or as little structure as you need, meaning that you can represent tightly structured data as well as semi-structured data.
  • 21. Linking Open Data (LOD) Cloud
  • 22. Data Web Moving Targets
    • Base technologies (RDF, SPARQL, HTTP etc.) are developed, standardized and ready to use
    • Big issues:
      • Scalability
      • User interfaces
      • Search engines
      • Business models
      • (Reasoning)
  • 23. Data Web Business Models
    • Advertisement (page view) based businesses will probably not be first movers
    • Large Web companies will probably not be first movers
    • The Data Web should focus on fragmented markets with many players that require the widest distribution of information, e.g. realtors, online shops, transportation service providers, public information, geo data etc.
  • 24. Overview
    • The Linked Data Web Vision
    • Data Web Technologies
    • Publishing relational data on the Web
    • DBpedia – transforming Wikipedia into a knowledge base
    • OntoWiki – a Linked Data Wiki
    • Open Street Maps – free and open geo data
  • 25. Triplify Motivation
    • growth of semantic representations still outpaced by the traditional Web
    • overcome the chicken-and-egg dilemma of missing semantic representations and search facilities on the Web
    • Triplify leverages relational representations behind existing Web applications:
      • often open-source, deployed hundreds of thousands of times
      • structure and semantics encoded in relational database schemas (behind Web apps) are not accessible to Web search engines, mashups etc.
    • [Figure: Monthly Web application downloads at SourceForge]
  • 26. Triplify Big Picture
  • 27. Triplify Approach: Simplicity
    • Expose semantics as simply as possible
      • No (new) mapping languages
      • Few lines of code – easy to plug-in
      • Simple, reusable configurations
    • Available for most popular Web app languages
      • PHP (ready), Ruby/Python under development
    • Works with most popular Web app DBs
      • MySQL (extensively tested); PHP-PDO DBs (SQLite, Oracle, DB2, MS SQL, PostgreSQL etc.) should work; not needed for Virtuoso
    • Triplify exposes RDF/N-Triples, Linked Data and RDF/JSON
  • 28. Triplify Solution: SQL-SELECT queries map relational data to RDF
    • Triplify configuration:
      • A number of SQL queries selecting the information that should be made publicly available.
    • Special SQL query result structure required (in order to convert results into RDF):
      • The first column must contain identifiers for generating instance URIs (i.e. the primary key of the DB table).
      • Column names are used to generate property URIs; renaming columns makes it possible to reuse properties from existing vocabularies such as Dublin Core, FOAF, SIOC
        • e.g. SELECT id, name AS 'foaf:name' FROM users
      • Individual cells contain data values or references to other instances (these eventually constitute the objects of the resulting triples).
  • 29. Example: Wordpress Blog Posts
    • Associate the URL path fragment 'post' with a number of SQL patterns:
    • http://blog.aksw.org/triplify/post/(xxx)
    • Query 1:
      • SELECT id, post_author AS 'sioc:has_creator->user', post_title AS 'dc:title', post_content AS 'sioc:content', post_date AS 'dcterms:created^^xsd:dateTime', post_modified AS 'dcterms:modified^^xsd:dateTime'
      • FROM posts
      • WHERE post_status='publish' (AND id=xxx)
    • Query 2:
      • SELECT post_id id, tag_label AS 'tag:taggedWithTag'
      • FROM post2tag INNER JOIN tag ON (post2tag.tag_id=tag.tag_id)
      • (WHERE id=xxx)
    • Query 3:
      • SELECT post_id id, category_id AS 'belongsToCategory->category'
      • FROM post2cat
      • (WHERE id=xxx)
    • Legend: a column alias of the form 'property->type' (e.g. 'sioc:has_creator->user') yields an object property; an alias of the form 'property^^datatype' (e.g. 'dcterms:created^^xsd:dateTime') yields a datatype property.
  • 30. RDF Conversion
    • Query results:
      • (1) id=1, post_author=5, post_title=New DBpedia release, post_content=Today we released …, post_date=200810201635, post_modified=200810201635
      • (2) id=1, tag:taggedWithTag=DBpedia; id=1, tag:taggedWithTag=Release; ..
      • (3) id=1, belongsToCategory=34; …
    • Resulting triples:
      • (1) http://blog.aksw.org/triplify/post/1 sioc:has_creator http://blog.aksw.org/triplify/user/5
      • (1) http://blog.aksw.org/triplify/post/1 dc:title “New DBpedia release”
      • (1) http://blog.aksw.org/triplify/post/1 sioc:content “Today we released …”
      • (1) http://blog.aksw.org/triplify/post/1 dcterms:modified “20081020T1635”^^xsd:dateTime
      • (1) http://blog.aksw.org/triplify/post/1 dcterms:created “20081020T1635”^^xsd:dateTime
      • (2) http://blog.aksw.org/triplify/post/1 tag:taggedWithTag “DBpedia”
      • (2) http://blog.aksw.org/triplify/post/1 tag:taggedWithTag “Release”
      • (3) http://blog.aksw.org/triplify/post/1 belongsToCategory http://blog.aksw.org/triplify/category/34
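The conversion rules above (first column as instance id, column aliases as properties, `->` for links and `^^` for datatypes) can be sketched in Python with an in-memory SQLite database. This is an illustration of the idea, not Triplify's actual PHP implementation: the function `rows_to_triples`, the reduced `posts` table and the omission of literal escaping are all assumptions made to keep the example short.

```python
import sqlite3

BASE = "http://blog.aksw.org/triplify/"


def rows_to_triples(cursor, path):
    """Turn a query result into (subject, predicate, object) tuples.

    Column 0 is the instance id; each later column name is the property.
    'prop->type' marks a link to another instance, 'prop^^dt' a typed literal.
    Literal escaping is omitted for brevity.
    """
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        subject = f"<{BASE}{path}/{row[0]}>"
        for name, value in zip(columns[1:], row[1:]):
            if value is None:
                continue
            if "->" in name:
                prop, target = name.split("->", 1)
                obj = f"<{BASE}{target}/{value}>"
            elif "^^" in name:
                prop, datatype = name.split("^^", 1)
                obj = f'"{value}"^^{datatype}'
            else:
                prop, obj = name, f'"{value}"'
            triples.append((subject, prop, obj))
    return triples


# Hypothetical, reduced version of the blog's posts table.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, post_author INTEGER, "
    "post_title TEXT, post_status TEXT)"
)
db.execute("INSERT INTO posts VALUES (1, 5, 'New DBpedia release', 'publish')")

cursor = db.execute(
    'SELECT id, post_author AS "sioc:has_creator->user", post_title AS "dc:title" '
    "FROM posts WHERE post_status = 'publish'"
)
triples = rows_to_triples(cursor, "post")

if __name__ == "__main__":
    for triple in triples:
        print(" ".join(triple), ".")
```

The design point is that the mapping lives entirely in the SQL aliases, so no separate mapping language is needed.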
  • 31. Example Config
    • <?php
      include('../wp-config.php');
      // Namespace prefixes used in the column aliases below
      $triplify['namespaces'] = array(
          'vocabulary' => 'http://triplify.org/vocabulary/Wordpress/',
          'foaf' => 'http://xmlns.com/foaf/0.1/',
          // …
      );
      // One entry per URL path fragment; each entry holds one or more SQL queries
      $triplify['queries'] = array(
          'post' => array(
              "SELECT id, post_author 'sioc:has_creator->user', post_date 'dcterms:created', post_title 'dc:title', post_content 'sioc:content', post_modified 'dcterms:modified' FROM {$table_prefix}posts WHERE post_status='publish'",
              "SELECT post_id id, tag_id 'tag:taggedWithTag' FROM {$table_prefix}post2tag",
              "SELECT post_id id, category_id 'belongsToCategory' FROM {$table_prefix}post2cat",
          ),
          'tag' => "SELECT tag_ID id, tag 'tag:tagName' FROM {$table_prefix}tags",
          'category' => "SELECT cat_ID id, cat_name 'skos:prefLabel', category_parent 'skos:narrower' FROM {$table_prefix}categories",
          'user' => array(
              "SELECT id, user_login 'foaf:accountName', SHA(CONCAT('mailto:', user_email)) 'foaf:mbox_sha1sum', user_url 'foaf:homepage', display_name 'foaf:name' FROM {$table_prefix}users",
              "SELECT user_id id, meta_value 'foaf:firstName' FROM {$table_prefix}usermeta WHERE meta_key='first_name'",
              "SELECT user_id id, meta_value 'foaf:family_name' FROM {$table_prefix}usermeta WHERE meta_key='last_name'",
          ),
          'comment' => "SELECT comment_ID id, comment_post_id 'sioc:reply_of', comment_author AS 'foaf:name', SHA(CONCAT('mailto:', comment_author_email)) 'foaf:mbox_sha1sum', comment_author_url 'foaf:homepage', comment_date AS 'dcterms:created', comment_content 'sioc:content', comment_karma, comment_type FROM {$table_prefix}comments WHERE comment_approved='1'",
      );
      // Properties whose values link to instances of another type
      $triplify['objectProperties'] = array(
          'sioc:has_creator' => 'user', 'tag:taggedWithTag' => 'tag', 'belongsToCategory' => 'category',
          'skos:narrower' => 'category', 'sioc:reply_of' => 'post');
      // RDF class assigned to the instances of each path fragment
      $triplify['classMap'] = array('user' => 'foaf:Person', 'post' => 'sioc:Post', 'tag' => 'tag:Tag', 'category' => 'skos:Concept');
      $triplify['TTL'] = 0; // Caching
      $triplify['db'] = new PDO('mysql:host='.DB_HOST.';dbname='.DB_NAME, DB_USER, DB_PASSWORD);
      ?>
  • 32. Triplify Temporal Extension
    • Problem: How do next-generation search engines know that something changed on the Data Web?
    • Different solutions:
      • Always try to crawl everything: currently deployed on the Web
      • Ping a central update notification service: PingTheSemanticWeb.com – will probably not scale if the Data Web gets really deployed
      • Each linked data endpoint publishes an update log: Triplify Update Logs
  • 33. Triplify Temporal Extension
    • http://example.com/Triplify/update
      • http://example.com/Triplify/update/2007 rdf:type update:UpdateCollection .
      • http://example.com/Triplify/update/2008 rdf:type update:UpdateCollection .
    • http://example.com/Triplify/update/2008
      • http://example.com/Triplify/update/2008/Jan rdf:type update:UpdateCollection .
      • http://example.com/Triplify/update/2008/Feb rdf:type update:UpdateCollection .
    • Nesting continues until we finally reach a URL that exposes all updates performed in a certain second in time:
    • http://example.com/Triplify/update/2008/Jan/01/17/58/06
      • http://example.com/Triplify/update/2008/Jan/01/17/58/06/user123
        • update:updatedResource http://example.com/Triplify/users/JohnDoe ;
        • update:updatedAt "20080101T17:58:06"^^xsd:dateTime ;
        • update:updatedBy http://example.com/Triplify/users/JohnDoe .
    • Uses a special update path and vocabulary.
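The nested update path can be derived mechanically from a timestamp. The sketch below builds the chain of collection URLs from the year down to the second. It is a hedged illustration: the function name `update_urls` is hypothetical, and the month abbreviations beyond Jan and Feb are an assumption, since the slide shows only those two.

```python
from datetime import datetime

BASE = "http://example.com/Triplify/update"

# Only "Jan" and "Feb" appear on the slide; the remaining names are assumed.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def update_urls(moment):
    """Return the nested update collection URLs, from the year to the second."""
    parts = [
        f"{moment.year:04d}",
        MONTHS[moment.month - 1],
        f"{moment.day:02d}",
        f"{moment.hour:02d}",
        f"{moment.minute:02d}",
        f"{moment.second:02d}",
    ]
    return [BASE + "/" + "/".join(parts[:depth]) for depth in range(1, len(parts) + 1)]


if __name__ == "__main__":
    for url in update_urls(datetime(2008, 1, 1, 17, 58, 6)):
        print(url)
```

A crawler that last visited in January 2008 would only need to descend into the collections at or after that point instead of re-crawling the whole endpoint.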
  • 34. Triplify Spatial Extension
    • How to publish geo-data using Triplify?
    • OpenStreetMaps – 160 GB of geo data, lots of POIs – hotels, gas stations, universities …
    • http://LinkedGeoData.org/near/48.213056,16.359722/1000/Hotel
      • http://LinkedGeoData.org/point/212331
      • http://LinkedGeoData.org/point/944523
      • http://LinkedGeoData.org/point/234091
    • URL components after /near/: latitude,longitude (48.213056,16.359722), radius (1000) and tag (Hotel)
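A radius query of this kind boils down to building the URL and, on the server side, filtering points by distance. The Python sketch below shows both steps under stated assumptions: the radius is taken to be in metres (the slide does not state the unit), the function names are hypothetical, and the haversine great-circle formula stands in for whatever distance computation the actual service uses.

```python
import math

BASE = "http://LinkedGeoData.org"
EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def near_url(lat, lon, radius, tag):
    """Build a /near/ URL following the pattern shown on the slide."""
    return f"{BASE}/near/{lat},{lon}/{radius}/{tag}"


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(center, point, radius_m):
    """True if point (lat, lon) lies within radius_m metres of center (lat, lon)."""
    return haversine_m(center[0], center[1], point[0], point[1]) <= radius_m


if __name__ == "__main__":
    print(near_url(48.213056, 16.359722, 1000, "Hotel"))
```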
  • 35. RDB2RDF tool comparison
    • Tool | Triplify | D2RQ | Virtuoso RDF Views
    • Technology | Scripting languages (PHP) | Java | Whole middleware solution
    • SPARQL endpoint | - | X | X
    • Mapping language | SQL | RDF based | RDF based
    • Mapping generation | Manual | Semi-automatic | Manual
    • Scalability | Medium-high (but no SPARQL) | Medium | High
    • More at: http://esw.w3.org/topic/Rdb2RdfXG/StateOfTheArt
  • 36. Marrying DBs with RDF & Ontologies
    • Two directions: using DBs for storage and querying of RDF & ontologies, and publishing DB content as RDF
    • Comparison of Relational Databases and RDF & Ontologies:
      • Data model: relational (tables, columns, rows) vs. triples (subject, predicate, object)
      • Further criteria (rating symbols not preserved in the transcript): schema and data separation, implicit information, scalability, schema flexibility, Web data integration readiness
  • 37. Overview
    • The Linked Data Web Vision
    • Data Web Technologies
    • Publishing relational data on the Web
    • DBpedia – transforming Wikipedia into a knowledge base
    • OntoWiki – a Linked Data Wiki
    • Open Street Maps – free and open geo data
  • 38. Transforming Wikipedia into a Knowledge base
    • ☺ Wikipedia is the 8th most popular website (according to Alexa.com)
    • ☺ Maybe the finest example of truly collaboratively created content (>8M articles in >200 languages written by >300,000 authors)
    • ☺ Covers all possible topics and domains, articles are a result of a “community consensus”
    • Θ Many inconsistencies can be found on different pages/language versions
    • Θ Not very well integrated with other data sources
    • Θ Lacks structured representations of content which facilitate querying and search
    • Simple questions that are hard to answer:
      • What do Art Nouveau and Berlin have in common?
      • Who are the mayors of central European towns at an elevation of more than 1000 m?
      • Which films are longer than 4 hours and had a budget of less than $1 million?
    • The information required to answer these is contained in Wikipedia!
    • How can we reveal the structure and semantics of Wikipedia content?
  • 39. Structure in Wikipedia
    • Title
    • Abstract
    • Infoboxes
    • Geo-coordinates
    • Categories
    • Images
    • Links
      • other language versions
      • other Wikipedia pages
      • To the Web
      • Redirects
      • Disambiguations
  • 40. Infobox templates
    • Wikitext syntax:
      • {{Infobox Korean settlement
      • | title = Busan Metropolitan City
      • | img = Busan.jpg
      • | imgcaption = A view of the [[Geumjeong]] district in Busan
      • | hangul = 부산 광역시
      • ...
      • | area_km2 = 763.46
      • | pop = 3635389
      • | popyear = 2006
      • | mayor = Hur Nam-sik
      • | divs = 15 wards (Gu), 1 county (Gun)
      • | region = [[Yeongnam]]
      • | dialect = [[Gyeongsang]]
      • }}
    • RDF representation (http://dbpedia.org/resource/Busan):
      • dbp:Busan dbpp:title "Busan Metropolitan City"
      • dbp:Busan dbpp:hangul "부산 광역시"@Hang
      • dbp:Busan dbpp:area_km2 "763.46"^^xsd:float
      • dbp:Busan dbpp:pop "3635389"^^xsd:int
      • dbp:Busan dbpp:region dbp:Yeongnam
      • dbp:Busan dbpp:dialect dbp:Gyeongsang
      • ...
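The mapping from infobox attributes to triples can be sketched with a small line-based parser. This is a deliberately naive illustration and not the DBpedia extraction framework: the function `infobox_to_triples` and its typing heuristics (integer, decimal, single wiki link, otherwise plain literal) are assumptions chosen to reproduce the example above.

```python
import re

# Matches a value that consists of exactly one wiki link, e.g. [[Yeongnam]].
LINK = re.compile(r"^\[\[([^\]|]+)\]\]$")

SAMPLE = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
}}"""


def infobox_to_triples(wikitext, subject):
    """Convert '| key = value' lines of an infobox into triples.

    Heuristics (assumed for this sketch): a lone wiki link becomes a resource,
    integers and decimals become typed literals, everything else a plain literal.
    """
    triples = []
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue
        key, _, value = line[1:].partition("=")
        key, value = key.strip(), value.strip()
        if not value:
            continue
        link = LINK.match(value)
        if link:
            obj = "dbp:" + link.group(1).replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):
            obj = f'"{value}"^^xsd:float'
        else:
            obj = f'"{value}"'
        triples.append((subject, "dbpp:" + key, obj))
    return triples


if __name__ == "__main__":
    for triple in infobox_to_triples(SAMPLE, "dbp:Busan"):
        print(" ".join(triple))
```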
  • 41. Class Hierarchy
    • 200k people (70k athletes, 65k artists, 18k office holders)
    • 193k places (100k areas, 40k cities, 10k rivers)
    • 187k works (71k music albums, 24k singles, 31k films, 15k books)
    • 87k species
    • 70k organisations (20k educational institutions, 18k companies, 12k radio stations)
    • 22k buildings (8k airports, 5k stations, 2k stadiums, 1k bridges)
    • 12k planets
    • And more… (events, diseases, proteins, drugs, aircraft, automobiles, ships, astronauts, architects, scientists)
  • 42. Extraction results
    • Extraction algorithm run on the English Wikipedia content (http://dumps.wikimedia.org/enwiki)
    • <1h needed to extract templates and convert them to RDF (>2M English Wikipedia articles, >10GB raw data)
    • roughly 30M facts extracted from infobox templates alone
    • Sample checks reveal: ~90% accuracy, 9% redundant information, 1% erroneous
    • multi-domain ontology covering a large body of domains
    • extraction results and source code of the extraction algorithm available at http://dbpedia.org
    • Dataset (en) | Triples
      • Articles | 7.6M
      • Abstracts | 2.1M
      • External Links | 3.2M
      • Categories | 7.3M
      • Infoboxes | 29.3M
      • Persons | 560k
      • Yago Classes | 2M
      • Wordnet Classes | 338k
      • Geo-coordinates | 450k
      • Mapping to Flickr, DBLP, Eurostat, CIA-Factbook, Musicbrainz, Project Gutenberg, US Census, … | 100k
      • Mapping to OpenCyc | 45k
  • 43. DBpedia Components
    • [Figure: component diagram]
      • Wikipedia dumps (article texts, DB tables) feed the extraction (infobox, articles, categories, …)
      • The resulting DBpedia datasets are loaded into Virtuoso and MySQL
      • Published via SPARQL endpoint and Linked Data
      • Used by Query Builder, SNORQL Browser, traditional Web browsers, Web 2.0 mashups, Semantic Web browsers
      • Interlinked with other open data: OpenCyc, Wordnet, Freebase, Geonames, …
  • 44. User Interfaces
  • 45. DBpedia SPARQL Endpoint (1)
    • http://dbpedia.org/sparql
    • hosted on an OpenLink Virtuoso server
    • can answer SPARQL queries like
      • Give me all sitcoms that are set in NYC.
      • All tennis players from Moscow?
      • All films by Quentin Tarantino?
      • All German musicians that were born in Berlin in the 19th century?
      • All soccer players with jersey number 11, playing for a club having a stadium with over 40,000 seats, who were born in a country with over 10 million inhabitants?
  • 46. DBpedia SPARQL Endpoint (2)
    • SELECT ?name ?birth ?description ?person WHERE {
    • ?person dbp:birthPlace dbp:Berlin .
    • ?person skos:subject dbp:Cat:German_musicians .
    • ?person dbp:birth ?birth .
    • ?person foaf:name ?name .
    • ?person rdfs:comment ?description .
    • FILTER (LANG(?description) = 'en') .
    • } ORDER BY ?name
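Sending such a query to an endpoint follows the SPARQL protocol: the query text goes into the `query` parameter of an HTTP GET request. The Python sketch below only constructs the request and does not send it. It assumes that the endpoint predefines the prefixes used in the query (`dbp:`, `skos:`, `foaf:`, `rdfs:`), and the helper names `build_request` and `decoded_query` as well as the chosen Accept media type are illustrative choices.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"

# Query text from the slide; prefixes are assumed to be predefined by the endpoint.
QUERY = """SELECT ?name ?birth ?description ?person WHERE {
  ?person dbp:birthPlace dbp:Berlin .
  ?person skos:subject dbp:Cat:German_musicians .
  ?person dbp:birth ?birth .
  ?person foaf:name ?name .
  ?person rdfs:comment ?description .
  FILTER (LANG(?description) = 'en') .
} ORDER BY ?name"""


def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def decoded_query(request):
    """Recover the query text from a request URL, e.g. for logging."""
    return parse_qs(urlsplit(request.full_url).query)["query"][0]


request = build_request(QUERY)

if __name__ == "__main__":
    print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would execute the query; that step is left out here so the example runs without network access.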
  • 47. Overview
    • The Linked Data Web Vision
    • Data Web Technologies
    • Publishing relational data on the Web
    • DBpedia – transforming Wikipedia into a knowledge base
    • OntoWiki – a Linked Data Wiki
    • Virtuoso – Knowledge Store
    • Open Street Maps – free and open geo data
  • 48. OntoWiki
    • Semantic Wiki
    • Differences
    • Similarities
    • Architecture
    • Use Cases
  • 49. Semantic Wiki
    • Wiki with added semantics
    • Goal: Wiki pages + background knowledge base
    • Examples: Semantic MediaWiki , Rhizome, IkeWiki
  • 50. Conceptual Differences: Views over Articles (Wiki articles vs. resource views)
  • 51. Conceptual Differences: Forms over Code (Wiki code vs. forms)
  • 52. Conceptual Similarities: Wikiwiki Concepts
    • Everyone can edit anything
    • Content is edited in the same way as structure is
    • Activity can be watched and reviewed by everyone
    Ward Cunningham
  • 53. Versioning
    • Everything can be undone
    • Philosophy: make it easy to correct mistakes
  • 54. OntoWiki Application Framework: Interfaces
    • SPARQL Endpoint
    • Linked Data Endpoint
    • WebDAV
    • REST API
    • Command Line Interface
    • LDAP
  • 55. Extensibility
    • Plugins
    • Views/Templates
    • Themes
    • Localizations
  • 56. Access Control
    • Model-based
    • Action-based
    • (Statement-based)
  • 57. Other Features
    • Facet-based browsing
    • Inline editing
    • Auto-adaptive user interface
    • Resource auto-suggestion
    • SPARQL Query Editor
  • 58. Architecture
  • 59. Vision
    • Generic data wiki for RDF models
      • no data model mismatch (structured vs. unstructured)
    • Application framework for:
      • Knowledge-intensive applications
      • Agile processes
      • Distributed user groups
  • 60. SoftWiki*
    • Problem: Requirements Engineering with large, spatially distributed stakeholder groups
    • Solution: comprehensive ontology for representing RE-relevant knowledge + adapted OntoWiki application
    • Application of text-mining methods for duplicate detection
    • * Work in a BMBF-funded project with UniDuE, T-Systems, QA-Systems, LeCoS, ProDV
  • 61. Linked Data Tutorial
  • 62. Caucasian Spiders
    • Faunistic database on spiders of the Caucasus
    • Taxonomy
    • Localities
    • 240k triples
  • 63. Linked Data Tutorial
  • 64. Professor Catalogue
    • Professor catalogue with 800 entries and 60 schema elements
    • OntoWiki used as backend for data entry
    • Custom front-end
  • 65. Linked Data Tutorial
  • 66. Linked Data Tutorial
  • 67. Semantic Wikis: Related Work
    • Criterion | OntoWiki | Semantic MediaWiki | IkeWiki
    • Main developer | Uni Leipzig AKSW | AIFB Karlsruhe | Salzburg Research
    • Technology | PHP/MySQL | PHP/MySQL (MediaWiki extension) | Java/Postgres
    • Base artifacts | Facts | (annotated) texts | (annotated) texts
    • Authoring | WYSIWYG facts / forms | Wiki syntax / semantic forms | WYSIWYG / forms
    • Other | Data Web development framework | Planned Wikipedia deployment | Visual KB browser
  • 68. Vakantieland*
    • One of the largest tourist information sites in NL (>100,000 daily page views, >20,000 points of interest)
    • Traditional relational DB system was too inflexible to capture the increasingly heterogeneous content types
    • Development of an OntoWiki based Data Web application
      • Geo-data integration from OpenStreetMaps
      • Semantic search
      • Integration of DBpedia data
      • Comprehensive performance tuning
    • * Work with Ceriel Jakobs, Michael Martin; partially funded by SenterNovem
  • 69. Overview
    • The Linked Data Web Vision
    • Data Web Technologies
    • Publishing relational data on the Web
    • DBpedia – transforming Wikipedia into a knowledge base
    • OntoWiki – a Linked Data Wiki
    • Open Street Maps – linked open geo data
  • 70. Linked Open Geo Data
    • Spatial data is crucial for the Data Web in order to interlink geographically linked resources.
    • Open Street Map project (OSM) collects, organizes and publishes geo data the wiki way:
      • 80,000 OSM users collected data about 22M km of ways (roads, highways etc.) on earth; 25,000 km are added daily
      • OSM contains a vast amount of point-of-interest descriptions, e.g. shops, amenities, sports venues, businesses, touristic and historic sights.
    • Goal: publish OSM geo data, interlink it with other data sources and provide efficient means for browsing and authoring:
      • Open Street Map data extraction works on the basis of OSM database dumps; a bi-directional live integration of OSM and our Linked Geo Data browser and editor is currently in the works.
      • Triplify spatial data publishing: the Triplify script for publishing linked data from relational databases is extended for publishing geo data, in particular with regard to the retrieval of information about geographical areas.
      • The Linked Geo Data browser and editor is a facet-based browser for geo content, which uses an OLAP-inspired hypercube for quickly retrieving aggregated information about any user-selected area on earth.
  • 71. Faceted Linked-Geo-Data Browser
  • 72. AKSW Linked Data Web Building Blocks
    • Layers: Foundations (marrying databases with RDF and ontologies), Tools, Applications (bringing the Data Web to end users)
    • DBpedia: “Semantification” of Wikipedia
    • Triplify: “Semantification” of (small) Web applications
    • OntoWiki: collaborative creation of explicit knowledge via Semantic Wikis
    • OWLDB: extending DBs for ontology handling / revealing implicit information
    • Vakantieland: building Data Web applications
    • SoftWiki: distributed, stakeholder-driven Requirements Engineering
    • RDF Query Subsumption & View Maintenance
    • Scaling database backed Triple Stores
    • xOperator: combining Instant Messaging with the Data Web
    • OpenResearch.org: a semantic Wiki for the sciences …
    • DL-Learner: Machine Learning for Ontologies
  • 73. Thanks!
    • Dr. Sören Auer
    • [email_address]
    • Research group Agile Knowledge Engineering & Semantic Web (AKSW): http://aksw.org
    • http://Triplify.org
    • http://DBpedia.org
    • http://OntoWiki.net
    • http://OpenResearch.org
    • http://aksw.org/projects/xOperator
    • DL-Learner.org
    • Cofundos.org