DBpedia - An Interlinking-Hub in the Web of Data
 

Presentation by Georgi Kobilarov about DBpedia at the DC-2008 Wikimedia Workshop on User Generated Metadata

    Presentation Transcript

    • An Interlinking-Hub in the Web of Data
      Georgi Kobilarov, Chris Bizer, Sören Auer, Jens Lehmann
      Freie Universität Berlin, Universität Leipzig
      Georgi Kobilarov, DBpedia at Dublin Core 2008
    • DBpedia
      • DBpedia.org is a community effort to
        • extract structured information from Wikipedia
        • make this information available on the Web under an open license
        • interlink the DBpedia dataset with other open datasets on the Web
      • Contributors
        • Freie Universität Berlin (Germany)
        • Universität Leipzig (Germany)
        • OpenLink Software (UK)
        • Linking Open Data Community (W3C SWEO)
    • Extracting Structured Information from Wikipedia
      • Wikipedia consists of
        • 11.2 million articles (2.5 million in English)
        • in 264 languages
        • monthly growth rate: 4%
      • Wikipedia articles contain structured information
        • infoboxes, which use a template mechanism
        • categorization of the article
        • images depicting the article's topic
        • links to external webpages
        • intra-wiki links to other articles
        • inter-language links to articles about the same topic in different languages
    • [Figure labels] Domain-specific data, Title, Images, Description, Languages, Infoboxes, Web Links, Categorization
    • Multi-Lingual Abstracts
      • The dataset contains a short and a long abstract for each concept.
      • Short abstracts
        • English: 2,490,000
        • German: 391,000
        • French: 383,000
        • Dutch: 284,000
        • Polish: 256,000
        • Italian: 286,000
        • Spanish: 226,000
        • Japanese: 199,000
        • Portuguese: 246,000
        • Swedish: 144,000
        • Chinese: 101,000
    • Infobox Extraction
      dbpedia:BBC p:network_name "British Broadcasting Corporation (BBC)"
      dbpedia:BBC p:country dbpedia:United_Kingdom
      dbpedia:BBC p:key_people dbpedia:Michael_Lyons
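
The triples on the Infobox Extraction slide can be produced from infobox wikitext roughly as follows. This is a minimal sketch, not the DBpedia extraction code: the sample wikitext, the `infobox_to_triples` helper, and the rule "wiki links become `dbpedia:` resources, everything else becomes a literal" are assumptions made for illustration.

```python
import re

# Matches [[Target]] and [[Target|label]] wiki links, capturing the target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def infobox_to_triples(subject, infobox_text):
    """Turn '| key = value' infobox lines into (subject, property, object) triples."""
    triples = []
    for line in infobox_text.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue  # skip the template header, the closing braces, and blanks
        key, value = line[1:].split("=", 1)
        prop = "p:" + key.strip()
        value = value.strip()
        links = WIKILINK.findall(value)
        if links:
            # Each linked article becomes a resource in the dbpedia: namespace.
            for target in links:
                resource = "dbpedia:" + target.strip().replace(" ", "_")
                triples.append((subject, prop, resource))
        elif value:
            # Plain text is kept as a quoted literal.
            triples.append((subject, prop, '"' + value + '"'))
    return triples


# Hypothetical wikitext, shaped to reproduce the example on the slide.
SAMPLE = """{{Infobox Network
| network_name = British Broadcasting Corporation (BBC)
| country = [[United Kingdom]]
| key_people = [[Michael Lyons]]
}}"""

triples = infobox_to_triples("dbpedia:BBC", SAMPLE)
for triple in triples:
    print(" ".join(triple))
```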
    • Accessing the DBpedia Dataset over the Web
      1. DB Dumps for Download
      2. SPARQL Endpoint
      3. Linked Data
    • The DBpedia SPARQL Endpoint
      • http://dbpedia.org/sparql
      • hosted on an OpenLink Virtuoso server
      • can answer SPARQL queries like
        • Give me all sitcoms that are set in NYC?
        • All tennis players from Moscow?
        • All films by Quentin Tarantino?
        • All German musicians that were born in Berlin in the 19th century?
        • All soccer players with shirt number 11 who play for a club whose stadium has over 40,000 seats and who were born in a country with over 10 million inhabitants?
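
One of the questions on the SPARQL Endpoint slide could be phrased as a query along these lines. This is a sketch under assumptions: the property name `p:director` and the resource name `dbpedia:Quentin_Tarantino` are guesses at how the data is modelled, and `build_request_url` is a hypothetical helper. The code only builds the GET URL (the `query` parameter comes from the SPARQL protocol) and does not contact the endpoint.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint address from the slide

# "All films by Quentin Tarantino?" The property and resource names below
# are assumptions about the dataset, not confirmed by the slides.
QUERY = """
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX p: <http://dbpedia.org/property/>
SELECT ?film WHERE {
  ?film p:director dbpedia:Quentin_Tarantino .
}
"""


def build_request_url(endpoint, query):
    """Encode the query text as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(url)
```

The resulting URL can be fetched with any HTTP client; the response format depends on the server's content negotiation.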
    • Linked Data
      • Use URIs as names for things.
      • Use HTTP URIs so that people can look up those names.
      • When someone looks up a URI, provide useful information.
      • Include links to other URIs so that they can discover more things.
    • URIs
      Wikipedia Article URI: http://en.wikipedia.org/wiki/BBC
      DBpedia Resource URI: http://dbpedia.org/resource/BBC
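
The correspondence between the two URIs on this slide amounts to swapping the prefix while keeping the article name. A minimal sketch, assuming the mapping is exactly that prefix swap for English Wikipedia articles; `wikipedia_to_dbpedia` is a hypothetical helper name.

```python
WIKIPEDIA_PREFIX = "http://en.wikipedia.org/wiki/"
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(article_uri):
    """Map an English Wikipedia article URI to the matching DBpedia resource URI."""
    if not article_uri.startswith(WIKIPEDIA_PREFIX):
        raise ValueError("not an English Wikipedia article URI: " + article_uri)
    # The article name is carried over unchanged.
    return DBPEDIA_PREFIX + article_uri[len(WIKIPEDIA_PREFIX):]


resource = wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/BBC")
print(resource)
```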
    • W3C Linking Open Data Project
      • Community effort to
        • publish existing open license datasets as Linked Data on the Web
        • interlink things between different data sources
    • LOD Datasets on the Web: May 2007
      • Over 500 million RDF triples.
    • LOD Datasets on the Web: April 2008
      • Over 2 billion RDF triples.
    • LOD Datasets on the Web: September 2008
    • Linking Enterprise Data
    • Structuring Wikipedia's Knowledge
      Currently under development:
      • Building a class hierarchy / ontology
      • Mapping Wikipedia templates to DBpedia classes
    • Class Hierarchy
      • Built from scratch
      • 170 classes
      • 900 properties
      • Structuring actual data, not modeling the world
      • No AI terminology, no "living thing" or "agent"
    • Template Mapping
      Class: TV Episode (Work)
      Wikipedia Templates:
      • Television Episode
      • UK Office Episode
      • Simpsons Episode
      • DoctorWhoBox
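
The template mapping on this slide can be pictured as a lookup table from template names to a class. This is only an illustrative sketch: the dictionary layout and the `class_chain` helper are hypothetical, and reading "(Work)" as the superclass of TV Episode is an assumption about the slide's notation.

```python
# Several Wikipedia templates map to one class (names taken from the slide).
TEMPLATE_TO_CLASS = {
    "Television Episode": "TV Episode",
    "UK Office Episode": "TV Episode",
    "Simpsons Episode": "TV Episode",
    "DoctorWhoBox": "TV Episode",
}

# Assumption: "TV Episode (Work)" means TV Episode is a subclass of Work.
SUPERCLASS = {"TV Episode": "Work"}


def class_chain(template_name):
    """Return the mapped class followed by its superclasses, or [] if unmapped."""
    chain = []
    current = TEMPLATE_TO_CLASS.get(template_name)
    while current is not None:
        chain.append(current)
        current = SUPERCLASS.get(current)
    return chain


print(class_chain("Simpsons Episode"))
```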
    • Parsers
      Handle template values specifically
      Example: Property splitting
      Person born "1.1.1980, [[Berlin]]" => split into
      • birthplace: Berlin
      • birthdate: 1980-01-01
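
The property-splitting example can be sketched as below. The `split_born` helper and its regular expressions are hypothetical, and the day.month.year reading of the date is an assumption (the slide's "1.1.1980" does not disambiguate the order).

```python
import re
from datetime import date

# Assumption: dates are written day.month.year, e.g. "1.1.1980".
DATE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")
# Matches [[Target]] and [[Target|label]] wiki links, capturing the target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def split_born(value):
    """Split a combined 'born' value into birthdate and birthplace."""
    result = {}
    match = DATE.search(value)
    if match:
        day, month, year = (int(group) for group in match.groups())
        result["birthdate"] = date(year, month, day).isoformat()
    link = WIKILINK.search(value)
    if link:
        result["birthplace"] = link.group(1).strip()
    return result


born = split_born("1.1.1980, [[Berlin]]")
print(born)
```

A value with only one of the two parts yields only that key, so downstream code can tell a missing birthplace from an empty one.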
    • Parsers
      Example: Class Rules
      MusicalArtist
      • If property "currentMembers" is set => Group
      • Otherwise => Person
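
The class rule for MusicalArtist reduces to a single condition. A minimal sketch, assuming the extracted properties are available as a dictionary and that "set" means present and non-empty; `classify_musical_artist` is a hypothetical name.

```python
def classify_musical_artist(properties):
    """Apply the slide's rule: currentMembers set => Group, otherwise Person."""
    # Assumption: an empty or missing value counts as "not set".
    return "Group" if properties.get("currentMembers") else "Person"


band = classify_musical_artist({"currentMembers": ["member A", "member B"]})
solo = classify_musical_artist({"birthplace": "Berlin"})
print(band, solo)
```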
    • Parsers
      Example: Range Validation
      Google keypeople "[[Eric Schmidt]] ([[CEO]], [[Chairman]]), [[Sergey Brin]], [[Larry Page]]"
      Company#keyperson range Person#Class
      => Google keyperson: Eric Schmidt, Sergey Brin, Larry Page
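
Range validation, as in the Google example, keeps only those linked resources whose class matches the property's range. This is a sketch under assumptions: the `KNOWN_CLASSES` table (including the class given to CEO and Chairman) and the `validate_range` helper are hypothetical stand-ins for whatever class information the extractor actually has.

```python
import re

# Matches [[Target]] and [[Target|label]] wiki links, capturing the target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

# Hypothetical class assignments for the linked articles.
KNOWN_CLASSES = {
    "Eric Schmidt": "Person",
    "Sergey Brin": "Person",
    "Larry Page": "Person",
    "CEO": "Occupation",
    "Chairman": "Occupation",
}


def validate_range(value, range_class, class_of):
    """Return linked resources in 'value' whose class equals the property's range."""
    targets = [target.strip() for target in WIKILINK.findall(value)]
    return [target for target in targets if class_of.get(target) == range_class]


raw = "[[Eric Schmidt]] ([[CEO]], [[Chairman]]), [[Sergey Brin]], [[Larry Page]]"
key_people = validate_range(raw, "Person", KNOWN_CLASSES)
print(key_people)
```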
    • Class Hierarchy
      • 200k people (70k athletes, 65k artists, 18k office holders)
      • 193k places (100k areas, 40k cities, 10k rivers)
      • 187k works (71k music albums, 24k singles, 31k films, 15k books)
      • 87k species
      • 70k organisations (20k educational institutions, 18k companies, 12k radio stations)
      • 22k buildings (8k airports, 5k stations, 2k stadiums, 1k bridges)
      • 12k planets
      • And more… (events, diseases, proteins, drugs, aircraft, automobiles, ships, astronauts, architects, scientists)
    • Thanks
      http://dbpedia.org
      georgi.kobilarov@fu-berlin.de