A Virtuous Cycle of Semantic Enhancement with DBpedia Spotlight - SemTech Berlin 2012

Presented at SemTech Berlin 2012

Wikipedia is one of the most important repositories of human knowledge, containing millions of interlinked articles. The DBpedia project extracts and combines Wikipedia information into a large multilingual knowledge base that enables semantic processing in a wide range of applications. We have built DBpedia Spotlight, a tool that recognizes ambiguous terms in text and automatically assigns unambiguous definitions to those terms by connecting them to DBpedia. Such interconnection enriches information by providing explicit semantic relationships, enabling semantic indexing and faceted exploration, among other data processing enhancements. In this talk we will describe how DBpedia Spotlight can be applied to establish a virtuous cycle of semantic enhancement. On the one hand, it can enhance knowledge interconnectivity in document collections. On the other hand, it learns how to annotate better from user feedback. Such a positive feedback loop can be applied to Wikipedia itself, or in enterprises to alleviate the cold start problem and reduce knowledge management costs.

    Presentation Transcript

    • A Virtuous Cycle of Semantic Enhancement with DBpedia Spotlight
      Pablo N. Mendes, Christian Bizer
      pablo.mendes@fu-berlin.de
      Web Based Systems Group, Freie Universität Berlin
      SemTechBiz Berlin, February 6th 2012
    • Agenda
      FREIE UNIVERSITÄT BERLIN, SemTechBiz Berlin, February 2012, http://wbsg.de
      • What do we mean by semantic enhancement?
      • How does DBpedia Spotlight work?
      • From Wikipedia to DBpedia Spotlight
      • From DBpedia Spotlight to Wikipedia
      • In your project: can you also enable a virtuous cycle?
      Pablo N. Mendes, Christian Bizer: A Virtuous Cycle of Semantic Enhancement with DBpedia Spotlight
    • Semantic Enhancement?
      • Generally:
        – Making something easier to understand
      • For humans:
        – Say what you mean (reduce ambiguity)
        – Make associations
        – Access to definitions, background
      • For machines:
        – The same, but in structured format
    • Semantic Enhancement (Example) http://nyti.ms/qsYAyt
      • News Annotation
        – Links to “topics”
        – Topic pages lead to related content
      • Semantic Enhancement
        – Links text to unique identifiers
        – Adds background information
        – Interconnects related content
    • A Virtuous Cycle of Semantic Enhancement
    • DBpedia Spotlight
      • DBpedia is a collection of entity descriptions extracted from Wikipedia & shared as linked data
      • DBpedia Spotlight uses data from DBpedia and text from associated Wikipedia pages
      • Learns how to recognize that a DBpedia resource was mentioned
      • Given plain text as input, generates annotated text
    • DBpedia Spotlight: Text Annotation
      • From: (…) Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
      • To: (…) Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
        – “New York” links to http://dbpedia.org/resource/New_York_City
        – “Apple Corps” links to http://dbpedia.org/resource/Apple_Corps
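
One way to picture the annotated output of this slide is as a list of (substring, offset, URI) records over the original text. The sketch below is only an illustration under that assumption: the `Annotation` class and the `render` function are hypothetical names, not part of DBpedia Spotlight, and the bracketed rendering is an arbitrary choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    surface_form: str  # substring of the text that was annotated
    offset: int        # character offset of that substring in the text
    uri: str           # DBpedia resource the substring refers to


def render(text, annotations):
    """Return the text with each annotated substring followed by its URI in brackets."""
    out, cursor = [], 0
    for a in sorted(annotations, key=lambda a: a.offset):
        end = a.offset + len(a.surface_form)
        out.append(text[cursor:end])   # plain text up to the end of the mention
        out.append(f" [{a.uri}]")      # the identifier attached to the mention
        cursor = end
    out.append(text[cursor:])          # remaining text after the last mention
    return "".join(out)


TEXT = ("Upon their return, Lennon and McCartney went to New York "
        "to announce the formation of Apple Corps.")

ANNOTATIONS = [
    Annotation("New York", TEXT.index("New York"),
               "http://dbpedia.org/resource/New_York_City"),
    Annotation("Apple Corps", TEXT.index("Apple Corps"),
               "http://dbpedia.org/resource/Apple_Corps"),
]
```

Calling `render(TEXT, ANNOTATIONS)` yields the sentence with both URIs inlined after their mentions. Keeping offsets rather than rewriting the text means the original document stays untouched and annotations can be stored separately.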
    • Challenge: Term Ambiguity
      • ...this apple on the palm of my hand...
      • ...Apple tried to acquire Palm Inc....
      • ...eating an apple sitting next to a palm tree...
      • What do “apple” and “palm” mean in each case?
      • Our objective is to recognize entities/topics and disambiguate their meaning, generating DBpedia annotations in text.
    • Stage 1: Spotting
      • Find substrings that seem worthy of annotation
        Input: (…) Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
        Output: “Lennon”, “McCartney”, “New York”, “Apple Corps”
      • Simplest approach relies on a dictionary of known entity names.
        – Other: Named Entity Recognition, Keyphrase Extraction, ...
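
The dictionary-based approach named on this slide can be sketched as a longest-match lookup over word n-grams. This is a minimal illustration, not the tool's actual spotter: the `LEXICON` contents and the `spot` function are hypothetical, and matching is case-sensitive for simplicity.

```python
import re

# Hypothetical dictionary of known entity names (surface forms).
LEXICON = {"Lennon", "McCartney", "New York", "New York City", "Apple", "Apple Corps"}
MAX_WORDS = max(len(name.split()) for name in LEXICON)


def spot(text, lexicon=LEXICON, max_words=MAX_WORDS):
    """Return (surface form, character offset) pairs, preferring the longest match."""
    tokens = [(m.group(), m.start()) for m in re.finditer(r"\w+", text)]
    spots, i = [], 0
    while i < len(tokens):
        # Try the longest n-gram starting at token i first, then shorter ones.
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            start = tokens[i][1]
            last_word, last_start = tokens[i + n - 1]
            candidate = text[start:last_start + len(last_word)]
            if candidate in lexicon:
                spots.append((candidate, start))
                i += n  # skip past the matched words
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return spots


TEXT = ("Upon their return, Lennon and McCartney went to New York "
        "to announce the formation of Apple Corps.")
```

On the slide's input, `spot(TEXT)` finds “Lennon”, “McCartney”, “New York” and “Apple Corps”. Preferring the longest match is what keeps “Apple Corps” from being spotted as just “Apple”.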
    • Stage 2: Candidate Mapping
      • Find possible meanings for each of the spotted substrings.
        Input (spotted names): “Lennon”, “McCartney”, “New York”, “Apple Corps”
        Output (candidate map):
          “Lennon”: { Lennon_(album), Lennon,_Michigan, … }
          “McCartney”: { McCartney(surname), Paul_McCartney, … }
          “New York”: { New_York_State, New_York_City, … }
          “Apple Corps”: { Apple_Corps }
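
A candidate map like the one on this slide can be held as a dictionary from surface form to counted resources. The sketch below assumes the map is built from (surface form, resource) pairs such as link anchors and their targets; that assumption, the toy `LINKS` data, and the function names are illustrative only.

```python
from collections import Counter, defaultdict

# Hypothetical (surface form, resource) pairs, e.g. link text and link target.
# Resource names are copied from the slides; the counts are invented.
LINKS = [
    ("Lennon", "John_Lennon"), ("Lennon", "John_Lennon"),
    ("Lennon", "Lennon_(album)"), ("Lennon", "Lennon,_Michigan"),
    ("McCartney", "Paul_McCartney"), ("McCartney", "Paul_McCartney"),
    ("McCartney", "McCartney(surname)"),
    ("New York", "New_York_City"), ("New York", "New_York_City"),
    ("New York", "New_York_State"),
    ("Apple Corps", "Apple_Corps"),
]


def build_candidate_map(links):
    """Count how often each surface form was used to refer to each resource."""
    cmap = defaultdict(Counter)
    for surface_form, resource in links:
        cmap[surface_form][resource] += 1
    return cmap


def candidates(surface_form, cmap):
    """Return the candidate resources for a surface form, most frequent first."""
    return [resource for resource, _ in cmap.get(surface_form, Counter()).most_common()]


CANDIDATE_MAP = build_candidate_map(LINKS)
```

Here `candidates("New York", CANDIDATE_MAP)` lists `New_York_City` before `New_York_State`, and an unknown name returns an empty list. Keeping the counts gives a simple prior that the disambiguation stage could combine with context.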
    • Stage 3: Disambiguation
      • Select the correct candidate DBpedia Resource for a given substring.
      • Decision is made based on the context (1) in which the substring was mentioned
      (1) context, n.: the parts of a discourse that surround a word or passage and can throw light on its meaning. http://mw1.merriam-webster.com/dictionary/context
    • Learning the Context for a resource
      (…) Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
      • Collect context for DBpedia Resources from all articles in Wikipedia
        e.g. Co-occurrence Statistics: John_Lennon = {John:981, Beatles:320, McCartney:100, ...}
      • Types of context
        – Wikipedia Pages
        – Definitions from disambiguation pages
        – Paragraphs that link to resources
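
To make the role of co-occurrence statistics concrete, the sketch below picks the candidate whose word profile is most similar to the words around the mention. Plain cosine similarity is swapped in here for illustration and is not claimed to be the scoring DBpedia Spotlight uses. The `John_Lennon` counts follow the slide; the other profiles and all function names are invented.

```python
import math
import re
from collections import Counter

# Hypothetical co-occurrence profiles. John_Lennon follows the slide,
# the other two are made up for the example.
PROFILES = {
    "John_Lennon": Counter({"John": 981, "Beatles": 320, "McCartney": 100}),
    "Lennon_(album)": Counter({"album": 400, "box": 120, "set": 110}),
    "Lennon,_Michigan": Counter({"Michigan": 500, "village": 200, "county": 150}),
}


def cosine(a, b):
    """Cosine similarity between two bags of words (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def disambiguate(context, candidates, profiles=PROFILES):
    """Return the candidate whose profile best matches the words of the context."""
    words = Counter(re.findall(r"\w+", context))
    return max(candidates, key=lambda r: cosine(words, profiles.get(r, Counter())))


TEXT = ("Upon their return, Lennon and McCartney went to New York "
        "to announce the formation of Apple Corps.")
```

With the slide's sentence as context, `disambiguate(TEXT, list(PROFILES))` selects `John_Lennon`, because “McCartney” appears both in the sentence and in that resource's profile. When every score is zero, `max` simply returns the first candidate, so a real system would want a fallback such as a frequency prior.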
    • DBpedia Spotlight http://spotlight.dbpedia.org/demo
      Freely available Web Service; Open Source, Java/Scala, Apache V2 License.
    • A Virtuous Cycle
    • The “Suggest” Button
      User decides to add a link
    • The “Suggest” Button
      System suggests targets, user chooses
    • Sztakipedia http://pedia.sztaki.hu
      • Developed by Mihály Héder et al. at MTA SZTAKI (Hungarian Academy of Sciences)
      • Adds a toolbar to Wikipedia that can use DBpedia Spotlight to suggest links
        – Also suggests Categories, Infoboxes, Books
      • Helps editors to refine knowledge in Wikipedia
        – More interconnections, more entity types, more structured data!
    • Sztakipedia (screenshots) http://pedia.sztaki.hu
      Source: http://www.youtube.com/watch?v=8VW0TrvXpl4
    • Beyond Wikipedia? RDFaCE http://rdface.aksw.org/
      • Developed by Ali Khalili at U. Leipzig (AKSW)
      • Helps users to add RDFa markup via a WYSIWYG interface
      • Can use DBpedia Spotlight, among other services, to disambiguate entity names
      • Available as a Wordpress Plugin
        – Enables blogs as sources of context for disambiguation
    • RDFaCE http://rdface.aksw.org
    • A Virtuous Cycle in My Enterprise
      • Select a database of entity identifiers
      • Select textual sources that talk about those entities
      • Use semantic enhancement editors (with automatic suggestions) to annotate text
      • Use annotated text to re-train annotator
      DBpedia Spotlight is ready for MediaWiki XML and TSV. Other formats to come! Take a look at NIF http://nlp2rdf.org/nif-1-0
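
The last step of this cycle, re-training from annotated text, could look roughly like the sketch below: user-confirmed annotations arrive as TSV rows and are folded into per-resource context counts. The three-column layout (resource, surface form, context) is an assumption made for this example and is not the TSV format DBpedia Spotlight expects; `retrain` and `FEEDBACK` are hypothetical names.

```python
import csv
import io
import re
from collections import Counter, defaultdict

# Hypothetical feedback file: resource <TAB> surface form <TAB> context sentence.
FEEDBACK = (
    "Apple_Corps\tApple Corps\t"
    "Lennon and McCartney went to New York to announce the formation of Apple Corps\n"
)


def retrain(profiles, tsv_text):
    """Add the context words of each confirmed annotation to its resource profile."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed lines
        resource, _surface_form, context = row
        profiles[resource].update(re.findall(r"\w+", context))
    return profiles


def new_profiles():
    """Create an empty profile store: resource name -> Counter of context words."""
    return defaultdict(Counter)
```

Running `retrain(new_profiles(), FEEDBACK)` produces a profile for `Apple_Corps` that now contains “Lennon” and “McCartney”, so later mentions in a similar context score higher. Each confirmed annotation only increments counts, which means feedback can be applied incrementally as editors work.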
    • Semantic Enhancement Marketplace
      • User-generated annotations are valuable crowdsourced knowledge
      • Can be used as currency:
        – “sweat for web service provision”
        – b2b partnerships
      • Example: RoboTagger.com
        – entity annotation service (in German)
        – entity types are not fixed (crowd-sourced)
        – users rewarded with more access to web service
    • Conclusion
      • Unstructured information (text) and structured information (e.g. RDF)
        – Mutually dependent and beneficial
      • DBpedia Spotlight sits on the border of two worlds:
        – From Wikipedia, an automatic annotator
        – From the auto-annotator, a more interconnected Wikipedia
      • Fosters a semantic enhancement ecosystem!
    • Thank you!
      On Twitter: @pablomendes
      E-mail: pablo.mendes@fu-berlin.de
      Web: http://pablomendes.com
      http://slideshare.net/pablomendes
      • Special thanks to Mihály Héder and Iavor Jelev for many fruitful discussions
      • DBpedia Spotlight is partially funded by LOD2.eu
      http://spotlight.dbpedia.org
    • Links
      • Download
        – DBpedia: http://dbpedia.org/Downloads37/
        – DBpedia Spotlight: http://sourceforge.net/projects/dbp-spotlight/
        – RDFaCE
          • http://code.google.com/p/rdface/
          • Wordpress plugin: http://wordpress.org/extend/plugins/rdface/