LarKC Tutorial at ISWC 2009 - Second Hands-on Scenario

The aim of the EU FP7 Large-Scale Integrating Project LarKC is to develop the Large Knowledge Collider (LarKC for short, pronounced “lark”), a platform for massive distributed incomplete reasoning that will remove the scalability barriers of existing reasoning systems for the Semantic Web. The LarKC platform is available at larkc.sourceforge.net. This is the first of two hands-on sessions that introduce participants to working directly with LarKC code.

Published in: Technology

  1. Cyc-Gate workflow
     Blaz Fortuna, Luka Bradesko
     Cycorp Europe, Slovenia
  2. Goal
     • Demonstrate reasoning over unstructured input data
     • Learn how to correctly annotate a new plug-in
     • Learn how to add a new plug-in to the platform
  3. External tools used
     • GATE
       • Information extraction framework
       • Used here to extract named entities from articles
     • ResearchCyc
       • Common-sense knowledge base
         • ~300,000 concepts, 1.3M assertions
       • Reasoning engine
  4. Pipeline diagram
     Query -> Identify -> Transform -> Select -> Reason -> Result
     External resources shown in the diagram: Internet, GATE, ResearchCyc
  5. Example
  6. Query
        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" .
          ?company cyc:isa cyc:PubliclyHeldCorporation
        }
  7. Identify
     • Find links to HTML documents and retrieve them using the ArticleIdentifier plug-in.
       • Returns a text document:
         http://shodan.ijs.si:8080/GateServer/news.txt
  8. Transform
     • Use GATE to extract organizations
       • Returns a SetOfStatements in this style:
        article-0 urn:hasUrl "http://shodan.ijs.si:8080/GateServer/news.txt"
        company-0 urn:nameString "Microsoft"
        company-0 urn:mentionedInArticle article-0
        company-1 urn:nameString "Ford"
        company-1 urn:mentionedInArticle article-0
        …
     Query:
        … ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" …
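The statement layout on the Transform slide can be sketched in a few lines of Java. This is only an illustration of the output shape, not the GateTransformer source: the class name `TransformSketch` and the method `toStatements` are hypothetical, and the organization names are assumed to have already been extracted (in the tutorial that step is done by GATE).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the Transform output; not the actual GateTransformer code.
class TransformSketch {

    // Builds "subject predicate object" lines in the style shown on the slide.
    // The organization names are assumed to come from an earlier extraction step.
    static List<String> toStatements(String articleUrl, List<String> organizations) {
        List<String> statements = new ArrayList<>();
        String article = "article-0";
        statements.add(article + " urn:hasUrl \"" + articleUrl + "\"");
        for (int i = 0; i < organizations.size(); i++) {
            String company = "company-" + i;
            statements.add(company + " urn:nameString \"" + organizations.get(i) + "\"");
            statements.add(company + " urn:mentionedInArticle " + article);
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String> statements = toStatements(
                "http://shodan.ijs.si:8080/GateServer/news.txt",
                Arrays.asList("Microsoft", "Ford"));
        for (String statement : statements) {
            System.out.println(statement);
        }
    }
}
```

Each organization gets its own `company-N` identifier and is linked to the article it was found in, which is what lets the later steps match the `cyc:mentionedInArticle` pattern of the query.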
  9. Select
     • Select only the companies with a corresponding concept in the ResearchCyc KB
       • company-0 -> #$MicrosoftInc
       • company-1 -> #$FordMotors
     • Replace URIs with Cyc concepts
       • cyc:mentionedInArticle -> #$mentionedInArticle
     • Output:
        #$MicrosoftInc #$mentionedInArticle #$article-0
        #$FordMotors #$mentionedInArticle #$article-0
        …
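A minimal sketch of the Select step, assuming the name-to-concept lookup is a plain in-memory map. In the tutorial the lookup is done against the ResearchCyc KB by the CycSelecter plug-in; the class `SelectSketch`, its `select` method, and the unmapped name "Acme" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Select step; the real lookup goes against ResearchCyc.
class SelectSketch {

    // Keeps only names that have a known concept and emits Cyc-style triples.
    static List<String> select(List<String> names,
                               Map<String, String> nameToConcept,
                               String articleId) {
        List<String> output = new ArrayList<>();
        for (String name : names) {
            String concept = nameToConcept.get(name);
            if (concept == null) {
                continue; // no corresponding concept: the company is dropped
            }
            output.add("#$" + concept + " #$mentionedInArticle #$" + articleId);
        }
        return output;
    }

    public static void main(String[] args) {
        // Assumed stand-in for the knowledge base lookup.
        Map<String, String> nameToConcept = new HashMap<>();
        nameToConcept.put("Microsoft", "MicrosoftInc");
        nameToConcept.put("Ford", "FordMotors");

        List<String> names = Arrays.asList("Microsoft", "Acme", "Ford");
        for (String triple : select(names, nameToConcept, "article-0")) {
            System.out.println(triple);
        }
    }
}
```

Dropping unmapped names at this stage keeps the reasoner from receiving triples about entities it knows nothing about.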
  10. Reason
     • Load the triples with Cyc concept names into the ResearchCyc KB
     • Transform the SPARQL query into a Cyc query
     • Execute and retrieve results
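The slides do not show how the CycReasoner rewrites the query, so the following is only a rough sketch of what such a translation could look like. It assumes the usual CycL conventions (constants prefixed with `#$`, upper-case `?VARIABLES`, predicate-first sentences joined with `#$and`); the class `CycQuerySketch` and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of a SPARQL-to-CycL rewrite; not the CycReasoner implementation.
class CycQuerySketch {

    // ?var -> ?VAR, cyc:Name -> #$Name, anything else (e.g. a quoted literal) unchanged.
    static String term(String t) {
        if (t.startsWith("?")) {
            return t.toUpperCase(Locale.ROOT);
        }
        if (t.startsWith("cyc:")) {
            return "#$" + t.substring("cyc:".length());
        }
        return t;
    }

    // A triple pattern (s, p, o) becomes the predicate-first sentence (p s o).
    static String pattern(String s, String p, String o) {
        return "(" + term(p) + " " + term(s) + " " + term(o) + ")";
    }

    // All patterns of the WHERE clause are joined into one conjunction.
    static String conjunction(List<String[]> patterns) {
        StringBuilder query = new StringBuilder("(#$and");
        for (String[] t : patterns) {
            query.append(" ").append(pattern(t[0], t[1], t[2]));
        }
        return query.append(")").toString();
    }

    public static void main(String[] args) {
        List<String[]> where = Arrays.asList(
                new String[] {"?company", "cyc:mentionedInArticle",
                        "\"http://shodan.ijs.si:8080/GateServer/news.txt\""},
                new String[] {"?company", "cyc:isa", "cyc:PubliclyHeldCorporation"});
        System.out.println(conjunction(where));
    }
}
```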
  11. Run the workflow on your computer!
     Main class: eu.larkc.core.Larkc
     VM arguments: -Xmx512m
  12. Run SPARQL client
     • On Windows:
       • Double-click SPARQLClient.jar
     • On Linux:
       • java -jar SPARQLClient.jar
  13. Run example query
     • Execute the query in the SPARQL client
     • Walk through the output of the program
     • Go through the plug-ins’ .java files
  14. Other interesting queries
        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" .
          ?company cyc:isa cyc:PubliclyHeldCorporation
        }

        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" .
          ?company cyc:isa cyc:SoftwareVendor
        }

        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .
          ?company cyc:isa cyc:SoftwareVendor
        }

        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" .
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .
          ?company cyc:isa cyc:Business
        }
  15. Other interesting queries
        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .
          ?company cyc:makesProductType cyc:CellularTelephone
        }

        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .
          ?company cyc:makesProductType cyc:CellularTelephone .
          ?company cyc:stockTickerSymbol ?ticker
        }

        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .
          ?program cyc:programAuthor ?company
        }

        PREFIX cyc: <http://www.cycfoundation.org/concepts/>
        SELECT ?company WHERE {
          ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .
          ?competitor cyc:competitors ?company .
          ?competitor cyc:makesProductType cyc:CellularTelephone
        }
  16. Plug-in SAWSDL description
        <wsdl:description>
          <!-- COMMON TO ALL IDENTIFIERS -->
          <wsdl:interface name="identifier"
              sawsdl:modelReference="http://larkc.eu/plugin#Identifier">
          </wsdl:interface>
          <wsdl:binding name="larkcbinding" type="http://larkc.eu/wsdl-binding" />
          <!-- SPECIFIC TO THIS IDENTIFIER -->
          <wsdl:service
              name="urn:eu.larkc.plugin.identify.article.ArticleIdentifier"
              interface="identifier"
              sawsdl:modelReference="http://larkc.eu/plugin#ArticleIdentifier">
            <wsdl:endpoint
                location="java:eu.larkc.plugin.identify.article.ArticleIdentifier" />
          </wsdl:service>
        </wsdl:description>
  17. Plug-in ontology
        @prefix larkc: <http://larkc.eu/plugin#> .
        @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
        @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

        larkc:ArticleIdentifier
            rdf:type rdfs:Class ;
            rdfs:subClassOf larkc:Identifier ;
            larkc:hasInputType larkc:SPARQLQuery ;
            larkc:hasOutputType larkc:NaturalLanguageDocument .
  18. Scripted decider
        // Assemble the pipeline in execution order: identify, transform, select, reason.
        Pipeline pipeline = new Pipeline();
        pipeline.addPlugIn(new URIImpl("urn:eu.larkc.plugin.identify.article.ArticleIdentifier"));
        pipeline.addPlugIn(new URIImpl("urn:eu.larkc.plugin.transform.gate.GateTransformer"));
        pipeline.addPlugIn(new URIImpl("urn:eu.larkc.plugin.select.cycselecter.CycSelecter"));
        pipeline.addPlugIn(new URIImpl("urn:eu.larkc.plugin.reason.cycreasoner.CycReasoner"));
        try {
            // Feed the SPARQL query into the first plug-in.
            pipeline.start(theQuery);
        } catch (Exception e) {
            // error
        }
        // Take the reasoner's result from the end of the pipeline.
        return (VariableBinding) pipeline.take();
  19. Write a new plug-in
     • Create a new project
       • New folder
       • Link the bin directory
       • Make a source directory
       • Add libraries
     • Prepare the code:
       • Copy GateTransformer.java
       • Rename the copy to SimpleNamedEntitiyExtractor
       • Insert the code available in SimpleNamedEntitiyExtractor.txt
     • Prepare/update the meta-data files
       • SimpleNamedEntitiyExtractor.wsdl
       • SimpleNamedEntitiyExtractor.rdf
     • Update CycGateDecider
     • Clean, build and run!
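The code in SimpleNamedEntitiyExtractor.txt is not reproduced in these slides. As a rough idea of what a simple named-entity extractor might do in place of GATE, here is a sketch that matches a fixed list of known company names against the article text. The class `GazetteerSketch`, its `extract` method, and the list-lookup approach are assumptions for illustration, not the tutorial's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical list-based extractor; the tutorial's own code is in SimpleNamedEntitiyExtractor.txt.
class GazetteerSketch {

    // Returns the known names that occur in the text as whole words, in list order.
    static List<String> extract(String text, List<String> knownNames) {
        List<String> found = new ArrayList<>();
        for (String name : knownNames) {
            // Pattern.quote keeps characters such as '.' in a name from acting as regex syntax.
            Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(name) + "\\b");
            if (wholeWord.matcher(text).find()) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Microsoft and Ford announced a partnership.";
        List<String> known = Arrays.asList("Microsoft", "Ford", "Nokia");
        System.out.println(extract(text, known));
    }
}
```

The names returned by such an extractor would then be turned into the same `urn:nameString` and `urn:mentionedInArticle` statements that the GATE-based transformer produces, so the rest of the pipeline stays unchanged.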
