NIF - NLP Interchange Format

Published in: Technology, Education

  1. Creating Knowledge out of Interlinked Data
     NIF – NLP Interchange Format
     Sebastian Hellmann, AKSW, Universität Leipzig
     LOD2 Presentation . 02.09.2010 . http://lod2.eu
  2. Problem:
     • Currently, NLP software is organized in pipelines.
     • Integration is "hard-wired":
       – An adapter has to be created for each tool and each framework (n*m).
     • It is difficult to exchange single components.
     Open Linguistics@OKCon 30.6.2011
  3. Overview:
     • NLP tools can be integrated via a common output format (a common pattern in Enterprise Application Integration).
     • For each tool, a wrapper needs to be created that reads NIF and produces NIF.
     • The combination of tools can be ad hoc, i.e. it is not a pipeline that needs to be configured.
     • Multi-layer and overlapping annotations are possible.
     • Ontologies provide interfaces for each layer and for applications.
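
The wrapper idea on this slide can be sketched roughly as follows. This is a minimal illustration, not the NIF implementation: the triples are toy data rather than real NIF, and the namespace `http://example.org/`, the property names, and the functions `tokenizer_wrapper`, `tagger_wrapper` and `run` are all hypothetical. Each wrapper reads a set of triples and returns it with its own annotations added, so the wrappers can be combined in any order without configuring a pipeline.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
Wrapper = Callable[[Graph], Graph]

EX = "http://example.org/"  # hypothetical namespace, not from the slides


def tokenizer_wrapper(graph: Graph) -> Graph:
    """Hypothetical wrapper: returns the input plus one token annotation."""
    added = {(EX + "doc#char=0,8", EX + "type", EX + "Token")}
    return graph | added


def tagger_wrapper(graph: Graph) -> Graph:
    """Hypothetical wrapper: returns the input plus one POS-tag annotation."""
    added = {(EX + "doc#char=0,8", EX + "posTag", "JJ")}
    return graph | added


def run(wrappers: Iterable[Wrapper], graph: Graph) -> Graph:
    """Apply the wrappers one after another; each reads and produces triples."""
    for wrapper in wrappers:
        graph = wrapper(graph)
    return graph


start: Graph = frozenset({(EX + "doc", EX + "text", "Semantic Web")})
result_ab = run([tokenizer_wrapper, tagger_wrapper], start)
result_ba = run([tagger_wrapper, tokenizer_wrapper], start)
print(result_ab == result_ba)
```

The order does not matter here only because these toy wrappers do not depend on each other's output; a real tagger might need the tokens first.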
  4. First challenge: Representing strings in RDF
     • How to give a part of a document or text an identifier (URI)?
     • What properties can such URIs have?
  5. [No text in transcript]
     LOD2 Event . 06.09.2010 . http://lod2.eu
  6. Example URIs for annotating "Semantic Web"
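
The example URIs from this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of one way to give a substring an identifier. It assumes a character-offset fragment in the style of RFC 5147 (`#char=begin,end`); the slides do not say which URI scheme NIF uses, and the document URI `http://example.org/doc.txt`, the sample sentence, and the helper `mint_string_uri` are hypothetical.

```python
def mint_string_uri(document_uri: str, text: str, substring: str) -> str:
    """Build a URI for the first occurrence of `substring` in `text`.

    Assumption: an RFC 5147-style character-range fragment is used to
    address the string; this is an illustration, not the scheme from the talk.
    """
    begin = text.find(substring)
    if begin == -1:
        raise ValueError(f"{substring!r} does not occur in the text")
    end = begin + len(substring)
    return f"{document_uri}#char={begin},{end}"


text = "The Semantic Web is a web of data."  # hypothetical sample document
uri = mint_string_uri("http://example.org/doc.txt", text, "Semantic Web")
print(uri)
```

Two tools that apply the same recipe to the same text arrive at the same URI, which is what lets their annotations refer to the same string.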
  7. First challenge: Representing strings in RDF
     • How to give a part of a document or text an identifier (URI)?
     • What properties can such URIs have?
  8. • URIs are used to integrate output. RDF merges naturally if the URIs are the same (or can be converted into each other using a certain recipe).
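
The merging claim can be illustrated with plain Python sets standing in for RDF graphs. This is only a sketch under assumptions: the subject URI, the property names, and the helpers `merge` and `describe` are hypothetical, and blank nodes, which need extra care in a real RDF merge, are ignored.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

STRING_URI = "http://example.org/doc.txt#char=4,16"  # hypothetical string URI


def merge(*graphs: Set[Triple]) -> Set[Triple]:
    """Merge graphs by set union; shared URIs line up automatically."""
    return set().union(*graphs)


def describe(graph: Set[Triple]) -> Dict[str, Dict[str, str]]:
    """Group triples by subject to show what is known about each URI."""
    by_subject: Dict[str, Dict[str, str]] = defaultdict(dict)
    for subject, predicate, obj in graph:
        by_subject[subject][predicate] = obj
    return dict(by_subject)


# Output of two independent, hypothetical tools about the same string.
stemmer_output = {(STRING_URI, "http://example.org/stem", "semant web")}
linker_output = {(STRING_URI, "http://example.org/means",
                  "http://dbpedia.org/resource/Semantic_Web")}

merged = merge(stemmer_output, linker_output)
print(describe(merged)[STRING_URI])
```

Because both tools used the same URI for the string, no adapter code is needed to combine their results.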
  9. • Second challenge: Output of each layer is required to be stable.
     • Components and layers can be interchanged.
     • OLiA provides an ontological interface.
  10. [No text in transcript]
  11. [No text in transcript]
  12. [No text in transcript]
  13. Workplan
     • EU deliverable almost finished
     • Integration of Snowball stemming and the Stanford Parser
     • Next step: integration of knowledge extraction tools (Zemanta, DBpedia Spotlight, Alchemy, OpenCalais)
     • Web service that reads NIF and outputs NIF
     • Google Code project: http://code.google.com/p/nlp2rdf/
  14. Future
     • NIF allows NLP output to be represented using knowledge representation formalisms (RDF/OWL).
     • It is possible to mix it with other knowledge (e.g. Wikipedia/DBpedia).
     • Good foundation for optimizing machine learning:
       – Choose the best algorithms
       – Choose the best data
  15. Reasons for Open Data
     • Horváth et al. (ILP 2009): "A Logic-Based Approach to Relation Extraction from Texts"
       – POS tags and dependency trees in first-order logic
       – ILP machine learning approach
     • TIDES Extraction (ACE) 2003 Multilingual Training Data
       – closed licence
       – about 3000 US $
     • Barrier to reproduction of results
     • Authors could send me a (p)(r)e-print, but not a copy of the benchmark™
  16. Thank you for your attention!
