CGW 2010 - NLPN


NLPN (NLP Negotiations) System presented at Cracow Grid Workshop 2010 conference in Kraków, Poland


  1. FRAMEWORK FOR INTELLIGENT VIRTUAL ORGANIZATIONS (FIVO)
     Natural Language based Processing of Multilingual Contracts for Virtual Organizations constitution
     Mikołaj Pastuszko, Bartosz Kryza, Renata Słota, Jacek Kitowski
     Institute of Computer Science, University of Science and Technology AGH, Kraków, POLAND
  2. Agenda
     • Background of the problem
     • Goals and requirements of NLPN system
     • Architecture of NLPN system
     • Main processing flow in NLPN system
     • Technologies and tools used in NLPN system
     • Example of contract text analysis in NLPN system
     • Future development proposals for NLPN system
  3. Problem introduction
     Assumption
     • Organizations own resources that are expected to be shared within a Virtual Organization
     • Conditions of cooperation are written down in the form of a contract document
     Problem
     • Contracts are written in natural language (e.g. Polish)
     • Automation of Virtual Organization management (FiVO) requires a formal, semantic form of the contract (an ontology in OWL format)
     Solution
     • NLP-based Negotiations (NLPN) System: translates natural-language contracts into ontologies in OWL format
  4. Concept of NLPN system
  5. Goals and requirements
     • Support for multiple languages
         • English and Polish as a starting point
         • Easily extendable with support for other languages
     • Output ontology in OWL format (FiVO requirement)
         • Ontology structure easily adjustable
     • Minimization of human (supervisor) assistance
     • Flexible mapping between text phrases and ontology entities
         • Human-readable and easily editable Contract Dictionary
     • Modularity
         • Easy orchestration for various applications
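The flexible mapping between text phrases and ontology entities can be pictured with a small sketch. The slides do not show the real Contract Dictionary, so every phrase, entity name, and the `map_phrases` helper below is a hypothetical illustration. It only shows how English and Polish phrases could resolve to the same ontology entities.

```python
# Hypothetical Contract Dictionary: per-language phrases mapped to ontology entities.
# All phrases and entity names are illustrative assumptions, not taken from NLPN.
CONTRACT_DICTIONARY = {
    "en": {
        "number of seats": "NumberOfSeats",
        "average velocity": "AverageVelocity",
        "greater than": "GreaterThan",
        "equal to": "EqualTo",
    },
    "pl": {
        "ilość miejsc siedzących": "NumberOfSeats",
        "prędkość średnią": "AverageVelocity",
        "ponad": "GreaterThan",
        "wynoszącą dokładnie": "EqualTo",
    },
}


def map_phrases(text, language):
    """Return (phrase, entity) pairs for dictionary phrases found in text, in text order."""
    lowered = text.lower()
    hits = []
    for phrase, entity in CONTRACT_DICTIONARY[language].items():
        position = lowered.find(phrase)  # first occurrence only, for brevity
        if position != -1:
            hits.append((position, phrase, entity))
    return [(phrase, entity) for _, phrase, entity in sorted(hits)]


english = map_phrases("expected average velocity greater than 60 km/h", "en")
polish = map_phrases("przewidywaną prędkość średnią ponad 60 km/h", "pl")
```

Because both languages map onto the same entity names, supporting a further language in this sketch means adding one more sub-dictionary rather than changing the lookup code.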
  6. Data flow in NLPN system
  7. Modular architecture of NLPN system
  8. Contract text analysis
     1. Tokenization
     2. Sentence Splitting
     3. Morphological Analysis and POS Tagging
     4. Named Entities Recognition
         ● Gazetteer
     5. Contract Statements Recognition
         ● Transducer + grammars
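The first steps of the analysis can be sketched in a few lines of plain Python. This is a toy stand-in, not the GATE and LanguageTool components that NLPN uses: the tokenizer and sentence splitter are naive, the gazetteer entries and entity types are assumptions, and morphological analysis is left out because it needs a real dictionary.

```python
import re

# Hypothetical gazetteer: entity names mapped to entity types (assumed for illustration).
GAZETTEER = {
    "Costa Rica Airlines": "Organization",
    "Mercedes-Benz H6": "Resource",
    "John Smith": "Person",
}


def tokenize(text):
    """Split text into words (keeping inner hyphens and slashes) and punctuation marks."""
    return re.findall(r"\w+(?:[-/]\w+)*|[^\w\s]", text)


def split_sentences(tokens):
    """Group tokens into sentences, closing a sentence at '.', '!' or '?'."""
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        if token in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


def find_named_entities(tokens):
    """Return (entity name, entity type, token index) for gazetteer entries found in tokens."""
    found = []
    for name, entity_type in GAZETTEER.items():
        name_tokens = tokenize(name)
        width = len(name_tokens)
        for start in range(len(tokens) - width + 1):
            if tokens[start:start + width] == name_tokens:
                found.append((name, entity_type, start))
    return sorted(found, key=lambda item: item[2])


text = ("Costa Rica Airlines should provide number of seats of Mercedes-Benz H6 "
        "equal to 54. A notification should be sent to John Smith.")
sentences = split_sentences(tokenize(text))
entities = [find_named_entities(sentence) for sentence in sentences]
```

The splitter would wrongly break on decimal numbers and abbreviations, which is one reason a production pipeline relies on dedicated, language-aware components.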
  9. Technologies and tools
     NLP tools
     • GATE – General Architecture for Text Engineering
         • Tokenizer and Gazetteer (ANNIE – A Nearly-New Information Extraction System)
         • OntoGazetteer
         • JAPE Transducer and JAPE grammars (JAPE – Java Annotation Patterns Engine)
     • LanguageTool
         • Sentence Splitter
         • Part-of-Speech Tagger
         • Disambiguator (tagger part)
         • Supports 20 languages including Polish (Morfologik library)
  10. Technologies and tools
     Ontologies
     • Jena Semantic Web Framework library
         • Supports read and write in RDF/XML, N3 and N-Triples formats
         • Provides API for OWL and RDF
     Configuration files
     • YAML format
     • SnakeYAML library
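To make the output side concrete, the sketch below writes one recognized statement as N-Triples, one of the formats listed above. NLPN itself uses the Jena library for this; the hand-written serializer here is only an illustration, and the `http://example.org/contract#` namespace, the `QoSStatement` class, and the property names are all assumptions, since the slides do not give the real contract ontology vocabulary.

```python
# Hypothetical namespace; the real FiVO contract ontology IRI is not given in the slides.
BASE = "http://example.org/contract#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def iri(local_name):
    """Wrap a local name in the assumed namespace as an N-Triples IRI reference."""
    return f"<{BASE}{local_name}>"


def to_ntriples(statement):
    """Serialize one recognized QoS statement (a plain dict) as N-Triples lines."""
    subject = iri(statement["id"])
    return "\n".join([
        f"{subject} <{RDF_TYPE}> {iri('QoSStatement')} .",
        f"{subject} {iri('provider')} {iri(statement['provider'])} .",
        f"{subject} {iri('property')} {iri(statement['property'])} .",
        f'{subject} {iri("value")} "{statement["value"]}"^^<{XSD_INTEGER}> .',
    ])


triples = to_ntriples({
    "id": "stmt1",
    "provider": "CostaRicaAirlines",
    "property": "NumberOfSeats",
    "value": 54,
})
```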
  11. Example: Contract text analysis
     QoS Statements
     Costa Rica Airlines should provide number of seats of Mercedes-Benz H6 equal to 54 and expected average velocity greater than 60 km/h.
     Security Statements
     Tour Manager and Client should be able to book seats on Costa Rica Service.
     Penalty Clauses
     In case of violation of Acela D45 trainset sharing conditions a notification should be sent to John Smith.

     Stwierdzenia QoS
     Costa Rica Airlines będzie świadczyć ilość miejsc siedzących dla Mercedes-Benz H6 wynoszącą dokładnie 54 i przewidywaną prędkość średnią ponad 60 km/h.
     Stwierdzenia bezpieczeństwa
     Tour Manager i Klient powinni być uprawnieni do rezerwowania miejsc poprzez Usługę Costa Rica.
     Klauzule kar umownych
     W przypadku niedotrzymania warunków świadczenia Acela D45 trainset powinno zostać wysłane powiadomienie do Johna Smitha.
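The English example sentences above can be used to illustrate contract statement recognition. NLPN performs this step with a JAPE transducer and grammars; the sketch below swaps in plain regular expressions, and both statement forms and the `recognize_statement` helper are hypothetical. It captures only the first condition of the QoS sentence and ignores the part after "and".

```python
import re

# Hypothetical statement forms, written as regular expressions instead of JAPE grammars.
STATEMENT_FORMS = {
    "QoS": re.compile(
        r"(?P<provider>.+?) should provide (?P<property>.+) of (?P<resource>.+?) "
        r"(?P<relation>equal to|greater than|less than) (?P<value>\d+)"
    ),
    "Security": re.compile(
        r"(?P<actors>.+?) should be able to (?P<action>.+?) on (?P<service>.+?)\.$"
    ),
}


def recognize_statement(sentence):
    """Return (statement kind, captured fields) for the first matching form, else None."""
    for kind, pattern in STATEMENT_FORMS.items():
        match = pattern.search(sentence)
        if match:
            return kind, match.groupdict()
    return None


qos = recognize_statement(
    "Costa Rica Airlines should provide number of seats of Mercedes-Benz H6 "
    "equal to 54 and expected average velocity greater than 60 km/h."
)
security = recognize_statement(
    "Tour Manager and Client should be able to book seats on Costa Rica Service."
)
```

Sentences that fit no form, such as the penalty clause, return `None` in this sketch; covering them would mean adding another entry to `STATEMENT_FORMS`.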
  12. Tokenization
  13. Sentence Splitting
  14. Morphological Analysis and POS Tagging
  15. Named Entities Recognition
  16. Contract Statements Recognition
  17. Contract Statements Recognition
  18. Summary
     NLPN system:
     • Translates natural-language contracts into the formal, semantic form of ontologies
     • Supports English and Polish
         • Easily extendable with other languages
     • Is modular
         • Easy to use in various applications
     • Is highly configurable
         • Contract Dictionary (including its structure)
         • Contract Ontology structure
         • Contract Statements forms
         • Configuration files for all components
     • Has broad perspectives for future development
  19. Future development
     • Distributed Negotiations Environment
         • Negotiations Console
     • More statement forms
     • Statistical-approach algorithms
     • Noise correction (typos etc.)
  20. The End. Thank you