1. FRAMEWORK FOR INTELLIGENT VIRTUAL ORGANIZATIONS (FIVO)
Natural Language based Processing of Multilingual Contracts for Virtual Organizations constitution
Mikołaj Pastuszko, Bartosz Kryza, Renata Słota, Jacek Kitowski
Institute of Computer Science, University of Science and Technology AGH
Kraków, POLAND
2. Agenda
Background of the problem
Goals and requirements of NLPN system
Architecture of NLPN system
Main processing flow in NLPN system
Technologies and tools used in NLPN system
Example of contract text analysis in NLPN system
Future development proposals for NLPN system
3. Problem introduction
Assumption
Organizations own resources that are expected to be shared within the Virtual Organization
Conditions of cooperation are written down in the form of a contract document
Problem
Contracts are written in natural language (e.g. Polish)
Automation of Virtual Organization management (FiVO) requires a formal and semantic form of the contract (an ontology in OWL format)
Solution
NLP-based Negotiations (NLPN) System:
Translating natural language based contracts to ontologies in OWL format
5. Goals and requirements
Support for multiple languages
English and Polish as a starting point
Easily extendable with support for other languages
Output ontology in OWL format (FiVO requirement)
Ontology structure easily adjustable
Minimization of human (supervisor) assistance
Flexible mapping between text phrases and ontology entities
Human-readable and easily editable Contract Dictionary
Modularity
Easy orchestration for various applications
8. Contract text analysis
1. Tokenization
2. Sentence Splitting
3. Morphological Analysis and POS Tagging
4. Named Entities Recognition
● Gazetteer
5. Contract Statements Recognition
● Transducer + grammars
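The first steps of this flow can be illustrated with a minimal, self-contained sketch. It is not the NLPN implementation (which relies on GATE and LanguageTool); it uses plain Python regular expressions, skips morphological analysis and POS tagging, and the gazetteer entries and function names are hypothetical, chosen only to match the example contract sentences.

```python
import re

# Hypothetical gazetteer entries (illustrative only, not taken from NLPN).
GAZETTEER = {
    "Costa Rica Airlines": "Organization",
    "Mercedes-Benz H6": "Resource",
    "Tour Manager": "Role",
    "Client": "Role",
}


def tokenize(text):
    """Step 1: words (keeping inner '-' and '/'), numbers, punctuation marks."""
    return re.findall(r"\w+(?:[-/]\w+)*|[^\w\s]", text)


def split_sentences(tokens):
    """Step 2: naive splitter that closes a sentence on '.', '!' or '?'."""
    sentences, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences


def find_entities(tokens, gazetteer=GAZETTEER):
    """Step 4: longest-match lookup of gazetteer phrases over a token list."""
    entries = {tuple(tokenize(p)): t for p, t in gazetteer.items()}
    max_len = max(len(k) for k in entries)
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in entries:
                found.append((" ".join(key), entries[key]))
                i += n
                break
        else:
            i += 1  # no phrase starts at this token
    return found


text = (
    "Costa Rica Airlines should provide number of seats of Mercedes-Benz H6 "
    "equal to 54 and expected average velocity greater than 60 km/h. "
    "Tour Manager and Client should be able to book seats on Costa Rica Service."
)
for sentence in split_sentences(tokenize(text)):
    print(find_entities(sentence))
```

The splitter is deliberately naive (it would break on abbreviations and decimal numbers); a real sentence splitter handles such cases with language-specific rules.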
9. Technologies and tools
NLP tools
GATE – General Architecture for Text Engineering
ANNIE – A Nearly-New Information Extraction System
Tokenizer
Gazetteer
OntoGazetteer
JAPE Transducer
JAPE grammars (JAPE – Java Annotation Patterns Engine)
LanguageTool
Sentence Splitter
Part-of-Speech Tagger
Disambiguator (tagger part)
Supports 20 languages including Polish (Morfologik library)
10. Technologies and tools
Ontologies
Jena Semantic Web Framework library
Supports read and write in RDF/XML, N3 and N-Triples formats
Provides API for OWL and RDF
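To make the N-Triples output format concrete, here is a small sketch of how a recognized contract fact could be serialized. It does not use Jena; it is plain Python string formatting, and the `http://example.org/contract#` namespace, the resource name and the `numberOfSeats` property are assumptions made for illustration, not part of the FiVO contract ontology.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/contract#"  # hypothetical namespace


def iri(namespace, local_name):
    """Format an IRI reference in N-Triples syntax."""
    return f"<{namespace}{local_name}>"


def literal(value, datatype=None):
    """Format a literal, escaping backslashes and double quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    text = f'"{escaped}"'
    return f"{text}^^<{datatype}>" if datatype else text


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = [
    (iri(EX, "MercedesBenzH6"), iri(RDF, "type"), iri(EX, "Resource")),
    (iri(EX, "MercedesBenzH6"), iri(EX, "numberOfSeats"),
     literal(54, XSD + "integer")),
]
print(to_ntriples(triples))
```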
Configuration files
YAML format
SnakeYAML library
11. Example: Contract text analysis
QoS Statements
Costa Rica Airlines should provide number of seats of Mercedes-Benz H6 equal to 54 and expected average velocity greater than 60 km/h.
Security Statements
Tour Manager and Client should be able to book seats on Costa Rica Service.
Penalty Clauses
In case of violation of Acela D45 trainset sharing conditions a notification should be sent to John Smith.
Stwierdzenia QoS
Costa Rica Airlines będzie świadczyć ilość miejsc siedzących dla Mercedes-Benz H6 wynoszącą dokładnie 54 i przewidywaną prędkość średnią ponad 60 km/h.
Stwierdzenia bezpieczeństwa
Tour Manager i Klient powinni być uprawnieni do rezerwowania miejsc poprzez Usługę Costa Rica.
Klauzule kar umownych
W przypadku niedotrzymania warunków świadczenia Acela D45 trainset powinno zostać wysłane powiadomienie do Johna Smitha.
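The QoS statement above, in both its English and Polish versions, can be used to sketch how a per-language phrase dictionary lets one recognizer handle several languages. This is only an illustration in plain Python: the `COMPARATORS` table stands in for a fragment of a Contract Dictionary, and its contents, the function name and the output format are assumptions rather than the actual NLPN grammars.

```python
import re

# Hypothetical dictionary fragment: comparison phrases per language.
COMPARATORS = {
    "en": {"equal to": "==", "greater than": ">"},
    "pl": {"wynoszącą dokładnie": "==", "ponad": ">"},
}


def extract_constraints(sentence, lang):
    """Return (operator, value, unit) for each comparison phrase found."""
    phrases = COMPARATORS[lang]
    # Longer phrases first so they win over shorter alternatives.
    alternation = "|".join(
        re.escape(p) for p in sorted(phrases, key=len, reverse=True)
    )
    # phrase, number, and an optional unit written with a slash (e.g. km/h)
    pattern = re.compile(
        rf"(?<!\w)({alternation})\s+(\d+(?:\.\d+)?)(?:\s+(\w+/\w+))?"
    )
    return [
        (phrases[m.group(1)], float(m.group(2)), m.group(3))
        for m in pattern.finditer(sentence)
    ]


english = (
    "Costa Rica Airlines should provide number of seats of Mercedes-Benz H6 "
    "equal to 54 and expected average velocity greater than 60 km/h."
)
polish = (
    "Costa Rica Airlines będzie świadczyć ilość miejsc siedzących dla "
    "Mercedes-Benz H6 wynoszącą dokładnie 54 i przewidywaną prędkość "
    "średnią ponad 60 km/h."
)
print(extract_constraints(english, "en"))
print(extract_constraints(polish, "pl"))
```

Both calls yield the same language-independent constraints, so adding a language in this sketch only means adding another dictionary entry, not another recognizer.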
18. Summary
NLPN system:
Translates natural language based contracts to formal and
semantic form of ontologies
Supports English and Polish
Easily extendable with other languages
Is modular
Easy to use in various applications
Is highly configurable
Contract Dictionary (including its structure)
Contract Ontology structure
Contract Statements forms
Configuration files for all components
Has broad prospects for future development