Using ontology for natural language processing



Crăcăoanu Constantin Sergiu
January 21, 2012

Abstract

Natural language processing is a set of methods and techniques used to mediate human-machine communication. To make this possible, we have to define a communication format and software able to analyse and understand the input and give an appropriate response. At the communication level, a formal representation of knowledge is needed, and this can be provided by an ontology.

Keywords: natural language processing, ontology, artificial intelligence
1 Introduction

An ontology is a formal model for representing knowledge and is based on a conceptualization; the conceptualization of a knowledge area consists of the objects, concepts and other entities that are assumed to exist, together with the relations among them.

Depending on its purpose, context, coverage and the way it is used, an ontology can be general, middle or specific.

Natural language processing is considered a sub-field of artificial intelligence. Its main goal is to make systems smart enough to draw inferences and respond with a correct and complete answer when requested by a user. Using ontologies in natural language processing is a relatively new part of artificial intelligence.

An artificial neural network is a computational model inspired by biological neural networks. It is able to learn and is used to solve problems that need an answer based on the previous experience of the system.

2 Technical

The NLP view described in this article uses a combination of general and specific ontologies. Basically, there are two methods of creating an ontology: from scratch or from already existing ontologies. There are at least three ways of combining ontologies: inclusion, restriction and refinement.

Our approach has three parts:

  • a general ontology based on lexemes is needed. The Suggested Upper Merged Ontology (SUMO) is currently the best candidate because, together with its domain ontologies, it forms the largest formal public ontology in existence today and it is the only formal ontology that has been mapped to all of the WordNet lexicon. WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept.
  • a middle or specific level ontology must be used.
  • and a program that mediates between human language and machine language by using the two types of ontologies.

3 Architecture

The next diagram shows the relation between human language, ontologies and natural language processing.

Human knowledge is mapped to a middle or specific ontology. This ontology uses the general ontology when needed. The NLP program chooses the correct ontology for the current domain and applies the corresponding algorithm. Then the knowledge is translated into machine language. It is widely agreed today that NLP has not yet reached its goal of making machines understand human language by drawing inferences, but for now we receive an answer to our current request, and sometimes computers seem smart.

4 How to be used

This section describes two concrete cases of using ontologies with natural language processing.

The first example is using ontologies to translate automatically from one language to another (for example, from German to English). For this, five ontologies can be used:
  • an ontology mapped to the lexicon of language A
  • an ontology with the grammar rules for lexicon A
  • an ontology mapped to the lexicon of language B
  • an ontology with the grammar rules for lexicon B
  • an ontology containing a dictionary that maps language A to language B

These ontologies are merged to form a new ontology. An algorithm to combine and use the resulting ontology is also needed. The program that implements this algorithm takes a text in language A as input and produces the corresponding text in language B as output.

The resulting ontology can be completed by using an artificial neural network: after each training translation is performed, the system can learn new rules that must be taken into account.

A second example of using an ontology is an automatic speech recognition and generation system. Such a system can be used, for example, in the automotive industry or in banking services. One ontology is needed to generate two grammars, one for speech recognition and one for speech generation. The system puts questions to the user, gives or receives answers and executes commands.

5 How to extend ontologies

To completely define our solution for mediating human-machine communication, we also need to define a way of extending ontologies. According to [Jingshan Huang et al.], three elements must be determined to define the semantics of an ontology concept: the concept's name, its properties and its relationships.

The proposed solution for extending ontologies is based on an artificial neural network. Every ontology is represented by a directed graph G. Every graph lies in a plane of its own: nodes are connected horizontally within the same type of ontology (general, middle or specific), but the graphs are also connected vertically, since a specific ontology is based on a middle ontology and a middle ontology uses the general ontology.
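The layered graph described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the class name LayeredOntology, its methods and the sample concepts are hypothetical, and the rule that a vertical edge may only point from one plane to the next more general one is an assumption drawn from the statement that a specific ontology is based on a middle ontology, which in turn uses the general ontology.

```python
from dataclasses import dataclass, field

# Planes ordered from most general to most specific (hypothetical naming).
LEVELS = ("general", "middle", "specific")


@dataclass
class LayeredOntology:
    # concept name -> plane the concept lives in
    level_of: dict = field(default_factory=dict)
    # concept name -> set of (relation, target concept) directed edges
    edges: dict = field(default_factory=dict)

    def add_concept(self, name, level):
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self.level_of[name] = level
        self.edges.setdefault(name, set())

    def is_vertical(self, source, target):
        """True when the two concepts live in different planes."""
        return self.level_of[source] != self.level_of[target]

    def add_edge(self, source, relation, target):
        src = LEVELS.index(self.level_of[source])
        dst = LEVELS.index(self.level_of[target])
        # Assumption: horizontal edges stay inside one plane; vertical edges
        # go from a plane to the adjacent, more general plane only.
        if src != dst and src - dst != 1:
            raise ValueError(f"edge {source} -> {target} skips or descends a plane")
        self.edges[source].add((relation, target))

    def upward(self, name):
        """Collect the concepts reachable from `name` through vertical edges."""
        found, stack = [], [name]
        while stack:
            current = stack.pop()
            for _, target in sorted(self.edges[current]):
                if self.is_vertical(current, target) and target not in found:
                    found.append(target)
                    stack.append(target)
        return found


# Usage with made-up concepts.
onto = LayeredOntology()
onto.add_concept("Entity", "general")
onto.add_concept("Vehicle", "middle")
onto.add_concept("Car", "specific")
onto.add_concept("Engine", "specific")
onto.add_edge("Car", "hasPart", "Engine")  # horizontal, same plane
onto.add_edge("Car", "isA", "Vehicle")     # vertical, specific -> middle
onto.add_edge("Vehicle", "isA", "Entity")  # vertical, middle -> general
print(onto.upward("Car"))
```

Storing the plane next to each concept keeps the three graphs in a single structure while still letting a program tell horizontal edges from vertical ones.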
Figure 1: there are three planes, one for the general ontology, one for the middle ontologies and one for the specific ontologies

Graph description: $G = \{V_G, E_G\}$

$V_G$ is the set of nodes; every node has two views:

  • it represents a concept: it has a name
  • it is a perceptron: all inferences about this concept are represented by the formula $\sum_{i=1}^{n} x_i w_i c_k$, where $x_i$ is the value of the $i$-th input, $w_i$ is the weight of this input and $c_k$ is the context; $x_i, w_i \in [0, 1]$ and $c_k \in \{0, 1\}$; the learning rule is $w_i = w_i + [T - A] \cdot x_i$, where $T$ is the correct result that the neuron should have produced and $A$ is the actual output of the neuron; $c_k = 1$ only if there is a number of inferences $\geq \Theta$ that influence each other

$E_G$ is the set of edges; every edge represents a property or a relationship. Properties and relationships are the equivalent of inferences, which are grouped into subsets that influence each other.

Until now we have defined an artificial neural network, but the main purpose
is to be able to extend the ontology, and this is done by using training sessions. After each session the knowledge grows, and training can stop when the trained system is smart enough for a specified set of requirements.

6 Conclusions

Natural language processing can successfully use ontologies to mediate human-machine communication. The final goal of this research domain is to transform natural language processing into natural language understanding by the machine. A complete natural language understanding system must be able to:

  • Paraphrase an input text
  • Translate the text into another language
  • Answer questions about the contents of the text
  • Draw inferences from the text

The first three objectives have been accomplished to some extent, but the fourth remains only a concept that might become reality if NLP uses ontologies for constructing inferences.

References

[1] Dr. Elizabeth D. Liddy, Natural Language Processing, Encyclopedia of Library and Information Science, Second Edition, DOI: 10.1081/E-ELIS-120008664
[2] Dario Bianchi and Agostino Poggi, Ontology Based Automatic Speech Recognition and Generation for Human-Agent Interaction, University of Modena and Reggio Emilia, Italy, June 14-June 16, ISBN: 0-7695-2183-5
[3] Ru-Yng Chang, Chu-Ren Huang, Feng-Ju Lo, Sueming Chang, From General Ontology to Specialized Ontology: A study based on a single author historical corpus
[4] Jingshan Huang, Jiangbo Dang, Jose M. Vidal, Michael N. Huhns, Ontology Matching Using an Artificial Neural Network to Learn Weights
[5] Teaching Activity of Sabin-Corneliu Bura, ~busaco/teach