Nature Inspired Models And The Semantic Web


Published in: Technology, Education

Stefan Ceriu, Stefan Prutianu, Faculty of Computer Science, „Al. I. Cuza“ University, Iasi, Romania { stefan.ceriu, stefan.prutianu }

Abstract. In this paper we present a series of nature inspired models used as alternative solutions for Semantic Web concerns. Some of the methods presented in this article perform better than classic algorithms by improving response time and reducing computational costs. Others are just proofs of concept, first steps towards new techniques that will improve their respective fields. The intricate nature of the Semantic Web calls for faster, more intelligent algorithms, and nature inspired models have proven to be well suited to such complex tasks.

Keywords: nature inspired models, semantic web, ontology, alignment, rdf, query, soft computing, genetic algorithms, artificial neural networks, kohonen, swarm intelligence

1 Introduction

The Semantic Web is a new paradigm for the Web in which semantic information is associated with existing data in order to make it accessible to machines. The goal is to allow autonomous agents to rapidly access content through searches based on meaning instead of classic syntactic methods.

Nature inspired methods and natural computing are software models that follow the steps of natural phenomena and are often used in solving complex problems. Models like Artificial Neural Networks and Genetic Algorithms are fast, reliable and scalable and have been successfully adopted in dealing with a wide range of software problems.

The large quantity of data available and the discrepancies between interpretations make the Semantic Web a very complex domain, for which new and more ingenious methods have to be devised in order for it to progress and improve. We will further investigate how Nature Inspired Models are being used in the current context of the Semantic Web and what advantages and changes they bring to this domain.
This paper is organized in two main chapters based on the classic classification of nature inspired models: Evolutionary Computing and Artificial Neural Networks. In each chapter we will present characteristic algorithms and their application in different aspects of the Semantic Web.

2 Evolutionary Computing

Evolutionary computing represents a collection of methods inspired by the Darwinian evolutionary system and natural models. Their main characteristic is that they adapt automatically to different problem constraints, and are thus able to discover and take advantage of instance-specific properties. By working directly with binary representations of the solutions and not requiring a mathematical model, evolutionary algorithms require fewer approximations and can return better results. Having been used successfully in optimization, machine learning and complex system design, evolutionary algorithms have also been applied to other domains with varying degrees of success. We will further look at how Evolutionary Computing can aid or solve issues in the context of the Semantic Web.

2.1 Optimizing Ontology Alignments by Using Genetic Algorithms [1]

Ontologies are systematic representations and specifications of a domain or of parts of it. They provide a common vocabulary on top of which we can define a world by the objects that it contains and the relations between them. One of the key features of ontologies is that anyone can model their own knowledge without being forced to respect any pre-established standards. The issue at hand is that it is very costly for organizations to reach a common denominator, and even if they do, the result will not be customized to the needs of every party involved. People will try to bring extensions and additions to the ontology, and errors and incompatibilities will arise.

Ontology alignment is a way by which we can find correspondences between the various modeled concepts, fix heterogeneity issues and use the ontologies as a whole. Although there are many techniques that specifically deal with ontology matching through data analysis, machine learning, language engineering etc. [2], the problem is complex and many of these algorithms cannot cope with the sheer amount of data available in some cases.
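
To make the notion of an alignment concrete, the following minimal sketch treats an alignment as a set of correspondences between concept names and scores it against a reference alignment using precision, recall and F-measure, the usual evaluation measures in ontology matching. The concept names, the reference alignment and the function name `evaluate_alignment` are invented for illustration and do not come from any of the cited papers.

```python
def evaluate_alignment(found, reference):
    """Return (precision, recall, f_measure) of a found alignment.

    Both arguments are collections of (concept_a, concept_b) correspondences.
    """
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical correspondences between two small ontologies.
reference = {("Car", "Automobile"), ("Person", "Human"), ("City", "Town")}
found = {
    ("Car", "Automobile"),
    ("Person", "Human"),
    ("City", "Country"),
    ("Road", "Street"),
}

precision, recall, f_measure = evaluate_alignment(found, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f-measure={f_measure:.2f}")
```

A matcher that returns many correspondences tends to gain recall at the expense of precision, which is why a single combined score such as the F-measure is convenient when alignments have to be ranked automatically.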

In their paper [1], Martinez-Gil et al. propose a new way to deal with this issue by implementing an ontology alignment solution based on genetic algorithms, a subclass of evolutionary computing. Their solution is thus able to search a high-dimensional space and provide an efficient mechanism for matching different sets of ontologies. As in any other genetic algorithm, this approach uses an encoding of the solution candidates and a fitness function which returns the quality of an individual. In this case several parameters are encoded into a single chromosome by using a function which converts bit representations into floating-point numbers between 0 and 1. The fitness function uses one of the measures returned by an alignment evaluation method (precision, recall or f-measure) and is capable of producing better end results by focusing on a single characteristic.

Unfortunately we have no more information on how exactly this method is implemented, but judging from the results that the authors present in their paper it performs as well as the other existing algorithms and has an advantage when working with large data sets, a strength that comes directly from working with this type of nature inspired model. According to the reported results, it reaches convergence in only five consecutive generations and finds the optimal alignment in most cases.

2.2 Genetic Algorithms for RDF Query Path Optimization [3]

Another issue present in the current context of the Semantic Web is that information is scattered and there is not yet an algorithm capable of efficiently querying multiple heterogeneous sources and returning more relevant results. The execution time of this type of algorithm is mainly determined by the order in which the various parts of the query are evaluated. Research in this field has resulted in the iterative improvement algorithm followed by simulated annealing. This is referred to as the two-phase optimization algorithm. We know that in some cases, like the circuit partitioning problem and the traffic routing problem, genetic algorithms perform better than simulated annealing [4]. This is the main reason why Alexander Hogenboom et al. propose a variant of the two-phase optimization algorithm in which the simulated annealing part is replaced by a genetic algorithm, the main goal being a faster response time. “Entirely new queries should be optimized and resolved real-time.” [3]

Large queries can be seen as a series of smaller queries composed by join operations. Optimizing the order in which these joins are performed directly improves the overall response time. The method presented in this article associates a cost with each join based on the cardinality of each operand. These costs directly influence the fitness function, as the solution with the lowest cost has the highest ranking. The chromosomes are encoded using a number encoding scheme for bushy trees [5] which is efficient and permits fast crossover operations. This algorithm joins concepts from an ordered list together, saving the result at the position of the first concept.
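
The ordered-list encoding described above can be sketched as follows. Each gene is a pair of positions `(i, j)` in a working list: the entries at those positions are joined, the result is stored at position `i`, and the entry at position `j` is removed. The cost model is an assumption made for this sketch, not the one used in [3]: a join is estimated to produce the product of the operand cardinalities scaled by a fixed selectivity, and the cost of a plan is the sum of these intermediate result sizes. The name `decode_and_cost` and the source names are hypothetical.

```python
def decode_and_cost(chromosome, cardinalities, selectivity=0.1):
    """Decode an ordered-list chromosome into a join tree and estimate its cost.

    chromosome    -- list of (i, j) position pairs with i != j
    cardinalities -- dict mapping each source name to its estimated size
    selectivity   -- assumed fraction of the cross product kept by a join
    """
    # Working list of (join tree, estimated cardinality) entries.
    work = [(name, size) for name, size in cardinalities.items()]
    total_cost = 0.0
    for i, j in chromosome:
        left_tree, left_size = work[i]
        right_tree, right_size = work[j]
        joined_size = left_size * right_size * selectivity
        total_cost += joined_size
        # The result replaces the first operand; the second one is removed.
        work[i] = ((left_tree, right_tree), joined_size)
        del work[j]
    tree, _ = work[0]
    return tree, total_cost


cardinalities = {"A": 1000, "B": 10, "C": 100, "D": 50}
left_deep = [(0, 1), (0, 1), (0, 1)]  # ((A join B) join C) join D
bushy = [(0, 1), (1, 2), (0, 1)]      # (A join B) join (C join D)

left_tree, left_cost = decode_and_cost(left_deep, cardinalities)
bushy_tree, bushy_cost = decode_and_cost(bushy, cardinalities)
print(left_tree, left_cost)
print(bushy_tree, bushy_cost)
```

Under this assumed cost model the bushy plan is cheaper than the left-deep one, so a fitness function that ranks chromosomes by ascending cost would prefer it. Because every gene only refers to positions in the current working list, any sequence of valid position pairs decodes to a valid join tree.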

After each iteration the positions of the joined concepts are added to the encoding of the current chromosome. The results presented indicate that the genetic algorithm can perform better than the two-phase algorithm when it comes to solution quality, consistency and execution time. The algorithm performs better as the complexity of the solution space increases, but when the solution space is simple, executing the query directly might be faster. This is yet another example in which nature inspired models outperform other algorithms and aid the development of the Semantic Web.

2.3 Anytime Query Answering in RDF through Evolutionary Algorithms [6]

As the Semantic Web is sometimes imperfect or too large to work with as a whole, answering queries through SPARQL might not produce the best results. An approximation-based method could prove useful when dealing with this kind of problem and provide better and faster results. Oren et al. [6] propose a technique of this type that they hope “will be useful in many applications and even essential for others”. Their evolutionary computing based solution encodes queries as sets of constraints and finds a solution by searching for the assignment which validates the implication between the query graph and the data graph. Although an exhaustive search for the assignment could be used, they resort to nature inspired methods like mutation and crossover and use the number of satisfied constraints, checked against Bloom filters, as the fitness function. The Bloom filters contain a compressed version of the data graph and are used because they provide fast, approximate access to the data and fit well with evolutionary techniques.

Four genetic operators are used: parent selection, recombination, mutation and survivor selection. Each one of them was selected after a series of experiments to determine which performs best on the given problem. Overall, this approach is faster at finding approximate answers to the queries, and given that the Semantic Web still has its problems, this type of approximate calculation could prove useful and even better than traditional methods.

2.4 Semantic Web Reasoning by Swarm Intelligence [7]

Swarm intelligence is a technique in which groups of agents work together towards reaching a common goal. Each of them follows simple rules, and although there is no centralized control they manage to self-organize and interact with each other and with the environment to reach some degree of global artificial intelligence. This technique is inspired by models found in nature like ant and termite colonies, bird flocks, fish schools etc.

Semantic Web reasoners should be able to access all the data available on the web and be capable of accessing it in any format (RDF, XML, OWL etc.), but as the data is constantly changing this goal might be hard to reach. Swarm intelligence is decentralized and self-organizing, and can therefore easily make use of new or modified data. It is also robust and scalable, which makes it a candidate technique for obtaining optimized reasoning performance for the Semantic Web. In their paper [7], K. Dentler et al. propose a new method of reasoning over the web in which a swarm of agents traverses an RDF graph and each one of the agents represents a reasoning rule.
The RDF graph is viewed as an interconnected network where each node is an object and each edge is a property. The reasoning rules are distributed among the individuals that walk the edges of the graph, each one of them applying one single rule. When an agent finds a path that satisfies the condition of its rule, it locally adds a new derived triple to the graph. Traditionally this kind of reasoning is implemented by indexing all the resulting triples and merging the results of multiple queries. In the method presented no indexing is used, a choice which leads to efficiency improvements but also to redundancy, as the agents sometimes have to follow unnecessary paths.

Even though the experiments in [7] are just a proof of concept, the basic idea that swarm intelligence can significantly improve the way Semantic Web reasoners work remains. The distributed nature, robustness and scalability of this nature inspired model, together with the results of these experiments, are reason enough to continue research in this direction.
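
The idea of agents that each carry one rule and add derived triples locally can be illustrated with a small sketch. It is a deliberate simplification under our own assumptions and not the system described in [7]: the graph is a plain in-memory set of `(subject, predicate, object)` tuples, every agent carries the same rule (transitivity of `rdfs:subClassOf`), and agents move along a random outgoing edge, restarting at a random node when they reach a dead end. The class and function names are hypothetical.

```python
import random

SUBCLASS_OF = "rdfs:subClassOf"


class TransitivityAgent:
    """Agent carrying one rule: rdfs:subClassOf is transitive."""

    def __init__(self, start, rng):
        self.position = start
        self.rng = rng

    def step(self, graph, nodes):
        """Apply the rule around the current node, then move on."""
        here = self.position
        parents = [o for (s, p, o) in graph if s == here and p == SUBCLASS_OF]
        derived = {
            (here, SUBCLASS_OF, o)
            for parent in parents
            for (s, p, o) in graph
            if s == parent and p == SUBCLASS_OF and o != here
        }
        graph.update(derived)  # derived triples are added locally, no index
        neighbours = sorted(o for (s, p, o) in graph if s == here)
        # Follow a random outgoing edge, or restart anywhere at a dead end.
        self.position = self.rng.choice(neighbours if neighbours else nodes)


def run_swarm(triples, n_agents=5, n_steps=200, seed=0):
    """Let a swarm of agents walk the graph and return the enriched graph."""
    rng = random.Random(seed)
    graph = set(triples)
    nodes = sorted({s for s, _, _ in graph} | {o for _, _, o in graph})
    agents = [TransitivityAgent(rng.choice(nodes), rng) for _ in range(n_agents)]
    for _ in range(n_steps):
        for agent in agents:
            agent.step(graph, nodes)
    return graph


triples = {
    ("A", SUBCLASS_OF, "B"),
    ("B", SUBCLASS_OF, "C"),
    ("C", SUBCLASS_OF, "D"),
}
enriched = run_swarm(triples)
for triple in sorted(enriched - triples):
    print("derived:", triple)
```

Every triple an agent adds follows from two existing triples, so the result is always sound. Completeness, however, depends on how long the swarm is allowed to walk, which mirrors the redundancy trade-off mentioned above: agents may revisit paths that yield nothing new.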

3 Artificial Neural Networks

Neural networks are inspired by the configuration of biological neural networks and are essentially a multi-layered hierarchic structure where any two processing units can communicate. Each neuron is represented by a node in the network and each node holds a primitive function. The way in which these functions are composed is strictly determined by the topology of the network. If weights are associated with each connection in the network, then different weight values produce different results.

Artificial neural networks have been successfully used in function approximation, pattern and sequence recognition, economic applications such as market trading and bankruptcy prediction, data processing, medical diagnosis and others. They are capable of adaptive learning, self-organization and real-time operation, and represent a reliable and robust system used in a vast number of disciplines. We will continue by studying applications of artificial neural networks for the Semantic Web and the way in which they have improved or changed this domain.

3.1 Ontology Matching Using an Artificial Neural Network to Learn Weights [10]

Similar to genetic algorithms, artificial neural networks are another nature inspired model which can be used to match ontologies. The majority of ontology alignment techniques are based on either rule-based or learning-based models, both of which have shortcomings. In the rule-based approach the schemas to be aligned are represented as graphs or trees, which require a significant number of traversals. The learning-based model requires much computational effort in order to train its learners. Both of these issues become more important when the system has to work with large schemas and dynamic environments. In addition to these efficiency disadvantages, the second model also requires human intervention for setting the weights of different aspects within the ontology.
This is an error-prone practice and requires effort in generating good, relevant data.

The authors of [10] propose a new artificial neural network based technique in which the weights mentioned above are learned from the ontology schema instead of being given a priori. The algorithm creates a three-dimensional vector, one dimension for each aspect of a concept taken into consideration (name, properties and relationships), and each aspect is compared using its own similarity method.

For concept names, the string values are compared directly and a value between 0 and 1 is returned as the degree of similarity. If the two strings are equal or synonyms (WordNet lookup) then the function returns 1. Otherwise the similarity is calculated using a simple equation involving the Levenshtein distance and the string length.

The similarity of concept properties is given by the number of properties matched between the two ontologies. Two properties are aligned when their data type is the same or their name similarity is above a preset value.

Similarity in relationships takes into consideration all the ancestors a concept can have, up to the root. It is assumed that the concepts of both ontologies are ultimately subclasses of “thing”. The overall relationship similarity is given by the maximal value found by pairwise comparing all the ancestors.
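
The name similarity step can be sketched as below. The exact equation used in [10] is not reproduced in this survey, so the normalization chosen here, one minus the edit distance divided by the length of the longer string, is an assumption. The WordNet lookup is replaced by a hypothetical `synonyms` argument holding pairs of names that should be treated as equivalent, and lowercasing the names before comparison is likewise our own choice.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def name_similarity(name_a, name_b, synonyms=frozenset()):
    """Return a similarity in [0, 1] between two concept names.

    synonyms -- set of frozenset({a, b}) pairs standing in for a WordNet lookup
    """
    a, b = name_a.lower(), name_b.lower()
    if a == b or frozenset((a, b)) in synonyms:
        return 1.0
    # Assumed normalization: edit distance relative to the longer name.
    longest = max(len(a), len(b))
    return 1.0 - levenshtein(a, b) / longest


synonyms = {frozenset(("car", "automobile"))}
print(name_similarity("Author", "author"))
print(name_similarity("Car", "Automobile", synonyms))
print(name_similarity("writer", "writes"))
```

Because the edit distance can never exceed the length of the longer string, the returned value always stays between 0 and 1, which keeps it comparable with the other two similarity measures.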

These similarity functions represent the input to a 3 by 1 neural network which calculates the overall equivalence of two concepts. The weights of the edges of this network are, at first, randomly generated, and then some concepts in the first ontology are manually matched to aid learning. The experiments made by the authors reveal that this approach has an 85% precision rate with minimal human intervention. The results are encouraging and provide evidence that artificial neural networks can be successfully used for aligning ontologies.

3.2 Text-Based Ontology Enrichment Using Hierarchical Self-organizing Maps [13]

Self-organizing maps are a type of artificial neural network trained using unsupervised learning. They were invented by Professor Teuvo Kohonen and are capable of transforming a multi-dimensional dataset into a lower-dimensional (usually two-dimensional) one. They are also called Kohonen Networks or Kohonen Maps. Input data for a self-organizing map is not labeled in any way but is instead clustered based on properties identified during the training process. Each neuron contains an item from the data set and has an associated weight vector which is adjusted during training. The advantage of Kohonen Networks is that they can accurately map the whole input space based on these weight vectors.

E. Chifu and A. Letia [13] present a way in which Kohonen Maps can be used as a means of enriching ontologies by extracting new concepts from domain-related documents. Data mined from these documents passes through a “symbolic-neural translation” phase, which produces the initial state of the neural network. In order for the network to correctly classify concepts, weight vectors are required for each neuron. These vectors are calculated based on how many occurrences a term has had during the parsing of the documents or based on a document category histogram.
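
The training loop of a self-organizing map can be sketched in a few lines. This is a flat map, not the hierarchical variant used in [13], and the decay schedule, Gaussian neighbourhood and term-count vectors are assumptions chosen for brevity. For each input vector the best matching unit is found and the weight vectors of that unit and of its grid neighbours are pulled towards the input.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid cell whose weight vector is closest to the input."""
    return min(weights, key=lambda cell: math.dist(weights[cell], vector))


def train_som(data, rows, cols, epochs=50, learning_rate=0.5, seed=0):
    """Train a small rectangular self-organizing map on a list of vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    initial_radius = max(rows, cols) / 2
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius shrink linearly over time.
        decay = 1.0 - epoch / epochs
        rate = learning_rate * decay
        radius = max(initial_radius * decay, 0.5)
        for vector in data:
            bmu = best_matching_unit(weights, vector)
            for cell, w in weights.items():
                grid_dist = math.dist(cell, bmu)
                if grid_dist <= radius:
                    influence = math.exp(-grid_dist ** 2 / (2 * radius ** 2))
                    for k in range(dim):
                        w[k] += rate * influence * (vector[k] - w[k])
    return weights


# Hypothetical normalized term-frequency vectors for four terms.
data = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.1, 0.9],
]
som = train_som(data, rows=2, cols=2)
for vector in data:
    print(vector, "->", best_matching_unit(som, vector))
```

After training, inputs with similar vectors tend to be mapped to the same or to neighbouring cells, which is the property an enrichment system can exploit to place a new term next to related concepts.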

Even though ontology enrichment systems are hard to compare, because most of them use different domains and ontologies, the results published for the method presented suggest that it is suitable for this kind of operation.

4 Conclusions

We have presented a series of nature inspired models, each with its advantages and innovations, that changed the way Semantic Web problems are solved. Some of the methods presented in this article performed better than classic algorithms by improving response time and reducing computational costs. Others were just proofs of concept, first steps towards new techniques that will improve their respective fields. Nature inspired models have proven to be useful in a relatively new domain, strengthening yet again the idea that they are strong, efficient models.

References

1. Martinez-Gil, J., Alba, E., Aldana-Montes, J.F.: Optimizing Ontology Alignments by Using Genetic Algorithms
2. Euzenat, J. et al.: D2.2.3: State of the art on ontology alignment
3. Hogenboom, A., Milea, V., Frasincar, F., Kaymak, U.: Genetic Algorithms for RDF Query Path Optimization
4. Kohonen, J.: A brief comparison of simulated annealing and genetic algorithm approaches
5. Steinbrunn, M., Moerkotte, G., Kemper, A.: Heuristic and Randomized Optimization for the Join Ordering Problem
6. Oren, E., Gueret, C., Schlobach, S.: Anytime Query Answering in RDF through Evolutionary Algorithms
7. Dentler, K., Schlobach, S., Guéret, C.: Semantic Web Reasoning by Swarm Intelligence
8. Bry, F., Marchiori, M.: Reasoning on the Semantic Web: Beyond Ontology Languages and Reasoners
9. Yang Liu, Passino, K.M.: Swarm Intelligence: Literature Overview
10. Huang, J., Dang, J., Vidal, J.M., Huhns, M.N.: Ontology Matching Using an Artificial Neural Network to Learn Weights
11. Bagheri Hariri, B., Abolhassani, H., Sayyadi, H.: A Neural Networks Based Approach for Ontology Alignment
12. Algergawy, A., Schallehn, E., Saake, G.: A Sequence-based Ontology Matching Approach
13. Chifu, E.S., Letia, I.A.: Text-Based Ontology Enrichment Using Hierarchical Self Organizing Maps