Knowledge extraction from numerical data an abc



International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 2, March – April (2013), pp. 01-09, © IAEME. Impact Factor (2013): 6.1302 (Calculated by GISI)

KNOWLEDGE EXTRACTION FROM NUMERICAL DATA: AN ABC BASED APPROACH

Lalit Kumar1, Dr. Dheerendra Singh2
1 M.Tech Scholar, Department of CSE, SUSCET, Tangori, Mohali, India
2 Professor and Head, Department of CSE, SUSCET, Tangori, Mohali, India

ABSTRACT

Fuzzy rule based systems provide a framework for representing and processing information in a way that resembles human communication and reasoning. Two approaches to rule base generation can be found in the literature. In knowledge driven models, the requisite rule base is provided by domain experts and knowledge engineers. In data driven models, the rule base is generated from available numerical data. Because domain experts are difficult to find, and extracting knowledge from experts is itself a difficult task, data driven modeling assumes significance. One has to apply a soft computing based methodology to generate the rule base from data. Neural networks, genetic algorithms and particle swarm optimization are some of the approaches [1]. The basic Artificial Bee Colony (ABC) algorithm has the advantages of strong robustness, fast convergence, high flexibility and few setting parameters, but it has the disadvantages of premature convergence in the later search period and an accuracy of the optimal value that sometimes cannot meet the requirements [4].

KEY-WORDS: Artificial Bee Colony Algorithm, Fuzzy Membership Function, Sugeno System, Rule Based Generation.

1. INTRODUCTION

Fuzzy systems are used to model highly complex and highly nonlinear systems, and under these circumstances the rule base extraction problem becomes an NP-hard problem. When the problem is very complex, applying classical methods turns out to be computationally very expensive. ABC is an example of how a natural process can be modeled to solve optimization problems [3]. The concept of a mathematical model is fundamental to system analysis and design, which requires representing systems as a functional dependence between interacting input and output variables. Conventionally, a mathematical model is constructed by analyzing input-output data from the system.
LITERATURE SURVEY

Since March 2012, Singh D. has been studying the solution of real optimization problems using a Genetic Algorithm with Employed Bee (GAEB). A multimodal function has two or more local optima. A function of several variables is separable if it can be rewritten as a sum of functions of just one variable. The search process for a multimodal function is difficult if the local optima are randomly distributed.

In 2011, Mustafa M. Noaman proposed the Artificial Bee Colony (ABC) algorithm as a solver for the Shortest Common Supersequence Problem (SCSP) and compared the results obtained by applying the ABC algorithm [7] with the results of other approaches proposed for solving the SCSP. The ABC algorithm provides a scalable solution and promising results.

In 2012, Manish Gupta applied real coded mutation and crossover operators to ABC after the employed bee phase and the onlooker bee phase of the algorithm. A food source selected by some probabilistic criterion is altered by the mutation operator. The experiments were performed on a job scheduling problem available in the literature. No specific value of the mutation probability was found to give the best results for the job scheduling experiments. As future work, the author intends to apply other types of simulation operators and crossover operators in the ABC algorithm.

In 2011, Malek Alzaqebah compared the performance of the ABC algorithm under different selection strategies and concluded that ABC with a disruptive selection strategy produces better results than the other selection strategies tested in that work.
It is believed that the performance of the ABC algorithm can be enhanced by applying a suitable mechanism to choose the neighborhood structure based on the current solution in hand.

In 2010, Ivona B. presented the ABC algorithm for the capacitated vehicle routing problem (CVRP). Twelve benchmark instances of small scale problems were tested, and the results were compared to the best known results. Although global optimality cannot be guaranteed, the performance of the algorithm is good and robust. It was noticed that the algorithm can be trapped in a local minimum for some benchmark instances. In future work the algorithm needs to be explored and tested on larger instances of the CVRP. The proposed approach is also suitable for other combinatorial problems.

Since 2005, D. Karaboga and his research group have been studying the ABC algorithm and its applications to real world problems. Karaboga and Basturk investigated the performance of the ABC algorithm on unconstrained numerical optimization problems and of its extended version on constrained optimization problems, and Karaboga et al. applied the ABC algorithm to neural network training. In 2010, Hadidi et al. employed an ABC based approach for structural optimization. In 2011, Zhang et al. employed ABC for optimal multi-level thresholding, MR brain image classification, cluster analysis, face pose estimation, and 2D protein folding.

2. FUZZY SYSTEM

The word fuzzy means "vagueness". Fuzziness occurs when the boundary of a piece of information is not clear-cut. Fuzzy set theory is an extension of classical set theory. Classical set theory assesses the membership of elements in a set in binary terms according to a bivalent condition: an element either belongs or does not belong to the set. Fuzzy set theory instead permits the gradual assessment of the membership of elements in a set, described with the aid of a membership function valued in the real unit interval [0, 1].

A typical fuzzy based intelligent system has the following modules: fuzzification module, inference engine, knowledge base and defuzzification module [1]. Of these modules, the knowledge base or rule base is one of the most important parts of a fuzzy system, as it provides the necessary intelligence to the system.

2.1 Fuzzy Logic Based System

Fuzzy systems are a class of systems belonging to knowledge based systems. In this class of systems, the knowledge is represented in the form of a rule base. A fuzzy system can be represented with the help of a block diagram. Any fuzzy system consists of four major modules, namely the fuzzification module, inference engine, knowledge base and defuzzification module [1].

2.2 Fuzzification Module

Fuzzification is the process of transforming crisp input values into the corresponding values in the fuzzy domain (fuzzy values) [1].

Figure 1: Block Diagram of fuzzy logic based system
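As an illustration of the fuzzification step, the following minimal Python sketch maps a crisp input to its membership degrees. It assumes triangular membership functions and a hypothetical three-set partition (low, medium, high) of a universe of discourse from 0 to 50. The function names, the shapes and the break points are assumptions made for illustration and are not taken from the paper.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical partition of the universe of discourse [0, 50] (an assumption,
# not values from the paper). Feet placed outside the universe give full
# membership at the two ends of the range.
FUZZY_SETS = {
    "low": (-25.0, 0.0, 25.0),
    "medium": (0.0, 25.0, 50.0),
    "high": (25.0, 50.0, 75.0),
}


def fuzzify(x, fuzzy_sets=FUZZY_SETS):
    """Map a crisp value to its degree of membership in every fuzzy set."""
    return {name: triangular(x, *params) for name, params in fuzzy_sets.items()}


if __name__ == "__main__":
    # A crisp value of 10 belongs partly to "low" and partly to "medium".
    print(fuzzify(10.0))
```

With this partition the membership degrees of any value inside the universe sum to one, which is a convenient but optional property of the chosen shapes.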
KNOWLEDGE BASE

This module contains the knowledge of the application domain and the procedural knowledge. It consists of a data base and a linguistic control rule base [1].

INFERENCE ENGINE

This module simulates the decision making capabilities of the human brain. Based on the input from the fuzzifier, the domain knowledge and the set of control rules, the output decisions or the necessary control actions are evaluated in the fuzzy domain [1]. It involves three steps: rule composition, implication and aggregation. Depending upon the type of composition operators and implication operators, the inference process is of three types:
  • Mamdani Style Inference.
  • Larsen Style Inference.
  • Sugeno Style Inference.

2.3 Defuzzification

Defuzzification performs the reverse operation of the fuzzification process, that is, it converts the fuzzified output of the inference engine into corresponding crisp values [1]. A number of defuzzification methods are available, e.g.:
  • Centre of Gravity/Centre of Area/Centroid Method.
  • Centre of Sums.
  • Weighted Average.
  • Centre of Largest Area.

The process of design for fuzzy systems involves the following steps:
  • Step 1: Identify the input and output variables.
  • Step 2: For these variables, generate membership functions and decide their shapes, such as triangular, Z-type, S-type etc.
  • Step 3: Generate the rule base for the system.
  • Step 4: Select the type of inference.
  • Step 5: Select the type of aggregation.
  • Step 6: Decide on the defuzzification technique and generate a crisp control action (defuzzification).

For a system of small complexity, Step 1 can be performed by the experts by including all the available inputs. For systems of higher complexity, it is not possible to take into account all the inputs, and one may be constrained to select only those inputs which contribute significantly to the overall output of the system. Some of the procedures suggested in the literature are the forward selection procedure, the backward elimination procedure, the best subset method and a few other statistical selection procedures [17].

Step 2 can be performed with the help of domain expert(s) if they are available, from common sense, or from the available numerical data. If numerical data is available for these variables, the membership functions can be generated using techniques like FCM, neural networks, GA etc.

Step 3 involves the development of the rule base. In knowledge based system development, Step 3 is performed by an expert, whereas in data driven system development certain computerized techniques are used to develop the rule base.
As far as Step 4 is concerned, one can have hundreds of combinations of composition, implication and aggregation operators.

For Step 6, a large number of defuzzification techniques are available in the literature. Some of them are: centre of gravity (COG), centre of sums, first/last of maxima and mean of maxima (MOM) [18-21].

2.4 Problem Formulation

The figure represents a Sugeno type fuzzy system. From the figure it is clear that such systems consist of 4 major modules, i.e. the fuzzifier, the rule composition module (fuzzy MIN operators), the implication module and the defuzzification module [21]. The overall computed output, in the case of a Sugeno type system, can be written as:

$$O_{Computed} = \frac{\sum_i W_i C_i}{\sum_i W_i} \qquad (1)$$

In order to proceed with system design, we first divide the input universe of discourse, as evidenced by the data, into a number of membership functions. For a two input system like the one given in the figure, the total number of rules in the rule base will be 3 × 2 = 6. In general, if there are A inputs with B membership functions each, then the number of rules R can be written as $R = B^A$. However, these rules arise from combinations of the membership functions of the various inputs and are incomplete, as only the antecedent parts are known and the consequents are as yet unknown.

Because for any set of inputs the $W_i$ are easily computed by the fuzzifier and the rule composition module, the right hand side of output expression (1) can be evaluated if proper values for the $C_i$ can be chosen. For a given data set of a system, the $W_i$ are known. The task is to find appropriate values of $C_i$ such that the difference between the computed output and the actual output as given in the data is minimum:

$$O_{Computed} = \frac{W_1 C_1 + W_2 C_2 + \dots + W_n C_n}{W_1 + W_2 + \dots + W_n}$$

We compare this computed output with the actual output as given in the data set and find the error. Let the error be defined as:

Error E = Actual output (as given in data set) - Computed output (as given in equation 1)
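The quantities defined above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the firing strengths and crisp consequent values in the usage example are made-up numbers chosen only to show the arithmetic.

```python
import math


def rule_count(mfs_per_input):
    """Number of antecedent combinations.

    For A inputs with B membership functions each this equals B**A;
    for the two input example with 3 and 2 sets it is 3 * 2 = 6.
    """
    return math.prod(mfs_per_input)


def sugeno_output(weights, consequents):
    """Equation (1): weighted average of consequents C_i with firing strengths W_i."""
    total = sum(weights)
    if total == 0:
        raise ValueError("no rule fires for this input")
    return sum(w * c for w, c in zip(weights, consequents)) / total


def error(actual, weights, consequents):
    """Error E = actual output - computed output."""
    return actual - sugeno_output(weights, consequents)


if __name__ == "__main__":
    # Hypothetical firing strengths and consequent values for six rules.
    w = [0.6, 0.4, 0.0, 0.0, 0.0, 0.0]
    c = [4.0, 3.0, 2.0, 2.0, 1.0, 0.1]
    print(rule_count([3, 2]))      # six rules for 3 and 2 membership functions
    print(sugeno_output(w, c))     # approximately 3.6
    print(error(3.5, w, c))        # approximately -0.1
```

Only rules with a non-zero firing strength influence the result, so a wrong consequent on a rule that never fires for the available data cannot be detected from the error alone.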
Now the whole problem of rule base generation boils down to a minimization problem as stated below:

Minimize the objective function E

$$E = O_{Actual} - O_{Computed} \qquad (2)$$

Classical minimization techniques may not be applicable if the problem is very complex. We apply the Artificial Bee Colony (ABC) algorithm to evaluate the rule base.

3. ARTIFICIAL BEE COLONY OPTIMIZATION

In the ABC model, the colony consists of three groups of bees: employed bees, onlookers and scouts. It is assumed that there is only one artificial employed bee for each food source. In other words, the number of employed bees in the colony is equal to the number of food sources around the hive. Employed bees go to their food source, come back to the hive and dance in this area. An employed bee whose food source has been abandoned becomes a scout and starts to search for a new food source. Onlookers watch the dances of the employed bees and choose food sources depending on the dances. The main steps of the algorithm are given below:

  Initial food sources are produced for all employed bees
  REPEAT
    Each employed bee goes to a food source in her memory and determines a neighbor source, then evaluates its nectar amount and dances in the hive.
    Each onlooker watches the dance of employed bees and chooses one of their sources depending on the dances, and then goes to that source. After choosing a neighbor around that source, she evaluates its nectar amount.
    Abandoned food sources are determined and are replaced with the new food sources discovered by scouts.
    The best food source found so far is registered.
  UNTIL (requirements are met)

In ABC, a population based algorithm, the position of a food source represents a possible solution to the optimization problem and the nectar amount of a food source corresponds to the quality (fitness) of the associated solution. The number of employed bees is equal to the number of solutions in the population. At the first step, a randomly distributed initial population (food source positions) is generated. After initialization, the population is subjected to repeated cycles of the search processes of the employed, onlooker and scout bees, respectively.

4. RESULT ANALYSIS

The suggested approach has been applied to the identification of a fuzzy model for a rapid Nickel-Cadmium (Ni-Cd) battery charger. The main objective in developing this charger was to charge the batteries as quickly as possible without damaging them. The input-output data consist of 561 points obtained through experimentation [22]. For this charger, the two input variables used to control the charging rate (Ct) are the absolute temperature of the batteries (T) and its temperature gradient (dT/dt). Charging rates are expressed as multiples of the rated capacity of the battery. The input-output variables identified for the rapid Ni-Cd battery charger, along with their universes of discourse, are listed in Table 1.
Input Variables                      Minimum Value   Maximum Value
Temp. (T) [°C]                       0               50
Temp. Gradient (dT/dt) [°C/sec]      0               1

Output Variable                      Minimum Value   Maximum Value
Charging Rate (Ct) [A]               0               4

Table 1: Input and output variables for the rapid Ni-Cd battery charger along with their universes of discourse

Let us assume that the temperature, with a universe of discourse ranging from 0 to 50 degrees centigrade, has been partitioned into 3 fuzzy sets, namely temperature low, medium and high. The temperature gradient is partitioned into two fuzzy sets (membership functions), namely low and high. Initially, the parameters of the membership functions of the input variables are set to arbitrary values. Once fuzzification of the inputs is carried out, 6 combinations of input membership functions (3 × 2 = 6), representing the 6 antecedents of the rules, are obtained. These 6 rules form the rule base for the system under identification. The rule base is still incomplete, as the consequent of each rule has yet to be found. From the given data set of Table 1 there are only 5 consequents from which one element is chosen as the consequent of a particular rule. The specified set of consequents in this case is $C_1 = \mu_{ultrafast}$, $C_2 = \mu_{high}$, $C_3 = \mu_{medium}$, $C_4 = \mu_{low}$ and $C_5 = \mu_{trickle}$. The parameters of the antecedents and consequents are chosen so as to fulfill the condition given by expression (2). The degree of compatibility of any input data set with a rule, represented by $W_i$, can easily be computed using the following formula:

$$W_1 = \min(\mu_{LOW}(\text{temperature}),\ \mu_{LOW}(\text{temp\_grad}))$$

Once all the $W_i$ are evaluated in this way, the right hand side of output expression (1) can be evaluated if proper values for $C_i \in \{ULTRAFAST, MED, LOW, HIGH, TRICKLE\}$ can be chosen.

The ABC algorithm is implemented in the C language to select the values of the consequents that satisfy equation (2). It was observed that the algorithm was able to generate the required rule base for the FLS shown in figure 3. With the application of the rule reduction algorithm as given in [Step 2- (d)], the following set of rules is extracted by ABC.

Figure 2: Extracted Rules
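To make the consequent selection concrete, the following Python sketch runs an ABC style search over the 6 rule consequents. It is a sketch under stated assumptions and not the authors' C implementation: the triangular partitions, the crisp values in `LEVELS`, the sum of squared errors used as the cost, the fitness 1/(1 + cost), the discrete neighbour move (changing one consequent at a time), the parameter defaults and the synthetic data are all hypothetical choices, and the 561-point experimental data set is not reproduced.

```python
import random

# Hypothetical crisp values for trickle, low, medium, high, ultrafast.
LEVELS = [0.1, 1.0, 2.0, 3.0, 4.0]


def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def firing_strengths(temp, grad):
    """W_i = min(mu_T, mu_dT) for the 3 x 2 = 6 antecedent combinations."""
    mu_t = [tri(temp, -25, 0, 25), tri(temp, 0, 25, 50), tri(temp, 25, 50, 75)]
    mu_g = [tri(grad, -1, 0, 1), tri(grad, 0, 1, 2)]
    return [min(t, g) for t in mu_t for g in mu_g]


def output(rule_base, temp, grad):
    """Equation (1); inputs are assumed to lie inside the universes of Table 1."""
    w = firing_strengths(temp, grad)
    return sum(wi * LEVELS[ci] for wi, ci in zip(w, rule_base)) / sum(w)


def cost(rule_base, data):
    """Assumed objective: sum of squared errors over the data set."""
    return sum((actual - output(rule_base, t, g)) ** 2 for t, g, actual in data)


def neighbour(rule_base, rng):
    """Discrete neighbour: change the consequent of one randomly chosen rule."""
    new = list(rule_base)
    i = rng.randrange(len(new))
    new[i] = rng.choice([c for c in range(len(LEVELS)) if c != new[i]])
    return new


def abc_select(data, n_rules=6, n_sources=10, limit=15, cycles=100, seed=0):
    """Return (best rule base, its cost, best cost after each cycle)."""
    rng = random.Random(seed)

    def random_source():
        return [rng.randrange(len(LEVELS)) for _ in range(n_rules)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [cost(s, data) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])
    history = [best_cost]

    def try_improve(i):
        # Greedy selection between a food source and one of its neighbours.
        cand = neighbour(sources[i], rng)
        c = cost(cand, data)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):                      # employed bee phase
            try_improve(i)
        fitness = [1.0 / (1.0 + c) for c in costs]
        for i in rng.choices(range(n_sources), weights=fitness, k=n_sources):
            try_improve(i)                              # onlooker bee phase
        for i in range(n_sources):                      # scout bee phase
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = cost(sources[i], data)
                trials[i] = 0
        i = costs.index(min(costs))                     # register the best source
        if costs[i] < best_cost:
            best, best_cost = list(sources[i]), costs[i]
        history.append(best_cost)
    return best, best_cost, history


if __name__ == "__main__":
    # Synthetic data generated from a hypothetical target rule base.
    target = [0, 1, 2, 3, 4, 4]
    data = [(t, g / 4, output(target, t, g / 4))
            for t in range(0, 51, 5) for g in range(5)]
    best, best_cost, _ = abc_select(data)
    print(best, best_cost)
```

Each food source is a list of 6 indices into `LEVELS`, so the search space has 5^6 = 15625 candidate rule bases. A squared error is used here instead of the signed difference of expression (2) so that positive and negative errors on different data points cannot cancel.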
5. PERFORMANCE OF BATTERY CHARGER

$$\mathrm{MSE} = \frac{1}{2N} \sum_{k=1}^{N} [y(k) - y'(k)]^2$$

where
y(k) = actual output
y'(k) = computed output
N = number of data points taken for model validation

MSE = 0.3099

6. CONCLUSION

This paper proposed an ABC based algorithm to enumerate the rule base for a Sugeno type fuzzy logic based system. The length of the path set represents the system output, i.e. $\sum W_i C_i / \sum W_i$. The difference between the computed output and the actual output as given in the training data gives the error. This error is used to update the pheromone trail. The smaller the error, the greater the amount of pheromone deposited on the path. This allows artificial ants to choose a path with a higher pheromone deposit with higher probability. Finally, all the ants follow the path that has a high pheromone deposit, leading to the shortest path. This leads to the generation of the rules that produce minimum error.

7. REFERENCES

[1] Singh D., "Solving Real Optimization Problem using Genetic Algorithm with Employed Bee," International Journal of Computer Applications (0975 – 8887), Volume 42, No. 11, March 2012.
[2] Mohd Afizi Mohd Shukran, "Artificial Bee Colony based Data Mining Algorithms for Classification Tasks," 2011.
[3] Mustafa M. Noaman, "Solving Shortest Common Supersequence Problem Using Artificial Bee Colony Algorithm," The Research Bulletin of Jordan ACM, ISSN, Volume II (III), PP-80.
[4] Gupta M., "An Efficient Modified Artificial Bee Colony Algorithm for Job Scheduling Problem," International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume 1, Issue 6, January 2012.
[5] Ivona B., "Artificial bee colony algorithm for the capacitated vehicle routing problem," Proceedings of the European Computing Conference, 2010.
[6] Chang Jianghui, Zhao Yongsheng, Wen Chongzhu, "Research on Optimization of Fuzzy Membership function based on Ant Colony Algorithm," Proc. of the 25th Chinese Control Conference, Harbin, Aug, 2006.
[7] Ashita S. Bhagade, "Artificial Bee Colony (ABC) Algorithm for Vehicle Routing Optimization Problem," International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume 2, Issue 2, May 2012.
[8] Malek Alzaqebah, "Artificial bee colony search algorithm for examination timetabling Problems," International Journal of the Physical Sciences, Vol. 6(17), pp. 4264-4272, September, 2011.
[9] Adil Baykasoglu, "Artificial Bee Colony Algorithm and Its Application to Generalized Assignment Problem," International Conference on Computational Intelligence for Modeling, Control and Automation, Las Vegas.
[10] Marco Dorigo and Thomas Stützle, Ant Colony Optimization, Eastern Economy Edition, PHI, 2005.
[11] Arun Khosla, Shakti Kumar, KK Aggarwal, Jagatpreet Singh, "Particle Swarm Optimizer for fuzzy models," IEEE Proc. on Fuzzy Systems, 2007.
[12] Marco Dorigo and Thomas Stützle, Ant Colony Optimization, Eastern Economy Edition, PHI, 2005.
[13] M. Galea and Q. Shen, "Fuzzy Rules from ant-inspired computation," Proc. IEEE Int'l Conf. Fuzzy Systems, pp 1691-1696, 2004.
[14] Bhalla P., "Fuzzy Rule base generation from Numerical Data using Ant colony optimization," MAIMT-Journal of IT & Management, Vol. 1, No. 1, May-Oct, 2007, pp 33-47.
[15] Chia-Feng J, H.J. Huang and C.M. Lu, "Fuzzy Controller Design by ant colony optimization," IEEE Proc. on Fuzzy Systems, 2007.
[16] Kumar S., "Introduction to Fuzzy Logic Based Systems," Workshop on Intelligent System Engineering (WISE-2010), 2010.
[17] Shakti Kumar, P. Bhalla and Amarpartap Singh, "Soft Computing Approaches to Fuzzy System identification: A Survey," IISN-2009, pp 402-411, 2009.
[18] M.S. Abadeh, J. Habibi and E. Soroush, "Induction of Fuzzy classification systems using evolutionary ABC-based algorithms," Proc. of the First Asia Int'l Conf. on Modeling and Simulation (AMS'07), 2007.
[19] Shakti K, P. Bhalla and S. Sharma, "Automatic Fuzzy Rule base Generation for Intersystem Handover using Ant Colony Optimization Algorithm," International Conference on Intelligent Systems and Networks (IISN-2007), Feb 23-25, 2007, MAIMT, Jagadhri, Haryana, India, pp 764-773.
[20] Shakti Kumar, "Rule base generation using ant colony optimization," Proc. of the one week workshop on applied soft computing (SOCO-2006), Haryana Engineering College, Jagadhri, July 2006.
[21] Adil, B., Lale, Ö., and Pınar, T. 2007. Artificial Bee Colony Algorithm and Its Application to Generalized Assignment Problem. ISBN 978-3-902613
[22] Andreas, W. 2003. The Shortest Common Supersequence Problem. ISBN 978-3-90232
[23] Barone, P., Bonizzoni P., Vedova, G.D., and Mauri, G. 2001. An approximation algorithm for the shortest common Supersequence. Symposium on Applied Computing, 56-60.
[24] Dervis, K. 2010. Artificial bee colony algorithm. Scholarpedia. 5(3):6915.
[25] Dervis, K., and Bahriye, A. 2009. A comparative study of Artificial Bee Colony algorithm. Applied Mathematics and Computation, 214, 108–132.
[26] G. Vasu, J. Nancy Namratha and V. Rambabu, "Large Scale Linear Dynamic System Reduction Using Artificial Bee Colony Optimization Algorithm," International Journal of Electrical Engineering & Technology (IJEET), Volume 3, Issue 1, 2012, pp. 145 - 155, Published by IAEME.
[27] Lalit Kumar and Dr. Dheerendra Singh, "Solving Np-Hard Problem Using Artificial Bee Colony Algorithm," International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 1, 2013, pp. 171 - 177, Published by IAEME.