New Challenges in Learning Classifier Systems: Mining Rarities and Evolving Fuzzy Rules

Presentation Transcript

  • New Challenges in Learning Classifier Systems: Mining Rarities and Evolving Fuzzy Rules
    Student: Albert Orriols-Puig
    Supervisor: Ester Bernadó-Mansilla
    Grup de Recerca en Sistemes Intel·ligents, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull
  • Background
    GRSI has been researching machine learning and data mining, with a special focus on data classification.
    Research aims at:
      Improving learning methods
      Applying learning methods to real-world applications
    The application of LCS to classification problems is one of the main research lines.
      LCS are appealing because they mine streams of examples.
      Many applications make the data available in streams.
    Important challenges need to be addressed to deal with complex applications.
    Slide 2 Grup de Recerca en Sistemes Intel·ligents New Challenges in LCS
  • Background
    General schema of LCSs, introduced by Holland.
    [Figure: the Environment sends a sensorial state and feedback to the Learning Classifier System and receives an action from it. The system holds Classifier 1, Classifier 2, …, Classifier n.]
    Classifiers: any representation (production rules, genetic programs, perceptrons, SVMs).
    Online rule evaluator: apportionment of credit algorithms.
      XCS: Q-Learning (Sutton & Barto, 1998)
      Uses the Widrow-Hoff delta rule
    Rule evolution: evolutionary algorithm.
      Typically, a GA (Holland, 75; Goldberg, 89) applied to the population.
    Slide 3
  • When this Work Started
    In 2004, when Michigan-style LCSs were reaching maturity:
      First successful implementations (Wilson, 95; Wilson, 98)
      Many other derivations: YCS, UCS, XCSF, and many others
      Applications in important domains:
        Data mining (Bernadó et al., 02; Wilson, 02a; Bacardit & Butz, 04)
        Function approximation (Wilson, 02b)
        Reinforcement learning (Lanzi, 02)
      Theoretical analyses for design (Butz et al., 02, 03, 04b)
    But still, there are important challenges to face.
    Slide 4
  • Two Key Challenges in ML and LCSs
    1st challenge: Learning from domains that contain rare classes.
    Data classification: extract interesting, useful, and hidden patterns.
      The most interesting knowledge resides in rare classes.
      Example: fraud detection in credit card transactions.
    Can learners model rare classes accurately? Maybe not!
    [Figure: Dataset → Learner → Knowledge Model. The learner minimizes the learning error and maximizes generalization.]
    What about online learning? More challenging: model rare classes on the fly.
    Aim: Analyze and improve LCS for mining domains with rarities.
    Slide 5
  • Two Key Challenges in ML and LCSs
    2nd challenge: Building more understandable models and bringing reasoning mechanisms closer to human ones.
    In some domains, interpretability is more important than accuracy.
    LCSs most often use interval-based rules in domains described by continuous variables.
      Variables are "semantic-free".
      Analyses of the inference mechanisms are scarce.
    Fuzzy logic provides a robust framework for knowledge representation and reasoning under uncertainty.
      Some fuzzy LCS approaches already exist.
      But no online fuzzy LCS for supervised learning has been designed.
    Aim: Incorporate fuzzy logic into LCS for supervised learning.
    Slide 6
  • Goal of this Work
    General goal: Address the two challenges with
      The extended classifier system (XCS) (Wilson, 95, 98): by far, the most influential Michigan-style LCS.
      The supervised classifier system (UCS) (Bernadó-Mansilla, 03): inherits XCS's architecture and specializes it for data classification.
    Two challenges with two LCSs that lead to four objectives:
      Challenges              Objectives
      XCS and UCS             1. Revise and update UCS and compare it with XCS
      LCS and rare classes    2. Analyze and improve LCS for mining rarities
                              3. Apply LCSs for extracting models from real-world classification problems with rarities
      Fuzzy logic in LCS      4. Design and implement an LCS with fuzzy logic reasoning for supervised learning
    Slide 7
  • Outline
    1. Description of XCS and UCS
    2. Revisiting UCS: Fitness Sharing and Comparison with XCS
    3. Facetwise Analysis of XCS for Imbalanced Domains
    4. Carrying over the Facetwise Analysis into UCS
    5. XCS and UCS in Imbalanced Real-World Classification Problems
    6. Fuzzy-UCS: Evolving Fuzzy Rule Sets for Supervised Learning
    7. Conclusions and Further Work
    Slide 8
  • Description of XCS
    In training mode for single-step tasks (Wilson, 95). Designed for reinforcement learning.
    [Figure: XCS training cycle. The ENVIRONMENT supplies a problem instance; match set generation selects from Population [P] the classifiers that form Match Set [M]; an action is selected randomly and Action Set [A] is formed; the selected action is sent to the environment, which returns a REWARD; the classifier parameters are updated (Widrow-Hoff rule); the genetic algorithm applies selection, reproduction, and mutation, together with deletion. Each classifier is shown with the fields C, A, P, ε, F, num, as, ts, exp.]
    Error: error of the predicted payoff.
    Fitness: computed as a function of the error.
    Fitness sharing.
    Competition in the niche.
    Slide 9
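The parameter update named on this slide can be illustrated with a minimal Python sketch. It assumes the commonly described XCS update (Widrow-Hoff estimates of prediction and error, an accuracy that drops as a power of the error, and fitness shared inside the action set). The class name, field names, default parameter values, and the choice to update the prediction before the error are assumptions of this sketch, not details taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Classifier:
    prediction: float = 10.0  # P: estimated payoff
    error: float = 0.0        # epsilon: estimated error of the payoff prediction
    fitness: float = 0.01     # F: shared, accuracy-based fitness
    numerosity: int = 1       # num: number of copies this macro-classifier stands for
    experience: int = 0       # exp: number of updates received


def update_action_set(action_set, reward, beta=0.2, eps0=1.0, alpha=0.1, nu=5.0):
    """Update every classifier in [A] after receiving `reward`."""
    for cl in action_set:
        cl.experience += 1
        # Widrow-Hoff (delta rule) updates with learning rate beta.
        cl.prediction += beta * (reward - cl.prediction)
        cl.error += beta * (abs(reward - cl.prediction) - cl.error)

    # Accuracy: 1 below the error threshold eps0, a steep power law above it.
    kappas = [
        1.0 if cl.error < eps0 else alpha * (cl.error / eps0) ** (-nu)
        for cl in action_set
    ]
    # Fitness sharing: each classifier gets its relative share of the accuracy in [A].
    total = sum(k * cl.numerosity for k, cl in zip(kappas, action_set))
    for k, cl in zip(kappas, action_set):
        cl.fitness += beta * (k * cl.numerosity / total - cl.fitness)
```

Because the relative accuracies in one action set sum to one, an inaccurate classifier gains almost no fitness when it shares a niche with an accurate one.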
  • Description of UCS
    In training mode (Bernadó-Mansilla & Garrell, 03).
    [Figure: UCS training cycle. The ENVIRONMENT supplies a stream of examples (problem instance + output class); match set generation selects from Population [P] the classifiers that form Match Set [M]; correct set generation forms Correct Set [C]; the classifier parameters are updated as averages of the parameter values; the genetic algorithm applies selection, reproduction, and mutation, together with deletion, with competition in the niche and no fitness sharing. Each classifier is shown with the fields C, A, acc, F, num, cs, ts, exp.]
    Key differences with respect to XCS:
      Accuracy computation as the average of correct predictions
      Exploration of the "correct" class instead of all classes
      No fitness sharing
    Slide 10
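As a rough illustration of the cycle above, the following Python sketch builds [M] and [C] for one labelled example and keeps accuracy as a plain average of correct predictions. The ternary rule encoding over binary inputs, the names `Rule` and `train_step`, and the fitness exponent `nu` are assumptions made for the example; the slides do not specify them.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    condition: str        # ternary string over '0', '1', '#' ('#' matches anything)
    predicted_class: int
    experience: int = 0   # exp: times the rule appeared in a match set
    correct: int = 0      # times it was in [M] and predicted the right class

    def matches(self, x: str) -> bool:
        return all(c == "#" or c == b for c, b in zip(self.condition, x))

    @property
    def accuracy(self) -> float:
        # Pure average: proportion of correct predictions so far.
        return self.correct / self.experience if self.experience else 0.0

    def fitness(self, nu: float = 10.0) -> float:
        # Unshared fitness as a power of accuracy (assumed form).
        return self.accuracy ** nu


def train_step(population, x, y):
    """Form [M] and [C] for example (x, y) and update the matching rules."""
    match_set = [r for r in population if r.matches(x)]
    for r in match_set:
        r.experience += 1
        if r.predicted_class == y:
            r.correct += 1
    correct_set = [r for r in match_set if r.predicted_class == y]
    return match_set, correct_set
```

In a full system the genetic algorithm would then be applied to the correct set; that part is omitted here.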
  • Fitness Sharing in UCS
    Sharing or not sharing: a key difference between XCS and UCS.
    Goal:
      Design a fitness sharing scheme
      Empirically test whether fitness sharing is beneficial to UCS
      Empirically compare XCS with UCS
    Incorporate a fitness sharing scheme into UCS, taking inspiration from XCS.
      [Equations annotated with: classifier accuracy, classifier numerosity, relative accuracy, learning rate]
      And finally, fitness is shared in [M].
    Slide 12
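The slide's equations are not preserved in the transcript, so the following Python sketch only shows one plausible shape of such a scheme, modelled on the XCS fitness update: classifiers outside [C] get zero accuracy, accuracies are weighted by numerosity and normalized over [M], and fitness moves toward that relative accuracy at the learning rate. The threshold `acc0`, the constants `alpha` and `nu`, and all names are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Cl:
    acc: float             # proportion of correct predictions
    num: int               # numerosity
    fitness: float
    in_correct_set: bool   # True if the classifier belongs to [C]


def share_fitness(match_set, beta=0.2, acc0=0.99, alpha=0.1, nu=10.0):
    """Update fitness of all classifiers in [M] with sharing."""
    def kappa(cl):
        if not cl.in_correct_set:
            return 0.0                        # wrong class: no accuracy credit
        if cl.acc > acc0:
            return 1.0                        # accurate classifier
        return alpha * (cl.acc / acc0) ** nu  # inaccurate: steeply discounted

    kappas = [kappa(cl) for cl in match_set]
    total = sum(k * cl.num for k, cl in zip(kappas, match_set))
    for k, cl in zip(kappas, match_set):
        relative = k * cl.num / total if total > 0 else 0.0
        cl.fitness += beta * (relative - cl.fitness)  # beta is the learning rate
```

With this form, an over-general classifier that shares a niche with an accurate one receives only a small fraction of the niche's fitness, which is consistent with the deletion pressure the following slides describe.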
  • Methodology of Analysis
    Analysis divided into two comparisons:
      1. Compare UCS without fitness sharing (UCSns) and with fitness sharing (UCSs)
      2. Compare UCSs with XCS
    Comparison on four boundedly-difficult problems that permit moving the complexity along: number of classes, size of the building block, class imbalance, and proportion of noise.
      The parity problem (par)
      The decoder problem (dec)
      The position problem (pos)
      The 20-bit multiplexer with alternating noise (mux-an)
    Slide 13
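Two of these benchmarks have simple, widely used definitions that can be sketched in Python. The sketch assumes the standard Boolean multiplexer (k address bits select one of 2^k data bits) and a plain parity function; the noise mechanism of mux-an and the building-block structure of the parity problem used in the study are not reproduced here.

```python
def multiplexer(bits, k):
    """Boolean multiplexer: the first k bits address one of the 2**k data bits.

    k=3 gives the 11-bit multiplexer, k=4 the 20-bit multiplexer.
    """
    if len(bits) != k + 2 ** k:
        raise ValueError("expected k + 2**k bits")
    address = int("".join(str(b) for b in bits[:k]), 2)
    return bits[k + address]


def parity(bits):
    """Return 1 if the number of ones is odd, 0 otherwise."""
    return sum(bits) % 2
```

For example, with k=4 the address bits 0011 select the fourth data bit, so only that bit determines the class.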
  • Does Fitness Sharing Benefit UCS?
    Fitness sharing provides the following benefits:
      Higher pressure toward deletion of over-general classifiers
      Higher selective pressure toward the fittest classifiers in [C]
      Better results in the four problems: par, dec, pos, and mux-an
    [Figure: UCSns vs. UCSs in the decoder problem]
    Slide 14
  • Comparison of UCS with XCS
    Advantages of UCS are due to:
      The exploration regime: XCS explores all the classes, while UCS explores only the "correct" class.
      The accuracy guidance: XCS may provide misleading guidance toward the fittest classifiers, identified as the fitness dilemma (Butz et al., 2003). UCS solves this problem by computing accuracy as the proportion of correct predictions.
    [Figure: UCSs vs. XCS in the decoder problem]
    Slide 15
  • Summary of the Comparison
    The empirical study has shown that UCS benefits from a fitness sharing scheme. Therefore, we use UCSs in the remainder of this work.
    Key differences between XCS and UCS were reviewed and experimentally analyzed:
      Explore regime
      Accuracy guidance
      Population size
    XCS is a more general architecture and can solve reinforcement learning problems.
    Slide 16
  • Motivation
    So, do rare classes pose a challenge to XCS?
    Test on the unbalanced 11-bit multiplexer, where
      IR = (number of examples of the majority class) / (number of examples of the minority class)
    [Figure: %[O] with XCS]
    Slide 18
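The imbalance ratio defined above, and one way to produce an unbalanced multiplexer stream, can be sketched in Python as follows. The sampling scheme (keeping each instance of one class with probability 1/ir) and the choice of class 1 as the minority class are assumptions of this sketch; the slides do not say how the unbalanced data set was generated.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Examples of the most frequent class divided by those of the least frequent."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def unbalanced_multiplexer(n, ir, k=3, seed=0):
    """Draw n labelled multiplexer instances with roughly ir majority per minority example."""
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        bits = [rng.randint(0, 1) for _ in range(k + 2 ** k)]
        address = int("".join(str(b) for b in bits[:k]), 2)
        label = bits[k + address]
        # Class 0 is always kept; class 1 (assumed minority) is kept with probability 1/ir.
        if label == 0 or rng.random() < 1.0 / ir:
            data.append((bits, label))
    return data
```

A call such as `unbalanced_multiplexer(5000, ir=10)` yields a stream whose measured `imbalance_ratio` is close to, but not exactly, 10.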
  • Design Decomposition
    Aim:
      Analyze the challenges that rare classes pose to XCS
      Improve XCS in problems with rare classes
    The design decomposition approach (Goldberg, 02) proposes to:
      Decompose the problem into critical elements
      Derive "little" models or facetwise models for each element, assuming that the others behave in an ideal manner
      Integrate all the models (patchquilt integration)
    Slide 19
  • Focusing the Problem
    How should XCS partition the problem solution?
    [Figure labels: nourished niche; small disjunct or starved niche; again, more small disjuncts; over-general classifier]
    Slide 20
  • Critical Elements of LCS
    Five critical elements to detect small niches were identified:
      1. Estimate the classifier parameters correctly
      2. Analyze whether representatives of starved niches can be provided in initialization
      3. Ensure the generation and growth of representatives of starved niches
      4. Adjust the GA application rate
      5. Ensure that representatives of starved niches will take over their niches
    Derivations are studied according to the imbalance ratio (IR).
    Slide 21
  • 1. Estimate Classifier Parameters
    Derive the maximum imbalance ratio.
    The error of over-general classifiers is: [equation not preserved in the transcript]
    However, empirical results did not agree with the theory.
    [Figure: error of the most over-general classifier tracked over time for ir = 100, showing the deviation between the theoretical value and the empirical error]
    Over-general classifiers may be considered accurate.
    Slide 22
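To see why an imbalance ratio can make an over-general classifier look accurate, here is a small worked Python sketch. It assumes an idealized over-general classifier that matches ir majority instances for every minority instance, receives payoff r_max when it is right and 0 when it is wrong, and whose estimates have converged to their expected values. The transcript does not preserve the slide's own equation, so this derivation stands in for it and may differ from it.

```python
def overgeneral_estimates(ir, r_max=1000.0):
    """Expected prediction and error of an over-general classifier.

    With p = ir / (1 + ir) the probability of being correct:
      prediction = r_max * p
      error      = p * |r_max - prediction| + (1 - p) * |0 - prediction|
                 = 2 * r_max * ir / (1 + ir) ** 2
    """
    p = ir / (1.0 + ir)
    prediction = r_max * p
    error = p * abs(r_max - prediction) + (1.0 - p) * abs(0.0 - prediction)
    return prediction, error


def first_ir_considered_accurate(eps0=1.0, r_max=1000.0, limit=10 ** 7):
    """Smallest integer ir whose over-general error falls below the threshold eps0."""
    for ir in range(1, limit):
        if overgeneral_estimates(ir, r_max)[1] < eps0:
            return ir
    return None
```

Under these assumptions the error shrinks roughly as 2·r_max/ir, so for a large enough ir it drops below the accuracy threshold and the over-general classifier is treated as accurate.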
  • 1. Estimate Classifier Parameters
    We proposed two alternatives to obtain better estimates:
      1. Tune the learning rate of the Widrow-Hoff rule according to ir
      2. Apply gradient descent methods (Butz et al., 2005)
    [Figures: error of the over-general classifier against the theoretical value for ir = 100, one plot per alternative]
    Slide 23
  • 2. Provide Representatives in Initialization
    Can covering provide schemas of classifiers of starved niches?
    [Equation: probability of activating covering in the first minority class instance, in terms of the specificity of [P], the imbalance ratio, and the length of the classifier]
    For large values of ir, covering will not provide schemas of the minority class.
    We continue the analysis assuming a covering failure.
    Slide 24
  • 3. Ensure Growth of Representatives
    How to size the population to ensure that representatives of starved niches will be supplied?
    Assumptions:
      Crossover is not considered, only mutation (probability of mutation μ). [Equation: time to create a representative of a starved niche]
      Random deletion
      A GA is applied to [A] every time [A] is activated. [Equation: time to receive a genetic event]
    Mixing all together:
      [Equation: population size bound to ensure reproductive opportunity, in terms of the number of classes and the imbalance ratio]
    Slide 25
  • 3. Ensure Growth of Representatives
    Theory matches empirical results (parity problem).
      Imbalanced parity problem with building block length from 1 to 4
      Unbalanced by removing instances of one of the classes
    Theory also matches when the assumptions of the model are not met.
    [Figures: results with all assumptions satisfied, and results with the Widrow-Hoff rule]
    Slide 26
  • 4. Adjust GA Application Rate
    Assumption in the previous model: a GA is applied to [A] every time [A] is activated.
    What is the effect of varying the GA application rate?
    To guarantee that all niches receive approximately the same number of genetic events: [equation not preserved in the transcript]
    If satisfied, all niches receive the same number of genetic opportunities.
    Hence, the time of deletion increases linearly with ir and the population size remains constant.
    Slide 27
  • 5. Ensure Takeover of Representatives
    The previous facets set the conditions to ensure that:
      1. Representatives of starved niches are created
      2. Representatives of starved niches receive a genetic event
    But still, to ensure full convergence we need that:
      Representatives of starved niches take over their niche
      These representatives will not be extinguished
    Study the takeover time of representatives, which depends on:
      The initial stock of classifiers in the niche
      The type of selection:
        Proportionate selection (Wilson, 95)
        Tournament selection (Butz et al., 2005c)
    Slide 28
  • 5. Ensure Takeover of Representatives
    Takeover time for proportionate selection
    [Equation: takeover time, in terms of the population size, the number of niches, the ratio of the accuracy of the over-general classifier to the accuracy of the best representative, the final proportion of classifiers, and the initial proportion of classifiers]
    Condition for niche extinction
    [Figure label: maximum acceptable error predicted by the niche extinction model]
    Slide 29
  • 5. Ensure Takeover of Representatives
    Takeover time for tournament selection
    [Equation: takeover time, in terms of the population size, the tournament size, the final proportion of classifiers, and the initial proportion of classifiers]
    Condition for niche extinction
    Key differences with respect to proportionate selection:
      Independent of the fitness of the best and the over-general classifier
      Highly dependent on the tournament size
    [Figure labels: number of representatives predicted by the niche extinction model; number of classifiers in the niche]
    Slide 30
  • Patchquilt Integration
    Will XCS learn rare classes? Lessons learned from the models:
      1. Parameters need to be correctly estimated
           Widrow-Hoff rule with auto-adjusted β
           Gradient descent methods
      2. Representatives need to be created and evolved
           Covering may fail if ir is large
           The challenge can be met by:
             Sizing the population according to the imbalance ratio
             Setting θGA according to the imbalance ratio
      3. Niche extinction models set the conditions under which XCS will fail
           They indicate how parameters should be tuned to satisfy the model
           Takeover time models predict the time to convergence
    Slide 31
  • Why Is this Analysis Important?
    The lessons enable us to solve problems that previously eluded solution.
    Unbalanced 11-bit multiplexer problem:
      Before the analysis, we could solve up to ir=32.
      Now we can solve up to ir=1024 and more.
    [Figures: %[O] with XCS before and after the analysis]
    Slide 32
  • Reviewing the Critical Elements
    1. Estimate the classifier parameters correctly
         Pure averages! We get the exact value.
    2. Analyze whether representatives of starved niches can be provided in initialization
         Covering is applied if the correct set is empty.
         If there is no mutation, covering will always be applied to the first minority class instances.
         Suppose the worst case: no provision. We derive maximum bounds.
    Slide 34
  • Reviewing the Critical Elements
    3. Ensure the generation and growth of representatives of starved niches
         [Figures: results against the imbalance ratio with the default configuration and with all assumptions satisfied]
    4. Adjust the GA application rate
         XCS's model is still valid.
    5. Ensure that representatives of starved niches will take over their niches
         XCS's takeover time models are still valid.
    Slide 35
  • Patchquilt Integration
    The lessons enable us to solve problems that previously eluded solution.
    Results following the guidelines provided by the lessons:
    [Figure: %[O] with UCS]
    Slide 36
  • Motivation
    From boundedly-difficult problems to real-world problems (RWP):
      RWP contain continuous attributes, so interval-based rules are used:
        IF x1 in [l1, u1] and x2 in [l2, u2] and … and xn in [ln, un] THEN class_i
      Key difference: problem characteristics are not known.
    Gap between theory and application to RWP:
      How can we apply the recommendations extracted from the analysis?
    Aim:
      1. Start bridging the gap between theory and practice
      2. Confirm that both LCS are valuable for mining domains with rarities
    Slide 38
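The interval-based rule format shown on this slide can be made concrete with a short Python sketch. The class name `IntervalRule` and the use of closed intervals are assumptions of the example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class IntervalRule:
    intervals: List[Tuple[float, float]]  # one (lower, upper) pair per attribute
    predicted_class: int

    def matches(self, x: Sequence[float]) -> bool:
        """True if every attribute value lies inside its interval."""
        return all(lo <= v <= up for (lo, up), v in zip(self.intervals, x))


# IF x1 in [0.0, 0.5] and x2 in [0.2, 0.9] THEN class 1
rule = IntervalRule(intervals=[(0.0, 0.5), (0.2, 0.9)], predicted_class=1)
```

The bounds themselves carry no linguistic meaning, which is the sense in which a later slide calls such rules "semantic-free".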
  • What is Different in RWP
    Imbalance ratio vs. niche imbalance ratio?
      In boundedly-difficult problems, IR equaled the niche imbalance ratio.
      In RWP, this assumption may not hold.
    [Figure: same imbalance ratio, different niche imbalance ratio]
    The niche imbalance ratio (NIR) in RWP depends on:
      IR
      The geometrical distribution of the examples
      The knowledge representation
    Slide 39
  • Self-Adaptation to Unknown Domains
    Heuristic to estimate the niche imbalance ratio:
      Take the strongest over-general classifier
      Assume NIR is the imbalance ratio of the over-general classifier
      Tune parameters according to NIR and the recommendations extracted from the facetwise analysis
    Empirical test on the 11-bit multiplexer problem:
    [Figures: %[O] with XCS; %[B] with UCS]
    Slide 40
  • LCS in RWP
    Comparison methodology:
      Comparison with C4.5 (Quinlan, 95), SMO (Platt, 98), and IBk (Aha et al., 91), configured to maximize performance
      Selection of 25 imbalanced real-world problems with different characteristics
      10-fold cross validation
      Performance measure: TP rate · TN rate
      Statistical tests:
        Friedman's test (Friedman, 37, 40)
        Nemenyi test (Nemenyi, 63)
        Wilcoxon signed-ranks test (Wilcoxon, 45)
    Data sets:
      Id.      Data set           #Ins.   #At.   ir
      bald1    balance disc. 1      625      4   11.76
      bald2    balance disc. 2      625      4    1.17
      bald3    balance disc. 3      625      4    1.17
      bpa      bupa                 345      6    1.38
      glsd1    glass disc. 1        214      9   22.75
      glsd2    glass disc. 2        214      9   15.47
      glsd3    glass disc. 3        214      9   11.59
      glsd4    glass disc. 4        214      9    6.38
      glsd5    glass disc. 5        214      9    2.06
      glsd6    glass disc. 6        214      9    1.82
      h-s      heart-disease        270     13    1.25
      pim      pima-indian          768      8    1.87
      tao      tao-grid            1888      2    1.00
      thyd1    thyroid disc. 1      215      5    6.17
      thyd2    thyroid disc. 2      215      5    5.14
      thyd3    thyroid disc. 3      215      5    2.31
      wavd1    waveform disc. 1    5000     40    2.02
      wavd2    waveform disc. 2    5000     40    1.96
      wavd3    waveform disc. 3    5000     40    2.02
      wbcd     Wis. B. cancer       699      9    1.90
      wdbc     Wis. diag.           569     30    1.68
      wined1   wine disc. 1         178     13    2.71
      wined2   wine disc. 2         178     13    2.02
      wined3   wine disc. 3         178     13    1.51
      wpbc     wine disc. 4         198     33    3.21
    Slide 41
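The performance measure named above, the product of the true positive rate and the true negative rate, can be computed with a few lines of Python. The function name and the convention that label 1 is the positive (minority) class are assumptions of this sketch, and it expects both classes to be present in `y_true`.

```python
def tp_tn_product(y_true, y_pred, positive=1):
    """Return TP rate * TN rate for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tp_rate = tp / (tp + fn)  # fraction of positive examples recovered
    tn_rate = tn / (tn + fp)  # fraction of negative examples recovered
    return tp_rate * tn_rate
```

Unlike plain accuracy, this measure gives a score of zero to a model that always predicts the majority class, which is why it suits imbalanced problems.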
  • Summary of the Results
    [Table: TP rate · TN rate per method and data set, not preserved in the transcript]
    XCS and UCS perform the best on average for the tested problems.
      However, there are no significant differences according to Friedman's test.
      Pairwise analysis enables the extraction of further observations.
    XCS and UCS fail to create accurate models in problems such as bald2, bald3, and tao, which have a low imbalance ratio.
      They have difficulty learning from domains with curved boundaries.
      There are other complexities in addition to class imbalance.
    Slide 42
  • Discussion
    When an ML practitioner has a new problem, which learner should she or he apply?
    The empirical analysis indicated that:
      She or he should bet on LCSs.
      But there are no guarantees of being the best performer on a particular problem.
    What is missing?
      Evaluate problem complexity
      Link problem complexity with the domain of competence of LCS
    How?
      Complexity metrics are a good starting point (Ho & Basu, 02) to bridge the gap between theory and practice.
    Slide 43
  • Motivation
    Competent data classification techniques should be able to evolve accurate models in some legible structure.
    LCS are very appealing since they evolve highly accurate models online.
    However, they:
      Tend to evolve a large number of semantic-free interval-based rules
      Use reasoning mechanisms that can be unintuitive (Bernadó et al., 02)
    Slide 45
  • Design of Fuzzy-UCS
    Linguistic fuzzy representation:
      Disjunction of linguistic fuzzy terms
      Rule: IF x1 is A1 and x2 is A2 … and xn is An THEN class1
      Example: IF x1 is small and x2 is medium or large THEN class1
    In our experiments, all variables shared the same semantics, which were defined by triangular membership functions.
    [Figure: triangular membership functions for the terms small, medium, and large]
    Classifier parameters were changed to let them deal with fuzzy matching.
    Slide 46
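A minimal Python sketch of how such a rule could be matched against an example follows. It assumes attributes normalized to [0, 1], three evenly spaced triangular terms, a bounded sum to combine the terms of one variable, and a product to combine variables. These operator choices, the term positions, and all names are assumptions of the sketch rather than details given on the slide.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangle with feet at a and c and peak at b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


# Assumed semantics shared by all variables on the range [0, 1].
TERMS = {
    "small": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "large": (0.5, 1.0, 1.5),
}


def matching_degree(rule, x):
    """Degree to which example x matches a rule.

    `rule` maps a variable index to the list of linguistic terms it accepts;
    variables that are absent from the rule are ignored.
    """
    degree = 1.0
    for index, terms in rule.items():
        memberships = sum(triangular(x[index], *TERMS[t]) for t in terms)
        degree *= min(1.0, memberships)  # bounded sum inside a variable, product across variables
    return degree


# IF x1 is small and x2 is medium or large THEN class1
example_rule = {0: ["small"], 1: ["medium", "large"]}
```

Because matching is a degree rather than a yes/no test, every parameter update in the classifier has to be weighted by that degree, which is the change the slide's last line refers to.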
  • Design of Fuzzy-UCS
    Three procedures were designed to infer the class of test examples, which result in a tradeoff between interpretability and accuracy:
      Weighted average (wavg): based on average voting. All rules considered.
      Action winner (awin): the best rule decides the class. Only best matching rules considered.
      Most numerous and fittest rules (nfit): based on average voting. Only most numerous rules considered.
    [Figure: the three procedures ordered by the size of the rule set, from larger (+) to smaller (-)]
    Slide 47
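Two of the three procedures can be sketched in a few lines of Python. The sketch assumes that each rule votes for its class with a weight equal to its fitness times its matching degree; the tuple layout and function names are hypothetical, and the rule-set reduction that nfit performs before voting is not shown.

```python
from collections import defaultdict


def infer_wavg(matched_rules):
    """Weighted average: every rule adds fitness * matching degree to its class."""
    votes = defaultdict(float)
    for degree, fitness, predicted_class in matched_rules:
        votes[predicted_class] += fitness * degree
    return max(votes, key=votes.get)


def infer_awin(matched_rules):
    """Action winner: the single rule with the largest fitness * matching degree decides."""
    degree, fitness, predicted_class = max(matched_rules, key=lambda r: r[0] * r[1])
    return predicted_class


# (matching degree, fitness, class) for the rules that match one test example
rules = [(0.9, 0.9, "A"), (0.5, 0.6, "B"), (0.6, 0.6, "B"), (0.4, 0.5, "B")]
```

On this example the two procedures disagree: the strongest single rule predicts "A", while the combined vote of the weaker rules favours "B".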
  • Methodology of Analysis
    Comparison methodology:
      Two comparisons: with fuzzy learners and with non-fuzzy learners
      Selection of 20 real-world problems
      10-fold cross validation
      Metrics: test accuracy and number of rules of the models
      Statistical tests:
        Friedman's test (Friedman, 37, 40)
        Nemenyi test (Nemenyi, 63)
        Bonferroni-Dunn test (Dunn, 61)
        Wilcoxon signed-ranks test (Wilcoxon, 45)
    Data sets:
      Id     Data set              #Ins   #At   #Cl.   %Min   %Maj   %MI
      ann    Annealing              898    38      5    0.9   76.2    0.0
      aut    Automobile             205    25      6    1.5   32.7   22.4
      bal    Balance                625     4      3    7.8   46.1    0.0
      bpa    Bupa                   345     6      2   42.0   58.0    0.0
      cmc    Contrac. choice       1473     9      3   22.6   42.7    0.0
      col    Horse colic            368    22      2   37.0   63.0   98.1
      gls    Glass                  214     9      6    4.2   35.5    0.0
      h-c    Heart-c                303    13      2   45.5   54.5    2.3
      h-s    Heart-s                270    13      2   44.4   56.6    0.0
      irs    Iris                   150     4      3   33.3   33.3    0.0
      pim    Pima                   768     8      2   34.9   65.1    0.0
      son    Sonar                  208    60      2   46.7   53.3    0.0
      tao    Tao                   1888     2      2   50.0   50.0    0.0
      thy    Thyroid                215     5      3   14.0   60.0    0.0
      veh    Vehicle                846    18      4   23.5   25.8    0.0
      wbcd   Wisc. breast-cancer    699     9      2   34.5   65.5    2.3
      wdbc   Wisc. Diagnosis        569    30      2   37.3   62.7    0.0
      wne    Wine                   178    13      3   27.0   39.9    0.0
      wpbc   Wisc. Prognostic       198    33      2   23.7   76.3    2.0
      zoo    Zoo                    101    17      7    4.0   40.6    0.0
    Slide 48
  • Comparison with the Fuzzy Learners
    Learners compared:
      1. Fuzzy GP (GP) (Sánchez et al., 01)
      2. Fuzzy GAP (GAP) (Sánchez & Couso, 00)
      3. Fuzzy SAP (SAP) (Sánchez et al., 01)
      4. Fuzzy AdaBoost (AB) (del Jesus et al., 04)
      5. Fuzzy LogitBoost (LB) (Otero & Sánchez, 06)
      6. Fuzzy MaxLogitBoost (MLB) (Otero & Sánchez, 07)
    All methods were run using KEEL (Alcalá-Fdez et al., 08).
    [Figure: accuracy against interpretability (- to +). Labels: Fuzzy-UCS nfit (more than 10 rules); Fuzzy-UCS wavg (1000's of rules); Fuzzy-UCS awin (fewer than 100 rules); Fuzzy AdaBoost; Fuzzy GAP, Fuzzy SAP; Fuzzy LogitBoost; Fuzzy GP, Fuzzy MLB]
    Slide 49
  • Comparison with Non-Fuzzy Learners
    Learners compared:
      1. C4.5 (Quinlan, 95)
      2. IBk (Aha et al., 91)
      3. Naïve Bayes (NB) (John & Langley, 95)
      4. Part (Frank & Witten, 98)
      5. SMO (Platt, 98)
      6. GAssist (Bacardit, 04)
      7. UCS (Bernadó & Garrell, 03)
    [Figure: accuracy against interpretability (- to +). Labels: Fuzzy-UCS awin; Fuzzy-UCS wavg; Fuzzy-UCS nfit; UCS; SMO; C4.5; GAssist; IBk; Part; Naïve Bayes]
    Slide 50
  • Mining Large Volumes of Data
    The last experiment: Fuzzy-UCS was used to extract models from the 1999 KDD Cup intrusion detection data set.
      494,022 examples with 41 features
    Slide 51
  • Conclusions and Further Work
    This work contributed to:
      Increasing the comprehension of how LCS work
      Improving them to deal with problems that contain rare classes
      Providing new implementations of LCS
    Two challenges and four objectives were addressed in the context of LCS.
    1. Revise and update UCS and compare it to XCS
         New fitness sharing scheme designed
         Fitness sharing provides benefits to UCS
         Key differences between UCS and XCS empirically studied
         Further work: complement the analysis with theory
    Slide 53
  • Conclusions and Further Work
    2 & 3. Study LCS in domains with rare classes
         Start with a systematic analysis validated with boundedly-difficult problems
         Finish with its application to real-world problems with rare classes
    Further work:
      Design measures to characterize real-world classification problems
      Measure the difficulty of the problems
      Link problem difficulty with the domain of competence
      Include problem difficulty in the study of re-sampling techniques, etc.
      First steps taken in (Bernadó et al., 06; Orriols et al., 08a)
    [Figure labels: problem; complex systems; facetwise analysis; lots of small models; interacting components; LCSs can learn from imbalanced domains; application of LCSs to a new real-world problem; heuristic to estimate the niche imbalance ratio; problem characterization; complexity metrics; domain of competence of LCSs; resampling techniques; future research line]
    Slide 54
  • Conclusions and Further Work
    4. Design and implement an LCS with fuzzy logic reasoning for supervised learning
         Analysis to mix:
           The accurate online evaluation system of LCSs
           The human-like representation and reasoning mechanisms of fuzzy logic
           The robust discovery capabilities of GAs
         None of the three ideas was novel in itself, but their combination to create a supervised learning technique was.
         Fuzzy-UCS:
           Evolved highly accurate models of moderate size
           Was able to extract classification models from large volumes of data
           Is prepared to deal with domains with uncertainty and vagueness
    Further work: adapt LCSs to extract association rules online
      Many real-world applications generate data streams.
      LCS are appealing since they mine data streams.
      However, in most cases, the data are unlabeled.
      Aim: design an LCS that is able to extract association rules online.
      First steps taken in (Orriols et al., 2008f)
    Slide 55
  • Lessons Learned on the Way
    1. The importance of design decomposition
         We need to improve LCS for mining rarities:
           1. Mix existing, powerful techniques that solve problems that you intuitively identify
                The thesis started in this way (Orriols-Puig, 05a, 05b)
                Lesson: despite moderate success, poor understanding
           2. Build complete models of your system
           3. Design decomposition and facetwise analysis (Goldberg, 02)
                Key for success
                Not only for GAs or LCSs
    2. The relevance of ideas crossbreeding
         New complex real-world problems require the best practices of different fields.
         LCSs are friendly frameworks for crossbreeding ideas.
    Slide 56
  • Publications
    This work has resulted in 35 publications:
      7 journal papers (4 accepted/published and 3 currently submitted)
      5 papers in LNCS/LNAI volumes
      6 book chapters
      15 international conference papers
      2 national conference papers
    Selected publications:
      Albert Orriols-Puig, Ester Bernadó-Mansilla, David E. Goldberg, Kumara Sastry, and Pier Luca Lanzi. Facetwise Analysis of XCS for Problems with Class Imbalances. IEEE Transactions on Evolutionary Computation, 2008, submitted.
      Albert Orriols-Puig, Jorge Casillas, and Ester Bernadó-Mansilla. Fuzzy-UCS: A Michigan-style Fuzzy-Learning Classifier System for Supervised Learning. IEEE Transactions on Evolutionary Computation, 2008, doi=10.1109/TEVC.2008.925144
      Albert Orriols-Puig and Ester Bernadó-Mansilla. Evolutionary Rule-Based Systems for Imbalanced Datasets. Soft Computing Journal, Special Issue on Evolutionary and Metaheuristic-based Data Mining, 2008, doi=10.1007/s00500-008-0319-7
      Albert Orriols-Puig and Ester Bernadó-Mansilla. Revisiting UCS: Description, Fitness Sharing, and Comparison with XCS. In Advances at the frontier of LCS, LNCS series, volume 4998, pages 96–116, Springer, 2008
      Albert Orriols-Puig, David E. Goldberg, Kumara Sastry, and Ester Bernadó-Mansilla. Modeling XCS in Class Imbalances: Population Size and Parameter Settings. In GECCO'07, pages 1838-1845, ACM Press, 2007
      Albert Orriols-Puig, Kumara Sastry, Pier Luca Lanzi, David E. Goldberg, and Ester Bernadó-Mansilla. Modeling Selection Pressure in XCS for Proportionate and Tournament Selection. In GECCO'07, pages 1846-1853, ACM Press, 2007
      Albert Orriols-Puig and Ester Bernadó-Mansilla. Bounding XCS's Parameters for Unbalanced Datasets. Best paper nomination. In GECCO'06, pages 1561-1568, ACM Press, 2006
    Slide 57
  • Acknowledgments
    Enginyeria i Arquitectura La Salle
      Prof. Ester Bernadó-Mansilla
    My first "second home": the IlliGAL
      Prof. David E. Goldberg, for accepting my visits and for all his valuable lessons
      All labbies, and especially Kumara Sastry, Xavier Llorà, and Tian Li Yu
    My second "second home": the SCI2S group
      Prof. Francisco Herrera, for accepting my visits and for his time and advice
      All labbies, and especially Jorge Casillas
    My examining committee
      Prof. David E. Goldberg, Prof. Francisco Herrera, Prof. Martin V. Butz, Prof. Xavier Llorà, and Prof. Xavier Vilasís
    All the people I have worked with
      Ester Bernadó-Mansilla, Jorge Casillas, David E. Goldberg, Pier Luca Lanzi, Francisco J. Martínez-López, Sergio Morales-Ortigosa, Núria Macià, Joaquim Rios-Boutin, Kumara Sastry, Francesc Teixidó-Navarro
    The research was supported by
      Departament d'universitats, recerca i societat de la informació (DURSI)
        Under a FI scholarship with reference 2005FI-00252
        Under two BE travel grants with references 2006BE-00299 and 2007BE2-00124
      Generalitat de Catalunya, under grants 2002SGR-00155 and 2005SGR-00302
      Ministerio de educación y ciencia, under projects KEEL and KEEL2 with references TIC2002-04036-C05-03 and TIN2005-08386-C05-04
    Slide 58