Final Presentation INF/SCR-06-54
Applied Computing Science, ICS

A Comparison of Evaluation Methods in Coevolution

Ting-Shuo Yo

Supervisors: Edwin D. de Jong, Arno P.J.M. Siebes

Outline

●   Introduction
●   Evaluation methods in coevolution
●   Performance measures
●   Test problems
●   Results and discussion
●   Concluding remarks
Introduction

●   Evolutionary computation
●   Coevolution
●   Coevolution for test-based problems
●   Motivation of this study
Genetic Algorithm
Initialization → Population
While (not TERMINATE):
    1. EVALUATION of the population
    2. SELECTION → Parents
    3. REPRODUCTION (crossover, mutation, ...) → Offspring
    4. REPLACEMENT → Population
TERMINATE → End
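
The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the OneMax problem (maximise the number of 1 bits), binary tournament selection, one-point crossover, bit-flip mutation, elitism, and all parameter values are assumptions chosen to keep the example short.

```python
import random

def run_ga(n_bits=20, pop_size=30, generations=40, seed=0):
    """Toy GA on OneMax (illustrative problem, not one of the study's test problems)."""
    rng = random.Random(seed)
    fitness = sum  # 1. EVALUATION: count the 1 bits

    # Initialization: random bit strings
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_history = []

    for _ in range(generations):  # While (not TERMINATE)
        scores = [fitness(ind) for ind in population]
        best_history.append(max(scores))

        # 2. SELECTION: binary tournament (assumed selection scheme)
        def pick():
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[a] if scores[a] >= scores[b] else population[b]

        # 3. REPRODUCTION: one-point crossover plus bit-flip mutation
        offspring = []
        while len(offspring) < pop_size - 1:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < 1.0 / n_bits else g for g in child]
            offspring.append(child)

        # 4. REPLACEMENT: offspring replace the population, keeping the best parent
        elite = population[scores.index(max(scores))]
        population = offspring + [elite]

    best = max(population, key=fitness)
    return best, best_history

best, history = run_ga()
print(sum(best), history[0], history[-1])
```

Because the best parent is always carried over, the best fitness per generation never decreases in this sketch.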
Coevolution
Initialization → Subpopulations
While (not TERMINATE):
    1. EVALUATION (across subpopulations)
    For each subpopulation:
        2. SELECTION
        3. REPRODUCTION
        4. REPLACEMENT
TERMINATE → End
Test-Based Problems
[Figure: f(x) plotted against x, showing the original function and regression curves labelled s1, s2, s3; points t1 to t10 are marked along the x axis.]
Coevolution for Test-Based Problems

●   Test population and solution population: each goes through
    2. SELECTION, 3. REPRODUCTION, 4. REPLACEMENT
●   1. EVALUATION is based on interaction:
    –   Does the solution solve the test?
    –   How well does the solution perform on the test?
●   Solutions: the more tests a solution solves, the better.
●   Tests: the fewer solutions pass a test, the better.
Motivation

●   Coevolution provides a way to select tests
    adaptively → stability and efficiency
●   Solution concept → stability
●   Efficiency depends on selection and
    evaluation.
●   Compared to evaluation based on all relevant
    information, how do different coevolutionary
    evaluation methods perform?
Concepts for Coevolutionary Evaluation Methods

●   Interaction
●   Distinction and informativeness
●   Dominance and multi-objective approach
Interaction
●   A function that returns the outcome of interaction
    between two individuals from different
    subpopulations.
    –   Checkers players: which one wins
    –   Test / Solution: if the solution succeeds in solving the
        test
●   Interaction matrix

             S1   S2   S3   S4   S5   sum
        T1    0    1    0    0    1    2
        T2    0    0    1    1    0    2
        T3    0    1    1    0    0    2
        T4    1    0    0    0    0    1
        T5    1    0    1    0    0    2
        sum   2    2    3    1    1
Distinction
Interaction matrix (rows: test cases, columns: solutions)

             S1   S2   S3   S4   S5   sum
        T1    0    1    0    0    1    2
        T2    0    0    1    1    0    2
        T3    0    1    1    0    0    2
        T4    1    0    0    0    0    1
        T5    1    0    1    0    0    2
        sum   2    2    3    1    1

Distinctions made by T3 (row Si, column Sj: 1 if Si passes T3 and Sj does not)

             S1   S2   S3   S4   S5
        S1    -    0    0    0    0
        S2    1    -    0    1    1
        S3    1    0    -    1    1
        S4    0    0    0    -    0
        S5    0    0    0    0    -
        sum   2    0    0    2    2    (total 6)

●   Ability to keep diversity on the other subpopulation.
●   Informativeness
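
As a small check of the T3 table, the distinctions a test makes can be enumerated directly from its row of the interaction matrix. The function name `distinctions` and the 0-based indexing are illustrative choices, not taken from the slides.

```python
# Interaction matrix from the slide: rows are tests T1..T5, columns are solutions S1..S5.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]

def distinctions(test_row):
    """Ordered pairs (i, j) where solution i passes the test and solution j fails it."""
    n = len(test_row)
    return {(i, j) for i in range(n) for j in range(n)
            if test_row[i] == 1 and test_row[j] == 0}

t3 = distinctions(G[2])
print(len(t3))  # 6, the total shown in the T3 table
```

Pairs are 0-based here, so `(1, 0)` corresponds to the entry "S2 > S1" in the table.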
Dominance and MO approach
[Figure: objective space with axes f1 and f2, marking non-dominated and dominated individuals.]

S1 is dominated by S2 iff:

●   Keep the best for each objective.
●   MO: number of individuals that dominate it
Evaluation Methods

●   AS: Averaged Score
●   WS: Weighted Score
●   AI: Averaged Informativeness
●   WI: Weighted Informativeness
●   MO
AS and WS
●   AS : (# positive interactions) / (# all interactions)

    (rows: test cases, columns: solutions)

             S1    S2    S3    S4    S5    sum   AS
        T1    0     1     0     0     1     2    0.4
        T2    0     0     1     1     0     2    0.4
        T3    0     1     1     0     0     2    0.4
        T4    1     0     0     0     0     1    0.2
        T5    1     0     1     0     0     2    0.4
        sum   2     2     3     1     1
        AS   0.4   0.4   0.6   0.2   0.2

●   WS : each interaction is weighted differently.
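
The AS values in the table can be reproduced with a short sketch. Only AS is shown, since the slide does not state the weights used by WS; the function name `averaged_scores` is an illustrative choice.

```python
# Interaction matrix from the slide: rows are tests T1..T5, columns are solutions S1..S5.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]

def averaged_scores(matrix):
    """AS = (# positive interactions) / (# all interactions), per test and per solution."""
    n_tests, n_solutions = len(matrix), len(matrix[0])
    test_as = [sum(row) / n_solutions for row in matrix]
    solution_as = [sum(row[j] for row in matrix) / n_tests for j in range(n_solutions)]
    return test_as, solution_as

test_as, solution_as = averaged_scores(G)
print(test_as)      # [0.4, 0.4, 0.4, 0.2, 0.4]
print(solution_as)  # [0.4, 0.4, 0.6, 0.2, 0.2]
```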
AI and WI
●   AI : # of distinctions it makes
●   WI : each distinction is weighted differently.
           S1>S2 S1>S3 S1>S4 S1>S5 .............
      T1     1     1     0     1        ....                                5
      T2     0     0     0     1        ....                                2
      T3     1     1     0     0        ....                                6
      T4     0     1     0     1        ....                                2
      T5     0     0     0     0        ....                                1

     In the algorithm, a weighted sum of AS and informativeness is actually used:
       0.3 × informativeness + 0.7 × AS
MO

●   Objectives : each individual in
    the other subpopulation.
●   MO: number of individuals that
    dominate it.
●   Non-dominated individuals
    have the highest fitness value.

[Figure: objective space with axes f1 and f2, marking non-dominated and dominated individuals.]
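
The domination count can be illustrated on the interaction matrix from the earlier slides, treating each test as one objective for the solutions. The dominance test below is the usual Pareto definition (at least as good on every objective and strictly better on at least one); that it matches the definition used in the study is an assumption, since the slide's formula is not preserved.

```python
# Interaction matrix from the earlier slides: rows are tests, columns are solutions.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]

def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def domination_counts(vectors):
    """For each individual, the number of other individuals that dominate it."""
    return [sum(dominates(other, v) for other in vectors) for v in vectors]

# One objective vector per solution: its outcome against each test (a column of G).
solutions = [list(column) for column in zip(*G)]
print(domination_counts(solutions))  # [0, 0, 0, 1, 1]
```

S1, S2 and S3 are non-dominated (count 0); S4 is dominated by S3 and S5 by S2. A lower count therefore corresponds to a higher fitness.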
Performance Measures
●   Objective Fitness (OF)
    –   Evaluation against a fixed set of test cases
    –   Here we use "all possible test cases" since we have
        picked problems with small sizes.

●   Objective Fitness Correlation (OFC)
    –   Correlation between OFs and fitness values in the
        coevolution (subjective fitness, SF).
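
A sketch of how OFC could be computed for one population. The slide only says "correlation", so the use of the Pearson coefficient is an assumption, and the fitness values below are made up purely for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # ASSUMPTION: treat a constant fitness vector as uncorrelated
    return cov / (sd_x * sd_y)

# Hypothetical values for five individuals (not results from the study).
objective_fitness = [0.2, 0.4, 0.5, 0.7, 0.9]   # OF: against all test cases
subjective_fitness = [0.1, 0.3, 0.6, 0.6, 0.8]  # SF: assigned during coevolution
ofc = pearson(objective_fitness, subjective_fitness)
print(round(ofc, 3))
```

An OFC close to 1 means the coevolutionary evaluation ranks individuals much as the exhaustive evaluation would.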
Experimental Setup
●   Controlled experiments: GAAS
     –   GA with AS from exhaustive evaluation.




●   Compare the OF based on the same number of
    interactions.
Test Problems
●   Majority Function Problem (MFP)
    –   1D cellular automata problem
    –   Two parameters: radius (r) and problem size (n)

A sample IC with n = 9:    0 1 0 1 0 0 1 1 1
                           (figure labels: neighbor bits, target bit)

A sample rule with r = 1:
        Input    000   001   010   011   100   101   110   111
        Output    0     0     0     1     0     1     1     1

The Output row is the boolean-vector representation of this rule.
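
To make the representation concrete, the sketch below applies the sample rule to the sample IC. The periodic (wrap-around) boundary, the number of update steps, and the function names are assumptions; the slide does not specify them.

```python
def ca_step(cells, rule, r):
    """One synchronous update of a 1D binary cellular automaton with radius r."""
    n = len(cells)
    nxt = []
    for i in range(n):
        idx = 0
        for offset in range(-r, r + 1):
            # ASSUMPTION: periodic boundary, so indices wrap around
            idx = (idx << 1) | cells[(i + offset) % n]
        nxt.append(rule[idx])  # rule is the boolean vector, indexed by the neighborhood
    return nxt

def solves_majority(cells, rule, r, max_steps):
    """Interaction outcome: does the rule drive the IC to all-majority bits? (n odd)"""
    target = 1 if sum(cells) * 2 > len(cells) else 0
    state = list(cells)
    for _ in range(max_steps):
        state = ca_step(state, rule, r)
    return all(c == target for c in state)

ic = [0, 1, 0, 1, 0, 0, 1, 1, 1]   # sample IC, n = 9
rule = [0, 0, 0, 1, 0, 1, 1, 1]    # sample rule, r = 1 (outputs for 000 .. 111)
print(ca_step(ic, rule, r=1))      # [1, 0, 1, 0, 0, 0, 1, 1, 1]
print(solves_majority(ic, rule, r=1, max_steps=20))  # False
```

The sample rule is the local majority vote; on this IC it settles into a mixed fixed point rather than all 1s, so it does not pass this test.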
Test Problems
●   Symbolic Regression Problem (SRP)
    –   Curve fitting with Genetic Programming trees
    –   Two measures: sum of error and hit

GP tree (prefix form): (+ (* (- x x) x) (+ x x)), which evaluates to 2x
[Figure: f(x) against x, showing the original function, the regression curve, and a hit.]
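
The two measures can be sketched for the example tree, which reads (x - x) * x + (x + x) = 2x. The nested-tuple encoding, the sample points, the hit tolerance of 0.01, and the target 2x (chosen so that the tree fits perfectly) are all assumptions for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, x):
    """Evaluate a GP tree given as nested tuples (op, left, right) with leaves 'x' or numbers."""
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def score(tree, target, xs, tol=0.01):
    """Return (sum of absolute errors, number of hits); tol is an assumed hit threshold."""
    errors = [abs(evaluate(tree, x) - target(x)) for x in xs]
    return sum(errors), sum(e <= tol for e in errors)

tree = ("+", ("*", ("-", "x", "x"), "x"), ("+", "x", "x"))
xs = [i / 10 for i in range(-10, 11)]          # 21 sample points (the test cases)
total_error, hits = score(tree, lambda x: 2 * x, xs)
print(total_error, hits)  # 0.0 21
```

In the test-based view, each sample point is a test and a hit is a positive interaction between the tree and that point.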
Test Problems
●   Parity Problem (PP)
    –   Determine odd/even for the number of 1's in a bit
        string
    –   Two parameters: odd/even and bit string length (n)
         A problem with n = 10
                                 0 1 0 1 0 0 1 1 1 1

         A solution tree
Test Problems: PP
5-even Parity
Boolean-vector (D0 D1 D2 D3 D4):   0 0 0 1 0   → false (0)

GP Tree (value of each node for this input in brackets):
AND [0]
    OR [1]
        NOT AND [1]:  D0 [0], D3 [1]
        D2 [0]
    AND [0]
        NOT OR [1]:   D0 [0], D1 [0]
        AND [0]:      D1 [0], D2 [0]
Output: false
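
The example can be checked with a small evaluator. The tree below is the slide's tree as far as it can be read from the layout (with "NOT AND" and "NOT OR" taken as single NAND and NOR nodes); the tuple encoding and function names are assumptions. Because n is small, the objective fitness can be computed against all 2^n bit strings.

```python
from itertools import product

def even_parity(bits):
    """Target function: True when the number of 1 bits is even."""
    return sum(bits) % 2 == 0

def evaluate(node, bits):
    """Evaluate a Boolean GP tree; an int leaf k stands for input Dk."""
    if isinstance(node, int):
        return bool(bits[node])
    op, *args = node
    vals = [evaluate(a, bits) for a in args]
    if op == "AND":
        return all(vals)
    if op == "OR":
        return any(vals)
    if op == "NAND":
        return not all(vals)
    if op == "NOR":
        return not any(vals)
    raise ValueError(f"unknown operator: {op}")

tree = ("AND",
        ("OR", ("NAND", 0, 3), 2),
        ("AND", ("NOR", 0, 1), ("AND", 1, 2)))

bits = [0, 0, 0, 1, 0]
print(evaluate(tree, bits), even_parity(bits))  # False False

def objective_fitness(tree, n):
    """Fraction of all 2^n bit strings on which the tree agrees with even parity."""
    cases = list(product([0, 1], repeat=n))
    return sum(evaluate(tree, c) == even_parity(c) for c in cases) / len(cases)

print(objective_fitness(tree, 5))  # 0.5
```

This particular tree always returns false (its right branch needs D1 to be both 0 and 1), so it agrees with even parity on exactly half of the 32 inputs.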
Results of MFP (r=2, n=9)
[Two figure slides; plots not preserved.]
Results of SRP (x^6 − 2x^4 + x^2)
[Two figure slides; plots not preserved.]
Results of PP (odd, n=10)
[Two figure slides; plots not preserved.]
Summary of Results
Multi-objective Approach

               ●   One run for COMO in
                   MFP.
               ●   OF drops when NDR
                   rises.
               ●   Why high NDR?
                   –   Duplicate solutions
                   –   Too many objectives
MO approach to improve WI: MFP
[Figure; labelled curve: MO-WS-WI]
MO approach to improve WI: SRP
[Figure; labelled curves: MO-AS-WI, MO-WS-WI, WeiSum-AS-WI, MO-AS-AI, MO-WS-AI]
MO approach to improve WI: PP
[Figure; labelled curve: MO-AS-WI]
Conclusions
●   MO2 approach with weighted informativeness
    (MO-AS-WI and MO-WS-WI) outperforms other
    evaluation methods in coevolution.
●   MO1 approach does not work well because
    there are usually too many objectives. This can
    be represented by a high NDR and results in a
    random search.
●   Coevolution is efficient for the MFP and SRP.
Issues
●   The test problems used are small, and there is no
    proof of generalizability to larger problems.


●   Implication for statistical learning: select not only
    difficult but also informative data for training.
Questions?
Thank you!
Average Score
(rows: test cases, columns: solutions)

             S1    S2    S3    S4    S5    sum   AS
        T1    0     1     0     0     1     2    0.4
        T2    0     0     1     1     0     2    0.4
        T3    0     1     1     0     0     2    0.4
        T4    1     0     0     0     0     1    0.2
        T5    1     0     1     0     0     2    0.4
        sum   2     2     3     1     1
        AS   0.4   0.4   0.6   0.2   0.2

Max(O(m), O(n))
Weighted Score
(rows: test cases, columns: solutions)

             S1    S2    S3    S4    S5    sum
        T1    0     1     0     0     1     2
        T2    0     0     1     1     0     2
        T3    0     1     1     0     0     2
        T4    1     0     0     0     0     1
        T5    1     0     1     0     0     2
        sum   2     2     3     1     1

Max(O(m), O(n))
Average Informativeness
Max(O(mn^2), O(nm^2))
Weighted Informativeness
Max(O(mn^2), O(nm^2))
MO
Max(O(mn^2), O(nm^2))

20240609 QFM020 Irresponsible AI Reading List May 2024
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 

A Comparison of Evaluation Methods in Coevolution 20070921

  • 1. Final Presentation INF/SCR-06-54 Applied Computing Science, ICS A Comparison of Evaluation Methods in Coevolution Ting-Shuo Yo Supervisor: Edwin D. de Jong Arno P.J.M. Siebes
  • 2. Outline ● Introduction ● Evaluation methods in coevolution ● Performance measures ● Test problems ● Results and discussion ● Concluding remarks
  • 3. Introduction ● Evolutionary computation ● Coevolution ● Coevolution for test-based problems ● Motivation of this study
  • 4. Genetic Algorithm ● Initialization ● While (not TERMINATE): 1. EVALUATION of the population; 2. SELECTION of parents; 3. REPRODUCTION (crossover, mutation, ...) producing offspring; 4. REPLACEMENT of the population by the offspring ● End on TERMINATE
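The loop on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the bit-string representation, the `one_max` toy fitness, binary tournament selection, one-point crossover, bit-flip mutation, and elitist generational replacement are all hypothetical choices.

```python
import random


def one_max(bits):
    # Toy fitness, assumed for illustration only: the number of 1 bits.
    return sum(bits)


def tournament(scored, rng):
    # Binary tournament: draw two (fitness, individual) pairs, keep the fitter one.
    a, b = rng.sample(scored, 2)
    return a[1] if a[0] >= b[0] else b[1]


def run_ga(n_bits=20, pop_size=30, generations=50, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    # Initialization
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):  # While (not TERMINATE)
        # 1. EVALUATION
        scored = [(one_max(ind), ind) for ind in pop]
        offspring = []
        while len(offspring) < pop_size:
            # 2. SELECTION
            p1, p2 = tournament(scored, rng), tournament(scored, rng)
            # 3. REPRODUCTION: one-point crossover, then bit-flip mutation
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]
            offspring.append(child)
        # 4. REPLACEMENT: generational, but the best parent is kept (elitism)
        offspring[0] = max(scored, key=lambda s: s[0])[1]
        pop = offspring
    return max(pop, key=one_max)


best = run_ga()
print(one_max(best), best)
```

Because of the elitist replacement, the best fitness in the population cannot decrease from one generation to the next in this sketch.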
  • 5. Coevolution ● Initialization ● While (not TERMINATE): 1. EVALUATION across the subpopulations; then, within each subpopulation, 2. SELECTION, 3. REPRODUCTION, 4. REPLACEMENT ● End on TERMINATE
  • 6. Test-Based Problems [Figure: f(x) plotted against x, showing the original function and regression curves s1, s2, s3, with test points t1 to t10 along the x axis]
  • 7. Coevolution for Test-Based Problems ● Test population and solution population: 1. EVALUATION through their interactions; each population then goes through 2. SELECTION, 3. REPRODUCTION, 4. REPLACEMENT ● Interaction: – Does the solution solve the test? – How well does the solution perform on the test? ● Solutions: the more tests a solution solves, the better. ● Tests: the fewer solutions pass a test, the better.
  • 8. Motivation ● Coevolution provides a way to select tests adaptively → stability and efficiency ● Solution concept → stability ● Efficiency depends on selection and evaluation. ● Compared to evaluation based on all relevant information, how do different coevolutionary evaluation methods perform?
  • 9. Concepts for Coevolutionary Evaluation Methods ● Interaction ● Distinction and informativeness ● Dominance and multi-objective approach
  • 10. Interaction ● A function that returns the outcome of the interaction between two individuals from different subpopulations. – Checkers players: which one wins – Test / Solution: whether the solution succeeds in solving the test ● Interaction matrix (rows: tests, columns: solutions):
            S1  S2  S3  S4  S5 sum
      T1     0   1   0   0   1   2
      T2     0   0   1   1   0   2
      T3     0   1   1   0   0   2
      T4     1   0   0   0   0   1
      T5     1   0   1   0   0   2
      sum    2   2   3   1   1
  • 11. Distinction ● Interaction matrix (rows: test cases, columns: solutions):
            S1  S2  S3  S4  S5 sum
      T1     0   1   0   0   1   2
      T2     0   0   1   1   0   2
      T3     0   1   1   0   0   2
      T4     1   0   0   0   0   1
      T5     1   0   1   0   0   2
      sum    2   2   3   1   1
    ● Distinctions made by T3 (an entry is 1 when the row solution solves T3 and the column solution does not):
            S1  S2  S3  S4  S5
      S1     -   0   0   0   0
      S2     1   -   0   1   1
      S3     1   0   -   1   1
      S4     0   0   0   -   0
      S5     0   0   0   0   -
      sum    2   0   0   2   2   (total: 6)
    ● Ability to keep diversity on the other subpopulation. ● Informativeness
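The distinction count for T3 can be reproduced directly from the interaction matrix. The sketch below is a minimal illustration; the names `G` and `distinctions` are hypothetical, and a distinction is taken to be an ordered pair (i, j) where the test is solved by solution i but not by solution j, which is how the T3 table on this slide reads.

```python
# Interaction matrix from the slide: rows are tests T1..T5, columns are solutions S1..S5.
G = [
    [0, 1, 0, 0, 1],  # T1
    [0, 0, 1, 1, 0],  # T2
    [0, 1, 1, 0, 0],  # T3
    [1, 0, 0, 0, 0],  # T4
    [1, 0, 1, 0, 0],  # T5
]


def distinctions(row):
    """Ordered pairs (i, j): the test is solved by solution i but not by solution j."""
    n = len(row)
    return {(i, j) for i in range(n) for j in range(n) if row[i] == 1 and row[j] == 0}


t3 = distinctions(G[2])
print(sorted(t3))  # pairs are 0-based, so (1, 0) means S2 > S1
print(len(t3))     # 6, matching the total on the slide
```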
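  • 12. Dominance and MO approach ● S1 is dominated by S2 iff S2 is at least as good as S1 on every objective and strictly better on at least one. ● Keep the best for each objective. ● MO: number of individuals that dominate it. [Figure: objective space with axes f1 and f2, marking dominated and non-dominated individuals]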
  • 13. Evaluation Methods ● AS: Averaged Score ● WS: Weighted Score ● AI: Averaged Informativeness ● WI: Weighted Informativeness ● MO
  • 14. AS and WS ● AS: (# positive interactions) / (# all interactions). Rows are test cases, columns are solutions:
            S1  S2  S3  S4  S5 sum   AS
      T1     0   1   0   0   1   2  0.4
      T2     0   0   1   1   0   2  0.4
      T3     0   1   1   0   0   2  0.4
      T4     1   0   0   0   0   1  0.2
      T5     1   0   1   0   0   2  0.4
      sum    2   2   3   1   1
      AS   0.4 0.4 0.6 0.2 0.2
    ● WS: each interaction is weighted differently.
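The AS values in this table follow from row and column averages of the interaction matrix, as the sketch below shows. The slides do not say how WS chooses its weights, so `weighted_scores` simply takes the weights as an argument; that function and its normalisation are assumptions made for illustration.

```python
# Interaction matrix from the slide: rows are tests T1..T5, columns are solutions S1..S5.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]


def averaged_scores(matrix):
    """AS = (# positive interactions) / (# all interactions), per row and per column."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_as = [sum(row) / n_cols for row in matrix]
    col_as = [sum(col) / n_rows for col in zip(*matrix)]
    return row_as, col_as


def weighted_scores(matrix, col_weights):
    """WS sketch (assumed form): weight each interaction in a row by a per-column weight."""
    total = sum(col_weights)
    return [sum(w * x for w, x in zip(col_weights, row)) / total for row in matrix]


test_as, solution_as = averaged_scores(G)
print(test_as)      # [0.4, 0.4, 0.4, 0.2, 0.4]
print(solution_as)  # [0.4, 0.4, 0.6, 0.2, 0.2]
```

With uniform weights, `weighted_scores` reduces to the row-wise AS.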
  • 15. AI and WI ● AI: number of distinctions it makes ● WI: each distinction is weighted differently.
            S1>S2  S1>S3  S1>S4  S1>S5  ...  total
      T1      1      1      0      1    ...    5
      T2      0      0      0      1    ...    2
      T3      1      1      0      0    ...    6
      T4      0      1      0      1    ...    2
      T5      0      0      0      0    ...    1
    ● In the algorithm, a weighted sum of AS and informativeness is actually used: 0.3 × informativeness + 0.7 × AS
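The 0.3 / 0.7 combination can be sketched as follows for the test population. The slides do not state how informativeness is scaled before it is mixed with AS, so dividing each distinction count by the largest count in the population is an assumption, as are the function names. The counts here are computed from the interaction matrix on the Interaction slide, not from the example table on this slide.

```python
# Interaction matrix from the Interaction slide: rows are tests, columns are solutions.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]


def num_distinctions(row):
    # Each (solved, unsolved) pair of solutions is one distinction made by the test.
    solved = sum(row)
    return solved * (len(row) - solved)


def combined_fitness(matrix, w_info=0.3, w_as=0.7):
    counts = [num_distinctions(row) for row in matrix]
    max_count = max(counts) or 1  # guard against division by zero
    fitness = []
    for row, count in zip(matrix, counts):
        informativeness = count / max_count  # assumed normalisation to [0, 1]
        averaged_score = sum(row) / len(row)
        fitness.append(w_info * informativeness + w_as * averaged_score)
    return fitness


print([num_distinctions(row) for row in G])  # [6, 6, 6, 4, 6]
print(combined_fitness(G))
```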
  • 16. MO ● Objectives: each individual in the other subpopulation. ● MO: number of individuals that dominate it. ● Non-dominated individuals have the highest fitness value. [Figure: objective space with axes f1 and f2, marking dominated and non-dominated individuals]
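A minimal sketch of the MO evaluation for the solution population, using the interaction matrix from the earlier slides with each test treated as one objective. The standard Pareto dominance test is assumed (at least as good on every objective, strictly better on at least one), and a lower count means a better individual; the function names are hypothetical.

```python
# Interaction matrix from the earlier slides: rows are tests, columns are solutions.
G = [
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
]


def dominates(a, b):
    """True if a is at least as good as b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def domination_counts(vectors):
    """For each individual, the number of other individuals that dominate it."""
    return [sum(dominates(other, v) for other in vectors) for v in vectors]


# One outcome vector per solution: its results on tests T1..T5.
solutions = list(zip(*G))
print(domination_counts(solutions))  # [0, 0, 0, 1, 1]
```

Here S1, S2 and S3 are non-dominated, S4 is dominated by S3, and S5 is dominated by S2.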
  • 17. Performance Measures ● Objective Fitness (OF) – Evaluation against a fixed set of test cases – Here we use "all possible test cases", since we have picked problems of small size. ● Objective Fitness Correlation (OFC) – Correlation between the OFs and the fitness values used in the coevolution (subjective fitness, SF).
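OFC can be illustrated with a plain correlation between two lists of fitness values. The slide does not say which correlation coefficient is used, so Pearson's coefficient is an assumption here, and the `of` and `sf` values below are made up purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; raises ZeroDivisionError if either list is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for four individuals: objective fitness and subjective fitness.
of = [0.2, 0.4, 0.6, 0.8]
sf = [0.1, 0.5, 0.4, 0.9]
print(pearson(of, sf))
```

A value near 1 would mean that the subjective fitness ranks individuals much as the objective fitness does.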
  • 18. Experimental Setup ● Controlled experiments: GAAS – GA with AS from exhaustive evaluation. ● Compare the OF based on the same number of interactions.
  • 19. Test Problems ● Majority Function Problem (MFP) – 1D cellular automata problem – Two parameters: radius (r) and problem size (n) ● A sample IC with n = 9 (each target bit is updated from its neighbor bits): 0 1 0 1 0 0 1 1 1 ● A sample rule with r = 1:
      Input   000 001 010 011 100 101 110 111
      Output   0   0   0   1   0   1   1   1
    ● The output row is the boolean-vector representation of this rule.
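The sample rule and IC can be run through one cellular automaton update as a sketch. Periodic (wrap-around) boundaries, the number of update steps, and the function names are assumptions; the slides only give the rule table and the IC.

```python
def ca_step(cells, rule, r=1):
    """One synchronous update; rule[k] is the output for the neighborhood whose bits spell k."""
    n = len(cells)
    out = []
    for i in range(n):
        idx = 0
        for d in range(-r, r + 1):
            idx = (idx << 1) | cells[(i + d) % n]  # periodic boundary (assumed)
        out.append(rule[idx])
    return out


def majority(cells):
    return 1 if sum(cells) * 2 > len(cells) else 0


def classifies(rule, ic, steps, r=1):
    """True if, after `steps` updates, every cell equals the majority bit of the IC."""
    cells = list(ic)
    for _ in range(steps):
        cells = ca_step(cells, rule, r)
    return all(c == majority(ic) for c in cells)


rule = [0, 0, 0, 1, 0, 1, 1, 1]   # outputs for inputs 000..111, from the slide
ic = [0, 1, 0, 1, 0, 0, 1, 1, 1]  # sample IC with n = 9, from the slide
print(ca_step(ic, rule))          # [1, 0, 1, 0, 0, 0, 1, 1, 1]
print(classifies(rule, ic, steps=18))
```

Under these assumptions the sample rule settles on a mixed configuration for this IC rather than on all 1s, so `classifies` returns `False` here.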
  • 20. Test Problems ● Majority Function Problem (MFP)
  • 21. Test Problems ● Symbolic Regression Problem (SRP) – Curve fitting with Genetic Programming trees – Two measures: sum of error and hit [Figure: f(x) plot of the original function and a regression curve with a hit marked, next to a GP tree built from +, * and - nodes over x]
  • 22. Test Problems ● Parity Problem (PP) – Determine whether the number of 1's in a bit string is odd or even – Two parameters: odd/even and bit string length (n) ● A problem with n = 10: 0 1 0 1 0 0 1 1 1 1 ● A solution tree [Figure]
  • 23. Test Problems: PP ● 5-even parity ● Boolean-vector test case: D0 D1 D2 D3 D4 = 0 0 0 1 0, expected output false (0) [Figure: GP tree built from AND, OR and NOT nodes over the inputs D0 to D4, evaluated on this test case]
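The parity target and an exhaustive objective fitness over all 2^n test cases can be sketched as below. The function names are hypothetical, and "even parity" is taken to mean that the output is true when the number of 1's is even, which agrees with the 5-even example above (one 1, output false).

```python
from itertools import product


def parity_target(bits, kind="even"):
    """Target output of the parity problem for one bit string."""
    ones_even = sum(bits) % 2 == 0
    return ones_even if kind == "even" else not ones_even


def objective_fitness(candidate, n, kind="even"):
    """Fraction of all 2**n bit strings on which the candidate matches the target."""
    cases = list(product((0, 1), repeat=n))
    correct = sum(candidate(bits) == parity_target(bits, kind) for bits in cases)
    return correct / len(cases)


print(parity_target((0, 0, 0, 1, 0), "even"))                # False
print(parity_target((0, 1, 0, 1, 0, 0, 1, 1, 1, 1), "odd"))  # False: six 1's
print(objective_fitness(lambda bits: False, 5, "even"))      # 0.5
```

A candidate that always answers false is right on exactly half of the cases, which gives a baseline for reading objective fitness values on this problem.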
  • 24. Results of MFP (r=2, n=9)
  • 25. Results of MFP (r=2, n=9)
  • 26. Results of SRP ($x^6 - 2x^4 + x^2$)
  • 27. Results of SRP ($x^6 - 2x^4 + x^2$)
  • 28. Results of PP (odd, n=10)
  • 29. Results of PP (odd, n=10)
  • 31. Multi-objective Approach ● One run for COMO in MFP. ● OF drops when NDR rises. ● Why high NDR? – Duplicate solutions – Too many objectives
  • 32. MO approach to improve WI ● MFP: MO-WS-WI
  • 33. MO approach to improve WI ● SRP: MO-AS-WI, MO-WS-WI, WeiSum-AS-WI, MO-AS-AI, MO-WS-AI
  • 34. MO approach to improve WI ● PP: MO-AS-WI
  • 36. Issues ● The test problems used are small, and there is no proof of generalizability to larger problems. ● Implication for statistical learning: select not only difficult but also informative data for training.
  • 39. Average Score ● Rows are test cases, columns are solutions:
            S1  S2  S3  S4  S5 sum   AS
      T1     0   1   0   0   1   2  0.4
      T2     0   0   1   1   0   2  0.4
      T3     0   1   1   0   0   2  0.4
      T4     1   0   0   0   0   1  0.2
      T5     1   0   1   0   0   2  0.4
      sum    2   2   3   1   1
      AS   0.4 0.4 0.6 0.2 0.2
    ● Complexity: $\max(O(m), O(n))$
  • 40. Weighted Score ● Rows are test cases, columns are solutions:
            S1  S2  S3  S4  S5 sum
      T1     0   1   0   0   1   2
      T2     0   0   1   1   0   2
      T3     0   1   1   0   0   2
      T4     1   0   0   0   0   1
      T5     1   0   1   0   0   2
      sum    2   2   3   1   1
    ● Complexity: $\max(O(m), O(n))$
  • 41. Average Informativeness ● Complexity: $\max(O(mn^2), O(nm^2))$
  • 42. Weighted Informativeness ● Complexity: $\max(O(mn^2), O(nm^2))$
  • 43. MO ● Complexity: $\max(O(mn^2), O(nm^2))$