One Step Fits All: Fitted Q Iteration with XCS
Daniele Loiacono
XCS(F) in multistep problems
  • XCS(F) has been successfully applied to complex and large single-step problems.
  • In contrast, even rather simple multistep problems can be very challenging for XCS(F).
  • Connections with generalized reinforcement learning methods have been widely studied, and the two share common issues:
      • over-generalization
      • unstable learning process
      • divergence (with computed prediction)
  • Advanced prediction mechanisms (e.g., Tile Coding) generally help but do not provide any guarantee.
  • XCS(F) searches for the best generalizations in the problem space.
  • Generalizations might prevent XCS(F) from learning the optimal payoff landscape.
  • The payoff landscape learned in turn affects the search for generalizations in the problem space.
What is this talk about?
  • We introduce an alternative approach to multistep problems, based on Fitted Q Iteration, that involves a sequence of single-step problems.
  • We show only some preliminary results to test the presented approach.
Agenda
  • Fitted Q Iteration
  • Fitted Q Iteration + XCS
  • Preliminary Results
  • Discussion
  • Future Works
Fitted Q Iteration (Ernst et al., 2005)
  [Diagram. Blocks: Agent, Problem, Learner, delay. Signals: s_t, a_t, r_{t+1}, s_{t+1}. Sample set: {<s_t, a_t, r_{t+1}, s_{t+1}>}. Learner output: Q_i(s,a).]
Fitted Q Iteration (Ernst et al., 2005)
  [Diagram. From the sample set {<s_t, a_t, r_{t+1}, s_{t+1}>}: Q_1(s,a), Q_2(s,a), …, Q_L(s,a).]
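As a rough illustration of the batch scheme on this slide, the sketch below computes a sequence of Q functions from a fixed sample set, using the usual Fitted Q Iteration targets: r_{t+1} + gamma * max_a Q_{i-1}(s_{t+1}, a), with Q_0 = 0. It is a minimal sketch under stated assumptions, not the setup used in the talk: the per-pair averaging "learner" (`fit_table`), the extra `terminal` flag in each sample, the three-state corridor, and gamma = 0.5 are all hypothetical choices made so the example stays small.

```python
from collections import defaultdict

def fit_table(inputs, targets):
    """Toy stand-in for the supervised learner: average the targets per (s, a)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(inputs, targets):
        sums[x] += y
        counts[x] += 1
    table = {x: sums[x] / counts[x] for x in sums}
    # Unseen (s, a) pairs default to 0.0 (an assumption of this sketch).
    return lambda s, a: table.get((s, a), 0.0)

def fitted_q_iteration(samples, actions, n_iterations, gamma, fit=fit_table):
    """samples: list of (s, a, r, s_next, terminal); returns the last Q function."""
    inputs = [(s, a) for s, a, _, _, _ in samples]
    q = lambda s, a: 0.0  # Q_0 is zero everywhere
    for _ in range(n_iterations):
        # Regression targets for this iteration, built from the previous Q.
        targets = [
            r if terminal else r + gamma * max(q(s_next, b) for b in actions)
            for _, _, r, s_next, terminal in samples
        ]
        q = fit(inputs, targets)  # one single-step regression problem
    return q

# Hypothetical corridor: states 0 and 1, goal state 2, actions -1 / +1.
actions = (-1, +1)
samples = [
    (0, -1, 0.0, 0, False),
    (0, +1, 0.0, 1, False),
    (1, -1, 0.0, 0, False),
    (1, +1, 1.0, 2, True),   # reaching the goal pays 1 and ends the episode
]
q = fitted_q_iteration(samples, actions, n_iterations=5, gamma=0.5)
print(q(0, +1), q(1, +1))
```

Any regressor with a fit/predict shape could replace `fit_table`; the loop itself does not depend on the learner.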
Fitted Q Iteration + XCS
  • XCS is applied to the target multistep problem.
  • The interaction between XCS and the problem is sampled.
  • A sequence of single-step regression problems is generated:
      • the state is the concatenation of the state and the action of the original multistep problem
      • there are no actions
      • the training set is built from all the <s_t, a_t> pairs collected
      • the test set is built from all the <s_{t+1}, -> collected
  • XCS is applied iteratively to each single-step problem generated.
  • Q_i(s,a) is computed as the system prediction on the test set.
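The data preparation described above could look roughly like the following. This is a hedged sketch: the function names, the tuple encoding of states, and the reading of <s_{t+1}, -> as "the next state paired with every available action" are assumptions, not details given in the slides. The predictions passed to `targets_for_iteration` would come from the regressor trained in the previous iteration (XCS in the talk); here they are plain numbers.

```python
def build_regression_problem(samples, actions):
    """Turn sampled transitions (s, a, r, s_next) into regression inputs.

    Each training input is the original state concatenated with the action,
    so the single-step problem itself has no actions. Each test entry lists
    the next state paired with every action (assumed reading of <s_{t+1}, ->).
    """
    train_inputs = [tuple(s) + (a,) for s, a, _, _ in samples]
    test_inputs = [
        [tuple(s_next) + (b,) for b in actions]
        for _, _, _, s_next in samples
    ]
    return train_inputs, test_inputs

def targets_for_iteration(samples, test_predictions, gamma):
    """Next regression targets: reward plus discounted best test prediction."""
    return [
        r + gamma * max(preds)
        for (_, _, r, _), preds in zip(samples, test_predictions)
    ]

# Hypothetical two-bit states and three actions.
actions = (0, 1, 2)
samples = [((0, 1), 2, 0.0, (1, 1)), ((1, 1), 0, 1.0, (1, 0))]
train_inputs, test_inputs = build_regression_problem(samples, actions)

# Stand-in predictions of the previous model on the test inputs.
previous_predictions = [[0.0, 2.0, 1.0], [0.0, 0.0, 0.0]]
targets = targets_for_iteration(samples, previous_predictions, gamma=0.5)
print(train_inputs, targets)
```

Keeping the action inside the input vector is what lets the learner generalize over the action space as well as the state space.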
Experimental Design
  • Woods 14
  • Woods 1
  • Maze 5
  • Maze 6
Experimental Results: Woods 1
  • XCS + Sampling for 50 problems
Experimental Results: Maze 5
  • XCS + Sampling for 25 problems
Experimental Results: Maze 6
  • XCS + Sampling for 15 problems
Experimental Results: Woods 14
  • XCS + Sampling for 15 problems
Discussion
  • Fitted Q Iteration + XCS offers several advantages:
      • efficient learning
      • generalization over the action space
  • However…
      • no real-time learning
      • it assumes a static environment
      • how should the problem space be sampled, and how does sampling affect performance?
      • how does XCS compare to other supervised learning techniques in this task?
Future Works
  • Integrate Fitted Q Iteration and XCS in an incremental/iterated fashion.
  • Test on more challenging problems that require generalization (e.g., Butz and Lanzi, 2010).
  • Investigate sampling strategies.
  • Extend XCS based on some principles of Fitted Q Iteration?
Some hints about problem sampling
Results of a bad sampling on Woods 1