Comparing the Parallel Automatic Composition of Inductive Applications  with Stacking Methods Hidenao Abe  & Takahira Yamaguchi  Shizuoka University, JAPAN {hidenao,yamaguti}@ks.cs.inf.shizuoka.ac.jp
Contents: (1) the meta-learning scheme and its problem, (2) constructive meta-learning with method repositories and its parallel execution (CAMLET), (3) a case study with common data sets, (4) conclusion
Overview of the meta-learning scheme: a meta-learning algorithm (meta-knowledge plus a meta-learning executor) applies learning algorithms to a data set to obtain a better result for the given data set than that obtained by each single learning algorithm.
Selective meta-learning scheme. One approach integrates classifiers learned from different training data sets, generated by bootstrap sampling (bagging) or by weighting ill-classified instances (boosting). The other integrates classifiers learned with different algorithms, by voting, stacking or cascading. These methods do not work well when no base-level learning algorithm works well on the given data set, because they do not de-compose the base-level learning algorithms.
Contents: constructive meta-learning based on method repositories (basic idea, the method repository and CAMLET, and parallel execution of inductive applications)
Basic Idea of our Constructive Meta-Learning: analysis of representative inductive learning algorithms (C4.5, VS, CS, AQ15), de-composition & organization, then search & composition.
Basic Idea of Constructive Meta-Learning: analysis of representative inductive learning algorithms (CS, AQ15), de-composition & organization, search & composition.
Basic Idea of Constructive Meta-Learning: analysis of representative inductive learning algorithms, de-composition & organization (organizing inductive learning methods, control structures and treated objects), then search & composition for the automatic composition of inductive applications.
Analysis of Representative Inductive Learning Algorithms: we have analyzed 8 representative learning algorithms (including C4.5, Version Space, Classifier Systems and AQ15) and identified 22 specific inductive learning methods.
Inductive Learning Method Repository
Data Type Hierarchy: organization of the input/output/reference data types for inductive learning methods. Objects are divided into data-set (training data set, validation data set, test data set) and classifier-set (If-Then rules, tree (decision tree), network (neural net)).
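To make the organization above concrete, here is a purely illustrative Java sketch of such a type hierarchy; all class names are our own assumptions (the speaker notes say CAMLET's actual leaf methods are implemented in C).

```java
// Illustrative sketch of the data type hierarchy (all names are hypothetical).
abstract class DataObject {}

// data-set branch: training, validation and test data sets
abstract class DataSet extends DataObject {}
class TrainingDataSet extends DataSet {}
class ValidationDataSet extends DataSet {}
class TestDataSet extends DataSet {}

// classifier-set branch: If-Then rules, decision trees, neural networks
abstract class ClassifierSet extends DataObject {}
class IfThenRuleSet extends ClassifierSet {}
class DecisionTree extends ClassifierSet {}
class NeuralNetwork extends ClassifierSet {}
```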
Identifying Inductive Learning Methods and Control Structures. The six generic methods are: generating training & validation data sets, generating a classifier set, evaluating a classifier set, modifying training and validation data sets, modifying a classifier set, and evaluating classifier sets on the test data set.
CAMLET: a Computer Aided Machine Learning Engineering Tool. Given data sets and a goal accuracy from the user, CAMLET performs construction and instantiation (referring to the method repository, the data type hierarchy and the control structures), then compile and go & test; if the result does not reach the goal accuracy, it refines the specification and repeats, otherwise it outputs the inductive application.
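As a rough, hypothetical sketch of the construct / instantiate / compile / go & test / refine loop described above (the interfaces, class names and the simple retry strategy are our assumptions, not CAMLET's actual implementation):

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch of CAMLET's search loop over inductive-application specs.
public class CamletLoopSketch {

    // A specification is modelled here simply as the chosen control structure
    // plus one concrete method picked for each generic method slot.
    record Spec(String controlStructure, List<String> methods) {}

    interface MethodRepository {
        Spec construct(Random rnd);          // pick a control structure and fill its slots
        Spec refine(Spec spec, Random rnd);  // modify the spec (e.g. swap one method)
    }

    interface Executor {
        double execute(Spec spec, String dataSet);  // compile + go & test, returns accuracy
    }

    static Spec search(MethodRepository repo, Executor exec,
                       String dataSet, double goalAccuracy, int maxTrials) {
        Random rnd = new Random(1);
        Spec spec = repo.construct(rnd);
        for (int trial = 0; trial < maxTrials; trial++) {
            double accuracy = exec.execute(spec, dataSet);
            if (accuracy >= goalAccuracy) {
                return spec;                 // goal reached: output this inductive application
            }
            spec = repo.refine(spec, rnd);   // otherwise refine and try again
        }
        return spec;                          // best effort after maxTrials
    }
}
```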
CAMLET with a parallel environment: the activities are split into a Composition Level (construction, instantiation and refinement, referring to the method repository, data type hierarchy and control structures) and an Execution Level (compile, go & test); the two levels exchange specifications, data sets and results until the goal accuracy given by the user is reached.
Parallel Executions of Inductive Applications. The Composition Level needs one process, while the Execution Level needs one or more processing elements. The phases are: Initializing (constructing specs and sending specs), Executing, Receiving a result, and Refining & Sending (refining the spec and sending the refined one). (In the timing diagram: R = receiving, Ex = executing, S = sending.)
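A minimal sketch of this composition-level / execution-level protocol, using a Java thread pool as a stand-in for the processing elements of the rack-mounted parallel machine; the spec representation, refinement step and accuracy function are placeholders.

```java
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical master/worker sketch: the composition level sends specs,
// the execution level (here: a thread pool) executes them and sends results back.
public class ParallelExecutionSketch {

    record Result(String spec, double accuracy) {}

    public static void main(String[] args) throws Exception {
        List<String> specs = List.of("spec-1", "spec-2", "spec-3", "spec-4");
        double goal = 0.85;
        int processingElements = 4;

        ExecutorService pool = Executors.newFixedThreadPool(processingElements);
        CompletionService<Result> results = new ExecutorCompletionService<>(pool);

        // Initializing: send one spec to each processing element.
        for (String spec : specs) {
            results.submit(() -> new Result(spec, executeInductiveApplication(spec)));
        }

        int pending = specs.size();
        int refinementsLeft = 20;                      // keep this sketch finite
        while (pending > 0) {
            pending--;
            Result r = results.take().get();           // receive a result from a PE
            if (r.accuracy() >= goal) {
                System.out.println("Goal reached by " + r.spec());
                break;
            }
            if (refinementsLeft-- > 0) {
                String refined = r.spec() + "'";       // refine the spec (placeholder)
                results.submit(() -> new Result(refined,
                        executeInductiveApplication(refined)));
                pending++;
            }
        }
        pool.shutdownNow();
    }

    // Placeholder for "compile and go & test" of one inductive application.
    static double executeInductiveApplication(String spec) {
        return Math.random();
    }
}
```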
Refinement of Inductive Applications with GA. The executed specifications and their results form generation t and are transformed into chromosomes; parents are selected by the tournament method; children are generated with random single-point crossover and mutation, transformed back into specifications, executed and added; generation t+1 contains the whole of generation t.
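This refinement step can be sketched as follows; the chromosome encoding (an array of method indices), the mutation rate and the population handling are our assumptions, kept only to illustrate tournament selection, single-point crossover and mutation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical GA refinement over specs encoded as arrays of method indices.
public class GaRefinementSketch {
    static final Random RND = new Random(1);

    // Tournament selection: the fitter of two randomly drawn individuals wins.
    static int[] select(List<int[]> pop, double[] fitness) {
        int a = RND.nextInt(pop.size()), b = RND.nextInt(pop.size());
        return fitness[a] >= fitness[b] ? pop.get(a) : pop.get(b);
    }

    // Single-point crossover followed by point mutation.
    static int[] crossoverAndMutate(int[] p1, int[] p2, int methodChoices) {
        int[] child = new int[p1.length];
        int point = RND.nextInt(p1.length);
        for (int i = 0; i < child.length; i++) {
            child[i] = (i < point) ? p1[i] : p2[i];
        }
        if (RND.nextDouble() < 0.1) {                       // mutation
            child[RND.nextInt(child.length)] = RND.nextInt(methodChoices);
        }
        return child;
    }

    // One generation step: generation t+1 keeps the whole of generation t
    // and adds newly generated (and, in CAMLET, executed) children.
    static List<int[]> nextGeneration(List<int[]> pop, double[] fitness,
                                      int children, int methodChoices) {
        List<int[]> next = new ArrayList<>(pop);
        for (int c = 0; c < children; c++) {
            next.add(crossoverAndMutate(select(pop, fitness),
                                        select(pop, fitness), methodChoices));
        }
        return next;
    }
}
```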
CAMLET with parallel GA refinement: the same Composition Level / Execution Level architecture as above, with the refinement activity driven by the GA search.
Contents: case study comparing the accuracies of inductive applications composed by CAMLET with those of stacking methods on common data sets, and analysis of parallel efficiency
Set-up of the accuracy comparison with common data sets: 10 Statlog data sets and 32 UCI data sets were used; the goal accuracy given to CAMLET for each data set was the accuracy achieved by a stacking method; CAMLET output just one inductive application per data set, whose accuracy was compared with those of the stacking methods; 10 processing elements were used for parallel execution.
Statlog common data sets. To generate the 10-fold cross-validation data set pairs, we used the SplitDatasetFilter implemented in WEKA.
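The SplitDatasetFilter named above belongs to the WEKA version used at the time; as a hedged sketch with the current WEKA API, equivalent 10-fold train/test pairs can be produced with Instances.trainCV/testCV (the ARFF file name and the class-index choice are placeholders).

```java
import java.util.Random;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch: build 10-fold cross-validation train/test pairs with the WEKA API.
public class CvPairsSketch {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("segment.arff");     // placeholder file name
        data.setClassIndex(data.numAttributes() - 1);          // assume class is the last attribute
        data.randomize(new Random(1));

        int folds = 10;
        for (int i = 0; i < folds; i++) {
            Instances train = data.trainCV(folds, i);
            Instances test = data.testCV(folds, i);
            System.out.printf("fold %d: %d training / %d test instances%n",
                    i, train.numInstances(), test.numInstances());
        }
    }
}
```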
Setting up of the stacking methods for the Statlog common data sets.
Base-level learning algorithms (8 classifiers): J4.8 (with pruning, C=0.25), IBk (k=5), Bagged J4.8 (sampling rate 55%, 5 iterations), Bagged J4.8 (sampling rate 55%, 10 iterations), Boosted J4.8 (5 iterations), Boosted J4.8 (10 iterations), PART (with pruning, C=0.25), Naïve Bayes.
Meta-level learning algorithms: J4.8 (with pruning, C=0.25), Naïve Bayes, Classification with Linear Regression (with all of the features).
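A hedged reconstruction of this Statlog stacking setup using the current WEKA API; parameter values follow the slide, but the WEKA version used in the talk may have differed in class names and defaults, and the data set file is a placeholder.

```java
import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.LinearRegression;
import weka.classifiers.lazy.IBk;
import weka.classifiers.meta.AdaBoostM1;
import weka.classifiers.meta.Bagging;
import weka.classifiers.meta.ClassificationViaRegression;
import weka.classifiers.meta.Stacking;
import weka.classifiers.rules.PART;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Sketch of the Statlog setup: 8 base-level classifiers, stacking with
// classification via linear regression as the meta-level learner.
public class StackingSetupSketch {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("dna.arff");       // placeholder data set
        data.setClassIndex(data.numAttributes() - 1);        // assume class is last

        Classifier[] base = {
            prunedJ48(), ibk5(),
            baggedJ48(5), baggedJ48(10),
            boostedJ48(5), boostedJ48(10),
            new PART(), new NaiveBayes()
        };

        ClassificationViaRegression meta = new ClassificationViaRegression();
        meta.setClassifier(new LinearRegression());          // "Classification with Linear Regression"

        Stacking stacking = new Stacking();
        stacking.setClassifiers(base);
        stacking.setMetaClassifier(meta);
        stacking.setNumFolds(10);

        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(stacking, data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
    }

    static J48 prunedJ48() {
        J48 j48 = new J48();
        j48.setConfidenceFactor(0.25f);   // with pruning, C = 0.25
        return j48;
    }

    static IBk ibk5() {
        IBk ibk = new IBk();
        ibk.setKNN(5);                    // k = 5
        return ibk;
    }

    static Bagging baggedJ48(int iterations) {
        Bagging bagging = new Bagging();
        bagging.setClassifier(prunedJ48());
        bagging.setBagSizePercent(55);    // sampling rate 55%
        bagging.setNumIterations(iterations);
        return bagging;
    }

    static AdaBoostM1 boostedJ48(int iterations) {
        AdaBoostM1 boosting = new AdaBoostM1();
        boosting.setClassifier(prunedJ48());
        boosting.setNumIterations(iterations);
        return boosting;
    }
}
```

Wrapping LinearRegression in ClassificationViaRegression is how we read the slide's "Classification with Linear Regression" meta-level learner.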
Evaluation of accuracies on the Statlog common data sets: the inductive applications composed by CAMLET performed as well as stacking with Classification via Linear Regression.
Setting up of the stacking methods for the UCI common data sets.
Base-level learning algorithms (6 classifiers): J4.8 (with pruning, C=0.25), IBk (k=5), Bagged J4.8 (sampling rate 55%, 10 iterations), Boosted J4.8 (10 iterations), PART (with pruning, C=0.25), Naïve Bayes.
Meta-level learning algorithms: J4.8 (with pruning, C=0.25), Naïve Bayes, Linear Regression (with all of the features).
Evaluation of accuracies on the UCI common data sets (average accuracy, %):
  Algorithm   Average
  LR          81.97
  NB          81.07
  Max. of 3   83.79
  J4.8        81.92
  CAMLET      83.23
Analysis of the parallel efficiency of CAMLET (1 ≤ x < 100; n: number of folds; e: execution time of each PE).
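The slide's exact efficiency formula (involving n, the number of folds, and e, the execution time of each PE) was not preserved in this transcript. As a generic sketch under the usual definitions (speedup = sequential time / parallel time; efficiency = speedup / number of PEs), with made-up timings:

```java
// Generic speedup/efficiency sketch; the timing numbers are placeholders.
public class ParallelEfficiencySketch {
    public static void main(String[] args) {
        int processingElements = 10;          // PEs used in the case study
        double sequentialSeconds = 1200.0;    // hypothetical total time on one PE
        double parallelSeconds = 202.4;       // hypothetical wall-clock time on 10 PEs

        double speedup = sequentialSeconds / parallelSeconds;
        double efficiency = speedup / processingElements;

        System.out.printf("speedup = %.2fx, efficiency = %.2f%n", speedup, efficiency);
    }
}
```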
Parallel efficiency in the case study with the Statlog data sets (charts: parallel efficiencies for the 10-fold CV data sets and for the training-test data sets).
Why was the parallel efficiency of the "satimage" data set so low?
Contents: conclusion
Conclusion: we have implemented a tool for doing constructive meta-learning (CAMLET); its composed inductive applications showed performance as good as the stacking methods, without re-constructing the stacking methods themselves; the parallel execution achieved 5.93 times efficiency in the case study.
Thank you!
Stacking: training and prediction. Training phase: the training set is split (repeating with CV) into internal training and internal test sets; base-level classifiers learned on the internal training set are applied to the internal test set, and their outputs are transformed into a meta-level training set, from which the meta-level classifier is learned. Prediction phase: base-level classifiers learned on the whole training set transform the test set into a meta-level test set, and the meta-level classifier produces the prediction results.
How to transform base-level attributes into meta-level attributes: each instance is passed to the classifier learned by each base-level algorithm, and the predicted class probabilities become the meta-level attributes; with A base-level algorithms and C class values, the number of meta-level attributes is A × C.
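A small illustrative sketch of this transformation; the BaseClassifier interface is a hypothetical stand-in for a trained base-level model, not an API from the talk.

```java
// Sketch: build A x C meta-level attributes from base-level class probabilities.
public class MetaAttributesSketch {

    // Hypothetical stand-in for a trained base-level classifier.
    interface BaseClassifier {
        double[] classProbabilities(double[] instance);   // length C
    }

    // Returns one meta-level feature vector of length A * C for a single instance.
    static double[] toMetaLevel(double[] instance, BaseClassifier[] baseClassifiers, int numClasses) {
        double[] meta = new double[baseClassifiers.length * numClasses];
        for (int a = 0; a < baseClassifiers.length; a++) {
            double[] probs = baseClassifiers[a].classProbabilities(instance);
            System.arraycopy(probs, 0, meta, a * numClasses, numClasses);
        }
        return meta;
    }
}
```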
Meta-level classifiers (example): a decision tree over the meta-level attributes, whose internal nodes test prediction probabilities (e.g. the probability of class "0" from the classifier of algorithm 1, the probability of class "1" from the classifiers of algorithms 1 and 2, the probability of class "0" from the classifier of algorithm 3) against thresholds such as 0.1, 0.9 and 0.95, and whose leaves predict "0" or "1".
Accuracies of the 8 base-level learning algorithms on the Statlog data sets (charts: accuracy (%) of the base-level algorithms, for Stacking and for CAMLET).
C4.5 decision tree without pruning (reproduction by CAMLET): 1. select just one control structure; 2. fill each generic method with specific methods (generating training and validation data sets with a void validation set; generating a classifier set (decision tree) with entropy + information ratio; evaluating a classifier set with set evaluation; evaluating classifier sets on the test data set with single classifier set evaluation); 3. instantiate this specification; 4. compile the specification; 5. execute its executable code; 6. refine the specification (if needed).


Editor's Notes

  1. Good morning, I'm Hidenao Abe from Shizuoka University. Today I present an evaluation of our proposal, comparing its performance with that of three stacking methods. Our tool has been implemented on a parallel environment to search a huge specification space faster. In our work, an inductive application means a learning algorithm adequate to the given data set, whose criterion is also given by the user. A "data set" in our work means a relational data set, and the treated task is classification. Now, let's get into the contents of today's talk.
  2. First of all, I introduce a general overview of the meta-learning scheme and point out a problem of the ordinary meta-learning scheme. Next, I explain our proposed scheme, called "constructive meta-learning", and its parallel execution environment. After implementing our tool, called CAMLET, we have done a case study with two kinds of data sets. Finally, I conclude today's talk.
  3. This slide shows a general overview of the meta-learning scheme. We can use meta-learning algorithms to get a better result on the input data set than that obtained by a single learning algorithm. A meta-learning algorithm consists of "meta-knowledge" and "its executor". Ordinary meta-learning only integrates or combines the results of base-level learning algorithms with meta-knowledge.
  4. So we call ordinary meta-learning algorithms "selective meta-learning". This scheme consists of two approaches. One is the integration of classifiers learned from different training data sets, generated by bootstrap sampling or by weighting ill-classified instances; the former is called bagging, the latter boosting. The other is the integration of base-level classifiers learned with different algorithms. If the algorithm uses simple voting, we call this voting. If the algorithm uses a meta-level classifier learned from a meta-level data set built from the results of the base-level classifiers, we call it stacking or cascading. These methods work well for most data sets. However, when no base-level learning algorithm works well on a given data set, they do not work well at all. This problem arises because they do not de-compose the base-level learning algorithms. To overcome this problem, we have proposed another meta-learning scheme, called the "constructive meta-learning" scheme.
  5. I’ll explain about our constructive meta-learning based on method repositories. First, I’ll talk about basic idea of our constructive meta-learning. Second, I’ll present an implementation of a method repository and a tool for doing constructive meta-learning. This tool is called CAMLET. The search of CAMLET is very costly, because we can re-construct many inductive applications with the method repository. To search for the best inductive application faster, we have implemented a parallel execution protocol for CAMLET.
  6. This slide shows our basic idea of our constructive meta-learning. This scheme contains two major phases. The former phase is building a method repository. The latter phase is the re-construction of an adequate inductive application to a given data set. To build a method repository, we have analyzed representative learning algorithms such as C4.5, Version Space, Classifier Systems, AQ15 etc.
  7. With this analysis, we identify inductive learning methods, which are some functional parts of inductive learning algorithms. &gt;&gt; Click now!! The similar methods are built a hierarchy with defining input/output/reference and post-method/pre-method. With this method hierarchy, we need define control structures and data types. Control structures are needed for re-constructing inductive learning algorithms with inductive learning methods. And data types are needed for defining the hierarchy of methods. At this point in time, building of method repository and other information resources are completed. &gt;&gt; Click Now!! With this method repository, we can re-construct much more inductive applications than analyzed learning algorithms. The system search these new inductive application including analyzed learning algorithm for the adequate one.
  8. To implement a method repository, we have analyzed these 8 learning algorithms. From this analysis, inductive learning methods have been identified. ----- For example,
  9. This hierarchy is our implemented method repository. Six generic methods are put in top level of the hierarchy. Each generic method is brunch down by its input/output/reference data types, post-method/pre-method and difference of its function. The leaf nodes get into the executable code written by C language.
  10. The data type hierarchy is an organization of the data types used by the inductive learning methods.
  11. This slide shows the six generic methods. They are "generating training and validation data sets", "generating a classifier set", "evaluating a classifier set", "modifying training and validation data sets", "modifying a classifier set" and "evaluating classifier sets on the test data set". To re-construct learning algorithms, we have defined 8 control structures with 3 feedback loops over these generic methods. The straight path is common to all of the control structures.
  12. With the method repository, CAMLET constructs inductive applications for the data set given by a user. CAMLET consists of the following activities. In the construction activity, CAMLET selects just one control structure and fills each generic method with a specific one, referring to the method repository and the data type hierarchy; thus a basic specification of an inductive application appears. In instantiation, CAMLET makes up a specific specification, checking the input/output connections of the methods. Then the specific specification is compiled and executed. If the performance of the inductive application reaches the goal accuracy, CAMLET outputs it; if not, CAMLET refines it with some strategy. This refinement task is treated as a kind of search task over the specification space of inductive applications.
  13. CAMLET searches the many inductive applications yielded by the method repository for one that is satisfactory for the given data set, so it needs a very large computational cost. We have therefore designed the parallel execution of inductive applications. We have separated the activities of CAMLET into two parts: a composition level process and an execution level process. To parallelize the executions of inductive applications, we have parallelized the execution level process.
  14. This slide shows the parallel execution of inductive applications in CAMLET. In the initializing phase, the composition level process distributes the constructed specifications and the given data sets. Then the execution level processing elements execute the inductive applications. When an execution level processing element finishes its execution, it sends back the execution result. If the result does not reach the goal, the composition level process refines the specification of the inductive application and sends the refined specification to that execution level processing element. The execution phase through the refining & sending phase is repeated until one inductive application reaches the goal. We have implemented this execution protocol on a rack-mounted parallel machine.
  15. Along with the parallel execution, we have also implemented a search method with GA operations. This figure shows one refinement phase of this GA search. In this model, generation t+1 contains the whole of generation t. The GA operations execute in the following order. First, the executed specifications are transformed into chromosomes. Second, parents are selected by the tournament method. Third, children are generated with random single-point crossover and mutation. Then the children are transformed back into specifications and executed. By repeating these operations, CAMLET refines the specifications of inductive applications.
  16. With the GA search, the complete system overview of CAMLET is as shown in this figure.
  17. After implementing the parallel CAMLET, we carried out a case study to compare the accuracies of inductive applications composed by CAMLET with those of stacking methods on two kinds of common data sets. "Accuracy" is also called the success rate; it is the percentage of correctly predicted instances among all instances of the predicted data set. At the same time, we analyzed the parallel efficiency of the parallel execution of inductive applications in CAMLET.
  18. To compare the performances, we took 10 Statlog data sets and 32 UCI data sets. As the goal accuracy for each data set, we input the accuracy achieved by a stacking method. CAMLET then output just one inductive application for each data set, and we compared the accuracies of these inductive applications. To execute the inductive applications in parallel, we used 10 processing elements.
  19. This table summarizes the Statlog common data sets. The accuracies of four data sets have been evaluated on their assigned test data sets; the other data sets have been evaluated with 10-fold cross-validation. The 10-fold cross-validation data set pairs were generated with the SplitDatasetFilter class of WEKA.
  20. We set up three stacking methods for the Statlog common data sets as shown in this figure. Their base-level learning algorithms contain 6 different algorithms; however, 8 base-level classifiers have been used, because we set up different numbers of iterations for bagging and boosting. The parameters of each algorithm are written in brackets.
  21. This figure shows the results of the three stacking methods and of the inductive applications composed by CAMLET. The solid blue line with X marks shows the accuracies of the inductive applications composed by CAMLET. The average accuracies show no significant difference between stacking via linear regression and the inductive applications composed by CAMLET. With this result, CAMLET has as good performance as the stacking methods.
  22. We also evaluated the stacking methods and CAMLET with the UCI common data sets. This slide shows the settings of the three stacking methods for the UCI common data sets. In this evaluation, each meta-level algorithm was able to use 6 base-level classifiers from 6 base-level algorithms.
  23. This slide shows the evaluation with the 32 UCI common data sets. The solid green line with circle marks shows the accuracy of the inductive application composed by CAMLET for each data set, aiming to satisfy the maximum accuracy of the three stacking methods. The average of CAMLET is almost the same as this goal, compared with the other stacking methods.
  24. With the execution results for the Statlog data sets, we analyzed the parallel efficiency of CAMLET. This parallel efficiency means how much more efficient the run is than if we did not use the parallel execution.
  25. This slide shows the parallel efficiency for each Statlog data set. For the 10-fold cross-validated data sets, the parallel efficiencies are satisfactory; however, those of the training-test data sets are not high enough. In particular, the parallel efficiency of the satimage data set is low.
  26. Now, I conclude my talk.
  27. In conclusion, we have implemented a tool for doing constructive meta-learning. This tool shows as good performance as stacking methods; thus CAMLET has shown a significant performance without re-constructing the stacking methods. We have also analyzed the parallel efficiency of the parallel execution in CAMLET: it achieved 5.93 times efficiency. As future work, we are…