How to gain a foothold in the world of classification
Torsten Schön
dotplot GmbH
Overview
• What is classification?
• Workflow
• Preprocessing
• Basic classifiers
• Evaluation

27.02.2014

What is classification?
• Prediction model
• Supervised learning
• A set of historical data with known class values is available
• Task: predict which class/category a new, unseen item belongs to

What is classification?
• Terminology:
– Dataset: the complete set of measured data
– Attributes/Features: parameters measured for each instance (usually columns)
– Instance: a single item for which parameters are measured (usually rows)

What is classification?
Example:
• A set of blood parameters is measured in 50 cancer patients and in 50 healthy controls
• 2-class problem: Cancer vs. Healthy
• To test whether a new patient has cancer, the same blood parameters are measured and classification is used to predict the class

General Workflow
• Training data (class values are known) → classification model
• Test data (unknown class) → classification model → predicted class values

Detailed Workflow
• Training data → preprocessing → model selection → classification model
• Test data → preprocessing → classification model → predicted class values
• Preprocessing: feature selection, feature engineering, imputing missing values, …
• Model selection: cross-validation, accuracy, ROC, …
Preprocessing
Feature Selection
• Select only discriminative features
• Saves execution time
• Removes noise effects
• Two kinds of methods:
– Ranking
– Subset evaluation

Preprocessing
Ranking (Filters)
• Features are ranked by a score
– Correlation
– Information gain
– …
• The number of features to select must be specified manually
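The ranking idea can be sketched in a few lines of Python. This is only a minimal illustration, assuming the absolute Pearson correlation with a 0/1 class label as the score; the feature names (`marker_a`, `marker_b`, `marker_c`) and all values are hypothetical and not taken from the talk.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no ranking information
    return cov / math.sqrt(var_x * var_y)

def rank_features(columns, labels, top_n):
    """Return the names of the top_n features by |correlation| with the label."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]  # top_n is chosen by hand, as the slide notes

# Hypothetical toy data: 1 = cancer, 0 = healthy.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "marker_a": [9.1, 8.7, 9.4, 2.0, 2.3, 1.9],  # separates the classes clearly
    "marker_b": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8],  # barely related to the class
    "marker_c": [3.0, 6.0, 5.0, 4.0, 2.0, 3.5],  # moderately related
}
print(rank_features(columns, labels, top_n=2))  # ['marker_a', 'marker_c']
```

Information gain or any other score could replace `pearson` without changing the ranking logic.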

Preprocessing
Subset Evaluation (Filter)
• A search algorithm is used to find the best feature subset
• The number of selected features is determined by the algorithm
Subset Evaluation (Wrapper)
• A model is trained and evaluated on each candidate subset to find the best features
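A wrapper can be sketched as a greedy forward search that trains and scores a model for every candidate subset. The example below is an assumption-laden illustration, not the method from the talk: it uses a 1-nearest-neighbour model scored by leave-one-out accuracy, and the toy data is made up.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour model on the given feature indices."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # the held-out instance must not vote for itself
            dist = sum((row[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)

def forward_selection(rows, labels):
    """Greedily add the feature that improves accuracy most; stop when none does."""
    selected, best_score = [], 0.0
    remaining = list(range(len(rows[0])))
    while remaining:
        scored = [(loo_accuracy(rows, labels, selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no candidate improves the model, so the subset size is settled
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
rows = [[1.0, 5.0], [1.2, 1.0], [0.9, 3.0], [3.0, 4.9], [3.2, 1.1], [2.9, 3.1]]
labels = [0, 0, 0, 1, 1, 1]
print(forward_selection(rows, labels))  # ([0], 1.0)
```

Because every candidate subset requires training and evaluating a model, wrappers are usually much slower than filters.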

Preprocessing
Feature Engineering
• Transform or compute features to better match the classifier's requirements
• Text analysis: a plain-text field cannot be used for classification directly
• Extract keywords as nominal features, count the number of words, letters, …
• Start and end time → duration
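As a rough sketch of these transformations, the snippet below derives a duration from a start and end time and a few simple features from a free-text field. The record layout (`start`, `end`, `note`), the timestamp format, and the keyword `pain` are assumptions made up for this example.

```python
from datetime import datetime

def engineer_features(record):
    """Turn raw fields into features a classifier can use."""
    start = datetime.strptime(record["start"], "%Y-%m-%d %H:%M")
    end = datetime.strptime(record["end"], "%Y-%m-%d %H:%M")
    text = record["note"]
    words = text.split()
    return {
        # start and end time -> duration in minutes
        "duration_min": (end - start).total_seconds() / 60,
        # simple counts computed from the plain-text field
        "word_count": len(words),
        "letter_count": sum(ch.isalpha() for ch in text),
        # keyword presence as a nominal (True/False) feature
        "mentions_pain": "pain" in (w.strip(".,;!?").lower() for w in words),
    }

# Hypothetical record.
record = {
    "start": "2014-02-27 09:15",
    "end": "2014-02-27 10:00",
    "note": "Patient reports mild pain.",
}
features = engineer_features(record)
print(features)
```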

Preprocessing
Estimate Missing Values
• Some algorithms require complete datasets
• Missing values need to be imputed
• Simplest: mean (numeric features) and mode (nominal features)
• More advanced techniques can lead to better results (imputation is a research field of its own)
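The simplest strategy, mean and mode imputation, might look like this in Python. The sketch assumes that missing values are encoded as `None` and that each column contains at least one observed value; both are assumptions for illustration.

```python
from statistics import mean, mode

def impute_column(values):
    """Replace None with the mean (numeric column) or the mode (nominal column)."""
    observed = [v for v in values if v is not None]
    numeric = all(isinstance(v, (int, float)) for v in observed)
    # mean() and mode() raise StatisticsError if nothing was observed at all
    fill = mean(observed) if numeric else mode(observed)
    return [fill if v is None else v for v in values]

print(impute_column([4.0, None, 6.0, 5.0]))  # [4.0, 5.0, 6.0, 5.0]
print(impute_column(["A", "B", None, "A"]))  # ['A', 'B', 'A', 'A']
```

To avoid leaking information, the fill value would normally be computed on the training data only and then reused for the test data.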

Preprocessing
Add Noise
• Generalization of the model is most important!
• Adding artificial noise to the training data can help the model generalize better
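One possible way to add artificial noise is to perturb every numeric value with zero-mean Gaussian noise. The talk does not say which noise model is meant, so the Gaussian choice, the `sigma` value, and the fixed seed below are assumptions.

```python
import random

def add_gaussian_noise(rows, sigma, seed=0):
    """Return a noisy copy of the rows; the original training data is left untouched."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in rows]

# Hypothetical training data with two numeric features.
training = [[1.0, 2.0], [3.0, 4.0]]
noisy = add_gaussian_noise(training, sigma=0.1)
print(noisy)
```

The class labels stay unchanged, and `sigma` should be small relative to the spread of each feature; otherwise the noise can drown out the signal.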

Classification Algorithms
• There are many different classification models
• Important criteria:
– Generalization
– Robustness to noise
– Speed
– Performance
– …
• “No free lunch” theorem: no single algorithm performs best on every problem

Classification Algorithms
k-Nearest Neighbors
• Selects the k closest instances from the training set and predicts the majority class among them
• A similarity (distance) measure is needed
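A minimal k-nearest-neighbours classifier fits in a few lines. This sketch assumes Euclidean distance as the similarity measure and a simple majority vote among the k neighbours; the two-feature toy data is hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_rows, train_labels, query, k=3):
    """Predict the majority class among the k training instances closest to the query."""
    distances = sorted(
        ((math.dist(row, query), label) for row, label in zip(train_rows, train_labels)),
        key=lambda pair: pair[0],  # sort by Euclidean distance only
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical training set with two blood parameters per patient.
train_rows = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [3.0, 3.0], [3.2, 2.9], [2.8, 3.1]]
train_labels = ["healthy", "healthy", "healthy", "cancer", "cancer", "cancer"]

print(knn_predict(train_rows, train_labels, [1.1, 1.0]))  # healthy
print(knn_predict(train_rows, train_labels, [3.0, 3.0]))  # cancer
```

Because the distance depends on the scale of each feature, features are usually normalized before k-NN is applied.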

Classification Algorithms
Support Vector Machine (SVM)
• Learns a decision boundary, defined by support vectors, that separates the training instances
• Can be extended to:
– Higher dimensions
– Non-linear boundaries
– Multiple classes

Classification Algorithms
Random Forest
• Learns a “forest” of decision trees with randomly varied structures
• The majority vote of the individual trees is the final result
• Works well in many areas because it is very robust to noise and to overfitting
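The voting step can be illustrated on its own. In the sketch below the "trees" are hand-written stand-in functions rather than learned decision trees, and the thresholds and feature names are invented; only the majority vote reflects the slide.

```python
from collections import Counter

def forest_predict(trees, instance):
    """Let every tree vote and return the class with the most votes."""
    votes = [tree(instance) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical stand-ins for trees that a real random forest would learn
# from random subsets of the training instances and features.
trees = [
    lambda x: "cancer" if x["marker_a"] > 5.0 else "healthy",
    lambda x: "cancer" if x["marker_c"] > 4.0 else "healthy",
    lambda x: "cancer" if x["marker_a"] > 6.0 and x["marker_c"] > 2.0 else "healthy",
]

print(forest_predict(trees, {"marker_a": 9.0, "marker_c": 3.0}))  # cancer (2 of 3 votes)
print(forest_predict(trees, {"marker_a": 2.0, "marker_c": 5.0}))  # healthy (2 of 3 votes)
```

An odd number of trees avoids ties in a 2-class problem.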

Evaluation
• Evaluate different models and preprocessing steps by comparing model performance
• Use only the training set for evaluation
• Often used: Cross-Validation
– Split the training data into k parts of equal size
– Use each part once as the test set and the remaining k-1 parts as the training set
– Average the results
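The three cross-validation steps can be sketched as follows. The majority-class baseline model, the `fit`/`predict` function signatures, and the toy labels are assumptions chosen to keep the example self-contained; any classifier with the same two functions could be plugged in.

```python
from collections import Counter

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of (nearly) equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(rows, labels, k, fit, predict):
    """Use each fold once as the test set and return the mean accuracy."""
    scores = []
    for test_idx in k_fold_indices(len(rows), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        model = fit([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, rows[i]) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

# Hypothetical baseline model: always predict the most frequent training class.
def fit_majority(rows, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, row):
    return model

rows = [[i] for i in range(9)]
labels = ["a", "a", "a", "b", "a", "a", "b", "a", "a"]
score = cross_validate(rows, labels, k=3, fit=fit_majority, predict=predict_majority)
print(round(score, 3))  # 0.778
```

This sketch splits the data in its given order; in practice the instances are usually shuffled (and often stratified by class) before the folds are formed.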

