Pattern
Classification
Prepared By:
Ranjan Ganguli
Master Of Engineering
UIT, Burdwan
Contents
• Introduction
• Pattern Recognition Models
• Pattern Recognition Algorithms
* Classification
• Clustering Algorithm
Pattern Classification……By Ranjan Ganguli
What is Pattern Recognition?
• The study of how machines can observe the
environment,
• learn to distinguish patterns of interest from
their background, and
• make sound and reasonable decisions about
the categories of the patterns.
• What is a pattern?
• What kinds of categories do we have?
What is a Pattern?
• “A pattern is essentially an arrangement.” (Definition I)
• “It can also be defined by the common denominator among the
multiple instances of an entity.” (Definition II)
e.g., the commonality in all fingerprint images defines the
fingerprint pattern.
• For example, a pattern could be:
• A fingerprint image
• A handwritten cursive word
• A human face
• A speech signal, etc.
What is a Pattern Category?
• It is a collection of similar, but not necessarily
identical, objects. Individual patterns are often
grouped into a category based on their
common properties; the resulting group is also a
pattern and is often called a pattern category.
Pattern Recognition System
•The design model of a pattern recognition
system essentially involves the following three
steps:
I. Data acquisition and pre-processing
II. Feature extraction
III. Decision making
Block diagram of a pattern recognition system:
Pattern Recognition Models
•The four best known approaches
• template matching
• statistical classification
• syntactic or structural matching
• neural networks
Important characteristics of the pattern
recognition models.
PATTERN RECOGNITION ALGORITHMS
• The design of a pattern recognition algorithm consists of three
basic elements: data perception, feature
extraction, and classification.
• The choice of algorithm depends on the type of
label output and on whether learning is supervised or
unsupervised.
What is Supervised Learning?
• In supervised learning, a teacher
provides a category label or cost for
each pattern in the training set, which is then used
to train a classifier.
• Supervised learning methods are therefore
used for classification.
• In the given figure, the input image consists of a
mixture of two letters, A and B. The
classification algorithm then assigns each input to one of two
categories.
Here a mixed set of inputs is classified using a
supervised learning approach.
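As a minimal sketch of the idea, assuming each letter has already been reduced to two numeric features, a nearest-centroid classifier can be trained from labeled examples. The feature values, the function names, and the choice of nearest-centroid itself are illustrative assumptions, not something the slides specify.

```python
from collections import defaultdict

def train_nearest_centroid(samples, labels):
    """Average the feature vectors of each class; the labels play the role of the teacher."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def classify(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda y: sq_dist(centroids[y], x))

# Hypothetical 2-D features for the letters "A" and "B" (made-up numbers).
train_x = [(1.0, 1.2), (0.8, 1.0), (3.0, 3.2), (3.2, 2.9)]
train_y = ["A", "A", "B", "B"]

centroids = train_nearest_centroid(train_x, train_y)
print(classify(centroids, (0.9, 1.1)))  # close to the "A" examples
print(classify(centroids, (3.1, 3.0)))  # close to the "B" examples
```

The training labels are the only place where category information enters; without them the same data would have to be clustered instead.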
What is Unsupervised Learning?
There is no explicit teacher; the system forms clusters or “natural
groupings” of the input patterns.
• Here the input consists of unlabeled values
whose distinguishing features are not known in advance.
In the example shown, all values are nominally of the same kind,
yet clusters are still formed using a similarity metric,
and that metric differs from one algorithm
to another.
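A small sketch of how clusters can form without labels, using k-means with squared Euclidean distance as the metric. The data points, the number of clusters, and the choice to start from the first k points are all assumptions made for illustration.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on tuples of floats; initial centers are the first k points (arbitrary choice)."""
    centers = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centers

# Made-up unlabeled points forming two loose groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.1, 4.9), (0.9, 1.1), (4.8, 5.2)]
assignment, centers = kmeans(points, k=2)
print(assignment)
```

Swapping the distance function for another metric would generally change which groupings emerge, which is the point made in the text above.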
An Example
• “Sorting incoming fish on a conveyor according to
species using optical sensing”
• Species: sea bass and salmon
• Problem Analysis
• Set up a camera and take some sample images to extract
features
• Length
• Lightness
• Width
• Number and shape of fins
• Position of the mouth, etc…
• This is the set of all suggested features to explore for use in our
classifier!
• Preprocessing
• Use a segmentation operation to isolate the fish from one
another and from the background
• Information from a single fish is sent to a feature
extractor, whose purpose is to reduce the data by
measuring certain features
• The features are passed to a classifier
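The three stages above can be sketched on a toy input. Here the "image" is a single row of pixel intensities with 0 as background; the segmentation rule, the two features, and the threshold of 5.0 are hypothetical choices for illustration only.

```python
def segment(scanline, background=0):
    """Split a 1-D row of pixel intensities into runs of non-background pixels (one run per fish)."""
    fish, current = [], []
    for pixel in scanline:
        if pixel != background:
            current.append(pixel)
        elif current:
            fish.append(current)
            current = []
    if current:
        fish.append(current)
    return fish

def extract_features(pixels):
    """Reduce the raw pixels to two measurements: length and mean lightness."""
    return {"length": len(pixels), "lightness": sum(pixels) / len(pixels)}

def classify(features, lightness_threshold=5.0):
    """Hypothetical rule: lighter fish are labeled sea bass, darker ones salmon."""
    return "sea bass" if features["lightness"] > lightness_threshold else "salmon"

# Made-up scanline containing two fish separated by background pixels.
scanline = [0, 0, 3, 4, 3, 0, 0, 7, 8, 8, 7, 0]
labels = [classify(extract_features(f)) for f in segment(scanline)]
print(labels)
```

Each stage only sees the output of the previous one, so the classifier works on two numbers per fish rather than on raw pixels.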
• Classification
• Select the length of the fish as a possible feature for
discrimination
The length is a poor feature alone!
Select the lightness as a possible feature.
• Threshold decision boundary and cost relationship
• Move our decision boundary toward smaller values of
lightness in order to minimize the cost (reduce the number
of sea bass that are classified as salmon!)
This is the task of decision theory.
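To illustrate how the cost of each kind of error moves the boundary, the sketch below picks the lightness threshold that minimizes total cost. The readings, the cost values, and the rule "lightness above the threshold means sea bass" are assumptions for the example.

```python
def total_cost(threshold, salmon, sea_bass,
               cost_bass_as_salmon=1.0, cost_salmon_as_bass=1.0):
    """Rule: lightness > threshold -> sea bass, otherwise salmon."""
    bass_as_salmon = sum(1 for x in sea_bass if x <= threshold)
    salmon_as_bass = sum(1 for x in salmon if x > threshold)
    return cost_bass_as_salmon * bass_as_salmon + cost_salmon_as_bass * salmon_as_bass

def best_threshold(salmon, sea_bass, **costs):
    """Try each observed lightness as a threshold; keep the cheapest (smallest on ties)."""
    candidates = sorted(set(salmon) | set(sea_bass))
    return min(candidates, key=lambda t: total_cost(t, salmon, sea_bass, **costs))

# Made-up lightness readings, for illustration only.
salmon = [2.0, 3.0, 4.0, 5.0, 6.0, 6.5]
sea_bass = [4.5, 7.0, 8.0, 9.0]

equal = best_threshold(salmon, sea_bass)
skewed = best_threshold(salmon, sea_bass, cost_bass_as_salmon=5.0)
print(equal, skewed)
```

With equal costs the cheapest threshold on this data is 6.5; once a sea bass labeled as salmon costs five times as much, the cheapest threshold drops to 4.0, which is the shift toward smaller lightness values described above.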
• Adopt the lightness and add the width of the fish
Fish: x^T = [x1, x2], where x1 = lightness and x2 = width
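With two features, one simple decision rule is a straight line in the (lightness, width) plane. The sketch below assumes a linear boundary w·x + b = 0; the weights and bias are made-up values, not fitted to any data.

```python
def linear_decision(x, w, b):
    """Return 'sea bass' when w·x + b > 0, otherwise 'salmon'."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "sea bass" if score > 0 else "salmon"

# x = [x1, x2] = [lightness, width]; w and b are hypothetical, chosen by hand.
w, b = [1.0, 0.5], -10.0

print(linear_decision([4.0, 8.0], w, b))  # score = 4.0 + 4.0 - 10.0 = -2.0
print(linear_decision([8.0, 9.0], w, b))  # score = 8.0 + 4.5 - 10.0 = 2.5
```

In practice the weights would be learned from labeled samples rather than set by hand.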
• We might add other features that are not correlated
with the ones we already have. Care should
be taken not to reduce performance by adding
“noisy features”.
• Ideally, the best decision boundary should be the one
which provides an optimal performance such as in the
following figure:
• However, our satisfaction is premature because
the central aim of designing a classifier is to
correctly classify novel input
Issue of generalization!
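A rough illustration of the generalization issue: a 1-nearest-neighbor classifier scores perfectly on the samples it memorized but worse on held-out samples. The lightness values are made up, and one salmon is deliberately placed among the sea bass to mimic a decision boundary that is fitted too closely to the training data.

```python
def nearest_neighbor_predict(train, x):
    """1-NN: copy the label of the closest training sample; train is a list of (feature, label)."""
    return min(train, key=lambda s: abs(s[0] - x))[1]

def accuracy(train, data):
    """Fraction of (feature, label) pairs in data that the 1-NN rule labels correctly."""
    hits = sum(1 for x, y in data if nearest_neighbor_predict(train, x) == y)
    return hits / len(data)

# Made-up lightness values; the salmon at 5.6 sits among the sea bass on purpose.
train = [(2.0, "salmon"), (3.0, "salmon"), (5.6, "salmon"),
         (5.0, "sea bass"), (6.0, "sea bass"), (7.0, "sea bass")]
test = [(2.5, "salmon"), (3.5, "salmon"),
        (5.5, "sea bass"), (5.7, "sea bass"), (6.5, "sea bass")]

train_acc = accuracy(train, train)
test_acc = accuracy(train, test)
print(train_acc, test_acc)
```

Training accuracy is 1.0 because every training point is its own nearest neighbor, while two of the five novel inputs are misclassified, so measuring performance on unseen data is what reveals the problem.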
CLASSIFICATION ALGORITHMS
(Supervised Learning)
•Decision trees
• Kernel estimation and k-nearest neighbours (k-NN)
• Linear discriminant analysis (LDA)
• Quadratic discriminant analysis (QDA)
•Maximum entropy classifier (multinomial logistic regression)
•Naive Bayes classifier
•Artificial Neural Networks
•Support Vector Machine
Decision Trees
“Splitting datasets one feature at a time”
The decision tree is one of the most commonly used
classification techniques; recent surveys claim that it’s
the most commonly used technique.
Advantages:
“Major focus on insights about the data”.
Decision tree-building algorithms use
information theory to decide how to split
the dataset.
Steps:
1. To build a decision tree, we first decide
which feature is used to split the dataset.
2. To determine this, we try every feature and measure
which split gives the best results.
3. After that, we split the dataset into subsets.
4. The subsets then traverse down the branches of
the first decision node. If the data on a branch all
belongs to the same class, it is properly classified and
needs no further splitting.
5. If the data on a branch isn’t all the same class, we
repeat the splitting process on that
subset. The decision on how to split the
subset is made the same way as for
the original dataset, and we repeat this
process until all the data is classified.
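The steps can be sketched as a short recursive function. This is an ID3-style sketch under stated assumptions: features are categorical, the "best" split is the one that leaves the least weighted entropy (equivalent to the highest information gain), and the toy rows and feature names are made up.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def split(rows, feature):
    """Group rows of (feature_dict, label) by the value they take on one feature."""
    subsets = {}
    for features, label in rows:
        subsets.setdefault(features[feature], []).append((features, label))
    return subsets

def best_feature(rows, features):
    """Steps 1-2: try every feature, keep the one whose split leaves the least weighted entropy."""
    def remaining(feature):
        return sum(len(s) / len(rows) * entropy([l for _, l in s])
                   for s in split(rows, feature).values())
    return min(features, key=remaining)

def build_tree(rows, features):
    labels = [label for _, label in rows]
    # Step 4: a pure branch (or one with no features left) becomes a leaf.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    feature = best_feature(rows, features)
    rest = [f for f in features if f != feature]
    # Steps 3 and 5: split, then repeat the process on each subset.
    return {feature: {value: build_tree(subset, rest)
                      for value, subset in split(rows, feature).items()}}

# Made-up toy data (not taken from the slides).
rows = [
    ({"light": "yes", "wide": "yes"}, "sea bass"),
    ({"light": "yes", "wide": "no"}, "sea bass"),
    ({"light": "no", "wide": "yes"}, "salmon"),
    ({"light": "no", "wide": "no"}, "salmon"),
]
tree = build_tree(rows, ["light", "wide"])
print(tree)
```

On this data a single split on the hypothetical "light" feature already separates the classes, so the recursion stops after one level.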
Information gain
• We choose to split our dataset in a way
that makes our unorganized data more
organized. One way to do this is to
measure the information.
• Using information theory, we can measure
the information before and after the split.
• The change in information before and after
the split is known as the information
gain.
Note:
The attribute with the highest
information gain is chosen as the
splitting attribute.
Information gain = reduction in entropy
(entropy before the split minus the weighted entropy after the split)
What is entropy?
Entropy is defined as the expected value
of the information.
(Here, it is measured on each attribute)
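To calculate entropy, we need the
expected value of the information
over all possible values of our class.
This is given by:
$$H = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$$
where p(x_i) is the proportion of examples in class x_i and n is the number of classes.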
Example to calculate information gain
• Next, we need to calculate the expected
information gain for each attribute.
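As a hedged illustration of that calculation, the sketch below computes the information gain of each attribute as the entropy of the class labels before the split minus the size-weighted entropy of the subsets after it. The five-row dataset and the attribute names are made up and stand in for the example data, which is not reproduced in the text.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H = -sum p_i * log2(p_i) over the class proportions p_i."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy before the split minus the size-weighted entropy of each subset after it."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    after = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - after

# Made-up toy data with two hypothetical binary attributes.
rows = [
    {"light": 1, "wide": 1},
    {"light": 1, "wide": 1},
    {"light": 1, "wide": 0},
    {"light": 0, "wide": 1},
    {"light": 0, "wide": 1},
]
labels = ["sea bass", "sea bass", "salmon", "salmon", "salmon"]

gains = {a: information_gain(rows, labels, a) for a in ("light", "wide")}
best = max(gains, key=gains.get)
print({a: round(g, 3) for a, g in gains.items()}, best)
```

Here the entropy before any split is about 0.971 bits; splitting on "light" gains about 0.420 bits and splitting on "wide" about 0.171 bits, so "light" would be chosen as the splitting attribute.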
CLUSTERING ALGORITHMS
(Unsupervised Learning)
• Hierarchical clustering
• K-means clustering
• KPCA (Kernel Principal Component Analysis)
Questions?
