Pattern Recognition
UNIT 5
BY: SURBHI SAROHA
SYLLABUS
• Introduction
• Design principles of pattern recognition system
• Statistical Pattern recognition
• Parameter estimation methods
• Principal Component Analysis (PCA)
• Linear Discriminant Analysis (LDA)
• Classification Techniques
• Nearest Neighbor (NN) Rule
Cont…
• Bayes Classifier
• Support Vector Machine (SVM)
• K-means clustering
Pattern Recognition
• Pattern recognition is the automated recognition of patterns and
regularities in data.
• It has applications in statistical data analysis, signal processing, image
analysis, information retrieval, bioinformatics, data
compression, computer graphics and machine learning.
• Pattern recognition has its origins in statistics and engineering; some
modern approaches to pattern recognition include the use of machine
learning, due to the increased availability of big data and a new
abundance of processing power.
• However, these activities can be viewed as two facets of the same field of
application, and together they have undergone substantial development
over the past few decades.
• The field of pattern recognition is concerned with the automatic discovery
of regularities in data through the use of computer algorithms and with
the use of these regularities to take actions such as classifying the data
into different categories.[1]
Design principles of pattern
recognition system
• Pattern Recognition System
Patterns are everywhere in the digital world.
• A pattern can either be seen physically or it can
be observed mathematically by applying
algorithms.
• In pattern recognition, a pattern comprises
the following two fundamental things:
• A collection of observations
• The concept behind the observations
Cont…
• Design Principles of Pattern Recognition
In a pattern recognition system, two basic approaches are used for
recognizing the pattern or structure, and each can be implemented with
different techniques. These are –
– Statistical Approach and
– Structural Approach
• Statistical Approach:
Statistical methods are mathematical formulas, models, and techniques
that are used in the statistical analysis of raw research data.
• The application of statistical methods extracts information from research
data and provides different ways to assess the robustness of research
outputs.
– Two main statistical methods are used: Descriptive Statistics: it summarizes
data from a sample using indexes such as the mean or standard deviation.
– Inferential Statistics: it draws conclusions from data that are subject to random
variation.
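As a minimal illustration of these two kinds of methods, the Python sketch below computes descriptive summaries of a sample and then runs a simple inferential test; the sample values and the hypothesized mean of 5.0 are made up for illustration only.

```python
# A minimal sketch contrasting descriptive and inferential statistics.
import numpy as np
from scipy import stats

sample = np.array([4.1, 5.0, 4.7, 5.3, 4.9, 5.1, 4.6, 5.2])  # made-up data

# Descriptive statistics: summarize the sample itself.
print("mean:", sample.mean())
print("std: ", sample.std(ddof=1))  # sample standard deviation

# Inferential statistics: draw a conclusion subject to random variation,
# e.g. test whether the population mean could plausibly be 5.0.
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print("t =", t_stat, "p =", p_value)
```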
Statistical Pattern recognition
• Structural Approach:
The Structural Approach is a technique wherein
the learner masters sentence patterns.
Structures are the different arrangements of
words in one accepted style or another.
– Types of structures: Sentence Patterns
– Phrase Patterns
– Formulas
– Idioms
Difference Between Statistical
Approach and Structural Approach:
Parameter estimation methods
• The term parameter estimation refers to the process of
using sample data (in reliability engineering, usually
times-to-failure or success data) to estimate the
parameters of the selected distribution.
• Several parameter estimation methods are available.
• More specifically, we start with the relatively simple
method of Probability Plotting and continue with the
more sophisticated methods of Rank Regression (or
Least Squares), Maximum Likelihood Estimation and
Bayesian Estimation Methods.
Probability Plotting
• The least mathematically intensive method for parameter estimation is
the method of probability plotting.
• As the term implies, probability plotting involves a physical plot of the
data on specially constructed probability plotting paper. This method is
easily implemented by hand, given that one can obtain the appropriate
probability plotting paper.
• The method of probability plotting takes the cdf of the distribution and
attempts to linearize it by employing a specially constructed paper. The
following sections illustrate the steps in this method using the
2-parameter Weibull distribution as an example. The steps include:
• Linearize the unreliability function
• Construct the probability plotting paper
• Determine the X and Y positions of the plot points
• Use the plot to read any particular time or
reliability/unreliability value of interest.
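These plotting steps can also be approximated in code rather than on paper. The sketch below is a minimal Python version, assuming complete (uncensored) failure times and the common median-rank approximation; the failure times are made up for illustration.

```python
# A minimal probability-plotting sketch for the 2-parameter Weibull.
import numpy as np

times = np.sort(np.array([16.0, 34.0, 53.0, 75.0, 93.0, 120.0]))  # made up
n = len(times)
ranks = np.arange(1, n + 1)

# Median-rank approximation of the unreliability F(t) at each failure.
F = (ranks - 0.3) / (n + 0.4)

# Linearize the Weibull cdf F(t) = 1 - exp(-(t/eta)^beta):
# ln(-ln(1 - F)) = beta*ln(t) - beta*ln(eta), a straight line in ln(t).
x = np.log(times)
y = np.log(-np.log(1.0 - F))

beta, intercept = np.polyfit(x, y, 1)  # slope is the shape parameter beta
eta = np.exp(-intercept / beta)        # scale parameter from the intercept
print(f"beta = {beta:.2f}, eta = {eta:.1f}")
```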
Methods of Parameter Estimation
• The techniques used for parameter estimation are called estimators.
• Some estimators are:
• Probability Plotting: A method of finding parameter values where the data
is plotted on special plotting paper and parameters are derived from the
visual plot.
• Rank Regression (Least Squares): A method of finding parameter values
that minimizes the sum of the squares of the residuals.
• Maximum Likelihood Estimation: A method of finding parameter values
that, given a set of observations, will maximize the likelihood function.
• Bayesian Estimation Methods: A family of estimation methods that tries
to minimize the posterior expectation of a loss function (equivalently, to
maximize expected utility).
In practice, what this means is that existing knowledge about a situation is
formulated, data is gathered, and then posterior knowledge is used to
update our beliefs.
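For comparison with probability plotting, the sketch below fits the same kind of made-up failure-time data by maximum likelihood with scipy; the data and the choice of a 2-parameter Weibull are assumptions for illustration.

```python
# A minimal maximum likelihood estimation sketch for the Weibull.
import numpy as np
from scipy import stats

times = np.array([16.0, 34.0, 53.0, 75.0, 93.0, 120.0])  # made-up data

# Fit a 2-parameter Weibull (location fixed at 0) by maximum likelihood.
beta, loc, eta = stats.weibull_min.fit(times, floc=0)
print(f"MLE: beta = {beta:.2f}, eta = {eta:.1f}")

# The fitted parameters maximize the log-likelihood of the observed data.
loglik = np.sum(stats.weibull_min.logpdf(times, beta, loc, eta))
print(f"log-likelihood at the MLE: {loglik:.2f}")
```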
Principal Component Analysis (PCA)
• Principal component analysis (PCA) is a technique used to
emphasize variation and bring out strong patterns in a dataset.
• It's often used to make data easy to explore and visualize.
• 2D example
• First, consider a dataset in only two dimensions, like (height,
weight).
• This dataset can be plotted as points in a plane.
• But if we want to tease out variation, PCA finds a new coordinate
system in which every point has a new (x,y) value.
• The axes don't actually mean anything physical; they're
combinations of height and weight called "principal components,"
chosen so that one axis captures as much of the variation as possible.
Cont…
• 3D example
• With three dimensions, PCA is more useful, because
it's hard to see through a cloud of data.
• For example, original data plotted in 3D can be
projected into 2D through a transformation no
different from finding a camera angle: rotate the
axes to find the best angle.
• The PCA transformation ensures that the horizontal
axis PC1 has the most variation, the vertical axis PC2
the second-most, and a third axis PC3 the least.
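A minimal sketch of the transformation itself, using plain numpy on a made-up 3D point cloud: center the data, eigendecompose its covariance matrix, and project onto PC1 and PC2.

```python
# A minimal PCA sketch: project correlated 3D data onto its top two axes.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[3.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.5, 0.2]])
X = rng.normal(size=(200, 3)) @ A       # made-up correlated cloud

Xc = X - X.mean(axis=0)                 # center each feature
cov = np.cov(Xc, rowvar=False)          # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

order = np.argsort(eigvals)[::-1]       # sort components by variance
components = eigvecs[:, order]

X2d = Xc @ components[:, :2]            # project 3D points onto PC1 and PC2
print("variance captured per component:", eigvals[order] / eigvals.sum())
```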
Linear Discriminant Analysis (LDA)
• Linear discriminant analysis (LDA), normal discriminant analysis (NDA),
or discriminant function analysis is a generalization of Fisher's linear discriminant,
a method used in statistics and other fields, to find a linear combination of
features that characterizes or separates two or more classes of objects or events.
• The resulting combination may be used as a linear classifier, or, more commonly,
for dimensionality reduction before later classification.
• LDA is closely related to analysis of variance (ANOVA) and regression analysis,
which also attempt to express one dependent variable as a linear combination of
other features or measurements.
• However, ANOVA uses categorical independent variables and
a continuous dependent variable, whereas discriminant analysis has
continuous independent variables and a categorical dependent variable (i.e. the
class label).
• Logistic regression and probit regression are more similar to LDA than ANOVA is, as
they also explain a categorical variable by the values of continuous independent
variables. These other methods are preferable in applications where it is not
reasonable to assume that the independent variables are normally distributed,
which is a fundamental assumption of the LDA method.
Cont…
• LDA is also closely related to principal component analysis (PCA)
and factor analysis in that they all look for linear combinations of
variables which best explain the data.
• LDA explicitly attempts to model the difference between the classes of
data.
• PCA, in contrast, does not take into account any difference in class, and
factor analysis builds the feature combinations based on differences
rather than similarities.
• Discriminant analysis is also different from factor analysis in that it is not
an interdependence technique: a distinction between independent
variables and dependent variables (also called criterion variables) must be
made.
• LDA works when the measurements made on independent variables for
each observation are continuous quantities.
• When dealing with categorical independent variables, the equivalent
technique is discriminant correspondence analysis.
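A minimal sketch of LDA used in both roles described above, dimensionality reduction and linear classification, via scikit-learn; the iris dataset is an illustrative choice.

```python
# A minimal LDA sketch: reduce 4 features to 2 discriminants and classify.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)  # class labels guide the projection

print("reduced shape:", X_lda.shape)          # (150, 2)
print("training accuracy:", lda.score(X, y))  # LDA also acts as a classifier
```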
Classification Techniques
• Various types of classification algorithms:
• Logistic Regression
• Naive Bayes Classifier
• K-Nearest Neighbors
• Decision Tree
– Random Forest
• Support Vector Machines
Cont…
• Logistic Regression
• Logistic regression is a method used to predict
a binary outcome: either something happens, or
it does not. This can be exhibited as Yes/No,
Pass/Fail, Alive/Dead, etc.
• Naive Bayes Classifier
• Naive Bayes calculates the probability that
a data point belongs to a certain category or
does not. In text analysis, it can be used to
categorize words or phrases as belonging to a
preset "tag" (classification) or not.
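A minimal logistic-regression sketch with scikit-learn; the study-hours data and Pass/Fail labels are made up for illustration.

```python
# A minimal logistic regression sketch: predict a binary Pass/Fail outcome.
import numpy as np
from sklearn.linear_model import LogisticRegression

hours = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]])
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 = Fail, 1 = Pass

clf = LogisticRegression().fit(hours, passed)
print(clf.predict([[1.2], [3.2]]))  # predicted classes for new students
print(clf.predict_proba([[3.2]]))   # probability of each outcome
```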
Cont…
• K-nearest Neighbors
• K-nearest neighbors (k-NN) is a pattern
recognition algorithm that uses a training dataset
to find the k closest neighbors of a new example.
• When k-NN is used in classification, a new data
point is placed in the category shared by most of
its k nearest neighbors: if k = 1, it is assigned to
the class of its single nearest neighbor; otherwise
the class is decided by a plurality poll of its neighbors.
Cont…
• Decision Tree
• A decision tree is a supervised learning algorithm
that works well for classification problems, as it's
able to order classes at a precise level.
• It works like a flow chart, separating data points
into two similar categories at a time from the
"tree trunk" to "branches" to "leaves," where the
categories become more finely similar.
• This creates categories within categories,
allowing for organic classification with limited
human supervision.
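A minimal decision-tree sketch with scikit-learn; printing the fitted tree shows the flow-chart structure described above (the iris dataset and the depth limit are illustrative choices).

```python
# A minimal decision tree sketch: learn threshold splits and print them.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The printed rules read like a flow chart, from trunk to leaves.
print(export_text(tree, feature_names=load_iris().feature_names))
```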
Random Forest
• The random forest algorithm is an expansion of
the decision tree, in that you first construct a
multitude of decision trees with training data,
then classify new data by the majority vote of
those trees, the "random forest" (see the sketch below).
• Support Vector Machines
• A support vector machine (SVM) uses algorithms
to train and classify data within degrees of
polarity, taking it to a degree
beyond X/Y prediction.
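The sketch referenced above: a minimal random forest with scikit-learn, in which many decision trees vote on each prediction; the dataset and hyperparameters are illustrative assumptions.

```python
# A minimal random forest sketch: an ensemble of trees voting by majority.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_tr, y_tr)
print("test accuracy:", forest.score(X_te, y_te))
```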
Nearest Neighbor(NN) Rule
• K-Nearest Neighbors is one of the most basic yet
essential classification algorithms in Machine Learning.
• It belongs to the supervised learning domain and finds
intense application in pattern recognition, data mining
and intrusion detection.
• It is widely applicable in real-life scenarios since it is
non-parametric, meaning it does not make any
underlying assumptions about the distribution of data
(as opposed to other algorithms such as GMM, which
assume a Gaussian distribution of the given data).
Cont…
• In statistics, the k-nearest neighbors algorithm (k-NN) is a non-
parametric machine learning method first developed by Evelyn
Fix and Joseph Hodges in 1951, and later expanded by Thomas
Cover.
• It is used for classification and regression.
• In both cases, the input consists of the k closest training examples
in feature space. The output depends on whether k-NN is used for
classification or regression:
• In k-NN classification, the output is a class membership. An object is
classified by a plurality vote of its neighbors, with the object being
assigned to the class most common among its k nearest neighbors
(k is a positive integer, typically small). If k = 1, then the object is
simply assigned to the class of that single nearest neighbor.
• In k-NN regression, the output is the property value for the object.
This value is the average of the values of the k nearest neighbors.
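A minimal k-NN classification sketch in plain numpy, implementing the plurality vote described above; the training points, labels, and choice of k are made up for illustration.

```python
# A minimal k-NN sketch: Euclidean distances, then a plurality vote.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    dists = np.linalg.norm(X_train - x_new, axis=1)  # distance to each point
    nearest = np.argsort(dists)[:k]                  # indices of k closest
    votes = Counter(y_train[nearest])                # plurality vote
    return votes.most_common(1)[0][0]

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array([0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([4.5, 4.5]), k=3))  # -> 1
```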
Bayes Classifier
• A Naive Bayes classifier is a probabilistic machine
learning model that's used for classification tasks.
• The crux of the classifier is based on the Bayes
theorem.
• Bayes Theorem: P(A|B) = P(B|A) P(A) / P(B)
• Using Bayes theorem, we can find the probability
of A happening, given that B has occurred. Here, B is
the evidence and A is the hypothesis. The assumption
made here is that the predictors/features are
independent; that is, the presence of one particular
feature does not affect the other. Hence it is called naive.
Types of Naive Bayes Classifier:
• Multinomial Naive Bayes:
• This is mostly used for document classification problems, i.e. whether
a document belongs to the category of sports, politics, technology,
etc. The features/predictors used by the classifier are the frequencies
of the words present in the document.
• Bernoulli Naive Bayes:
• This is similar to the multinomial Naive Bayes, but the predictors are
Boolean variables. The parameters that we use to predict the class
variable take on only the values yes or no, for example whether a word
occurs in the text or not.
• Gaussian Naive Bayes:
• When the predictors take on continuous values and are not
discrete, we assume that these values are sampled from a Gaussian
distribution.
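A minimal Gaussian Naive Bayes sketch with scikit-learn, matching the continuous-predictor case above; the iris dataset is an illustrative choice.

```python
# A minimal Gaussian Naive Bayes sketch for continuous predictors.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
nb = GaussianNB().fit(X, y)  # per-class Gaussians fitted to each feature

print("training accuracy:", nb.score(X, y))
print("class probabilities for one sample:", nb.predict_proba(X[:1]))
```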
Support Vector Machine(SVM)
• In machine learning, support-vector machines (SVMs, also support-
vector networks) are supervised learning models with associated
learning algorithms that analyze data
for classification and regression analysis.
• Developed at AT&T Bell Laboratories by Vapnik with colleagues
(Boser et al., 1992, Guyon et al., 1993, Vapnik et al., 1997), SVMs
are one of the most robust prediction methods, being based on
statistical learning frameworks or VC theory proposed by Vapnik
and Chervonenkis (1974) and Vapnik (1982, 1995).
• Given a set of training examples, each marked as belonging to one
of two categories, an SVM training algorithm builds a model that
assigns new examples to one category or the other, making it a
non-probabilistic binary linear classifier (although methods such
as Platt scaling exist to use SVM in a probabilistic classification
setting).
Cont…
• An SVM maps training examples to points in space so as to
maximise the width of the gap between the two categories.
• Support vector machines (SVMs) are powerful yet flexible
supervised machine learning algorithms which are used
both for classification and regression.
• But generally, they are used in classification problems.
• SVMs were first introduced in the 1960s, but they were
later refined in the 1990s.
• SVMs have their unique way of implementation as
compared to other machine learning algorithms.
• Lately, they are extremely popular because of their ability
to handle multiple continuous and categorical variables.
Working of SVM
• An SVM model is basically a representation of
different classes in a hyperplane in
multidimensional space.
• The hyperplane will be generated in an
iterative manner by SVM so that the error can
be minimized.
• The goal of SVM is to divide the datasets into
classes to find a maximum marginal
hyperplane (MMH).
The following are important concepts
in SVM −
• Support Vectors − Data points that are closest to the
hyperplane are called support vectors. The separating line
is defined with the help of these data points.
• Hyperplane − A decision plane or space that divides a
set of objects having different classes.
• Margin − It may be defined as the gap between two
lines on the closest data points of different classes. It
can be calculated as the perpendicular distance from
the line to the support vectors. A large margin is
considered a good margin and a small margin is
considered a bad margin.
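A minimal sketch tying these concepts together: a linear SVM fit with scikit-learn on two made-up separable blobs, exposing the support vectors that define the maximum marginal hyperplane.

```python
# A minimal linear SVM sketch: fit a maximum-margin classifier and inspect
# the support vectors (the data points closest to the hyperplane).
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=0)

svm = SVC(kernel="linear", C=1.0).fit(X, y)

print("support vectors per class:", svm.n_support_)
print("support vectors:\n", svm.support_vectors_)
print("prediction for a new point:", svm.predict([[0.0, 2.0]]))
```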
K-means clustering
• We are given a data set of items, with certain
features, and values for these features (like a
vector).
• The task is to categorize those items into
groups.
• To achieve this, we will use the k-means
algorithm, an unsupervised learning
algorithm.
Cont…
• (It will help if you think of items as points in an n-
dimensional space.) The algorithm will categorize the
items into k groups of similarity. To calculate that
similarity, we will use the Euclidean distance as a
measurement.
• The algorithm works as follows:
• First we initialize k points, called means, randomly.
• We categorize each item to its closest mean and we
update the mean's coordinates, which are the averages
of the items categorized in that mean so far.
• We repeat the process for a given number of iterations
and at the end, we have our clusters.
The above algorithm in pseudocode:
• Initialize k means with random values
• For a given number of iterations:
• Iterate through items:
• Find the mean closest to the item
• Assign item to mean
• Update mean
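A minimal Python version of the pseudocode above, in plain numpy; k, the iteration count, and the sample items are illustrative assumptions.

```python
# A minimal k-means sketch: initialize k means, assign, update, repeat.
import numpy as np

def kmeans(items, k=2, iterations=100, seed=0):
    rng = np.random.default_rng(seed)
    means = items[rng.choice(len(items), size=k, replace=False)]  # init
    for _ in range(iterations):
        # Assign each item to the closest mean (Euclidean distance).
        dists = np.linalg.norm(items[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update each mean to the average of the items assigned to it.
        for j in range(k):
            if np.any(labels == j):
                means[j] = items[labels == j].mean(axis=0)
    return means, labels

items = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.0], [8.5, 7.5]])
means, labels = kmeans(items, k=2)
print("cluster means:\n", means)
print("cluster labels:", labels)
```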
• Thank you!