Pattern Recognition
Pattern recognition is a branch of machine learning that focuses on the recognition of patterns and
regularities in data, although it is in some cases considered to be nearly synonymous with machine learning.
Pattern recognition systems are in many cases trained from labeled "training" data.
Pattern recognition is the scientific discipline that concerns the description and classification of patterns. Typical tasks include:
 Decision making
 Object and pattern recognition
Pattern Recognition applications
Build a machine that can recognize patterns:
 Speech recognition
 Fingerprint identification
 OCR (Optical Character Recognition)
 DNA sequence identification
 Text Classification
Basic Structure
The task of the pattern recognition system is to classify an object into a correct class based on the
measurements about the object. Note that possible classes are usually well-defined already before the design
of the pattern recognition system. Many pattern recognition systems can be thought to consist of five stages:
1. Sensing (measurement);
2. Pre-processing and segmentation;
3. Feature extraction;
4. Classification;
5. Post-processing
Sensing
Sensing refers to some measurement or observation about the object to be classified. For example, the data
can consist of sounds or images and sensing equipment can be a microphone array or a camera.
Pre-processing
Pre-processing refers to filtering the raw data for noise suppression and other operations performed on the
raw data to improve its quality. In segmentation, the measurement data are partitioned so that each part
represents exactly one object to be classified. For example, in address recognition, an image of the whole
address must be divided into images each representing just one character.
Feature extraction
Feature extraction reduces the measurement data to a set of descriptive features. This matters especially
when dealing with pictorial information, where the amount of data per object can be huge: a high-resolution
facial photograph (for face recognition) can contain 1024*1024 pixels.
Classification
The classifier takes as input the feature vector extracted from the object to be classified. It then assigns the
feature vector (i.e., the object) to the most appropriate class. In address recognition, the classifier
receives the features extracted from a sub-image containing just one character and assigns it to one of the
following classes: 'A', 'B', 'C', ..., '0', '1', ..., '9'. The classifier can thus be thought of as a mapping from the
feature space to the set of possible classes.
Post-processing
A pattern recognition system rarely exists in a vacuum. The final task of the pattern recognition system is to
decide upon an action based on the classification result(s). A simple example is a bottle recycling machine,
which places bottles and cans into the correct boxes for further processing.
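To make the flow of these five stages concrete, here is a minimal Python sketch of such a pipeline. All function names and the toy stage implementations are hypothetical placeholders, not part of the original notes.

```python
# Minimal sketch of the five-stage pattern recognition pipeline.
# All names and stage implementations are illustrative placeholders.

def sense():
    """Sensing: acquire a raw measurement (here, a fake grayscale image)."""
    return [[0, 255, 0], [0, 255, 0], [0, 255, 0]]

def preprocess(raw):
    """Pre-processing: suppress noise, e.g. by thresholding to a binary image."""
    return [[1 if px > 127 else 0 for px in row] for row in raw]

def extract_features(img):
    """Feature extraction: reduce the image to a small feature vector."""
    ink = sum(map(sum, img))          # number of 'on' pixels
    height, width = len(img), len(img[0])
    return [ink / (height * width)]   # single feature: ink density

def classify(features):
    """Classification: map the feature vector to a class label."""
    return "1" if features[0] < 0.5 else "8"   # toy decision rule

def postprocess(label):
    """Post-processing: act on the classification result."""
    print(f"Recognized character: {label}")

postprocess(classify(extract_features(preprocess(sense()))))
```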
The Design Cycle
• Data collection
• Feature Choice
• Model Choice
• Training
• Evaluation
• Computational Complexity
Data Collection
How do we know when we have collected an adequately large and representative set of examples for
training and testing the system?
Feature Choice
Depends on the characteristics of the problem domain. Good features are simple to extract, invariant to
irrelevant transformations, and insensitive to noise.
Model Choice
If we are unsatisfied with the performance of our fish classifier, we may want to jump to another class of model.
Training
Use data to determine the classifier. Many different procedures exist for training classifiers and choosing models.
Evaluation
 Measure the error rate
 Different feature set
 Different training methods
 Different training and test data sets
Computational Complexity
What is the trade-off between computational ease and performance?
Statistical Decision Making
Parametric Decision Making
We know, or are willing to assume, the general form of the probability distribution function or density
function for each class, but not the values of parameters such as the mean or variance.
Nonparametric Decision Making
We do not have a sufficient basis for assuming even the general form of the relevant densities.
Bayes’ Theorem
• Bayesian decision making refers to choosing the most likely class, given the value of the feature or
features.
• The probability of class membership is calculated from Bayes’ Theorem.
• Let the feature value be x and the class of interest be C.
• Then P(x) is the probability distribution of x in the entire population.
• P(C) is the prior probability that a random sample is a member of class C.
• P(x|C) is the conditional probability of obtaining x given that the sample is from class C.
• We have to estimate the probability P(C|x) that a sample belongs to class C, given that it has the
feature x.
• Conditional Probability
• The probability of A occurring given that B has occurred is denoted by P(A|B), and is read as "P of
A given B".
• Since we know in advance that B has occurred, P(A|B) is the fraction of B in which A also occurs. Thus

P(A|B) = P(A and B) / P(B) and P(B|A) = P(B and A) / P(A),

so that P(A and B) = P(B) P(A|B) and P(B and A) = P(A) P(B|A).
• The conditional probability that a sample comes from class C and has the feature value x is therefore

P(C and x) = P(C) P(x|C) = P(x) P(C|x).

• Rearranging gives

P(C|x) = P(C) P(x|C) / P(x),

which is known as Bayes’ Theorem. The variable x can represent a single feature or a feature vector.
Bayes’ Theorem for k-classes
• Let C1, ..., Ck be mutually exclusive classes, i.e., they do not overlap and every sample belongs to
exactly one of the classes.
• If a sample may belong to class A or B, to both, or to neither, then four new mutually exclusive
classes C1, C2, C3, and C4 are defined by

C1 = A and B,   C2 = A and not B,   C3 = not A and B,   C4 = not A and not B.

• Thus k nonexclusive classes can define up to 2^k mutually exclusive classes.
• Bayes’ Theorem for multiple features is obtained by replacing the value of a single feature x by the
value of a feature vector x.
• In the discrete case, if there are k classes, we obtain

P(Ci|x) = P(Ci) P(x|Ci) / [ P(C1) P(x|C1) + ... + P(Ck) P(x|Ck) ],

since the total probability of the feature value is P(x) = P(C1) P(x|C1) + ... + P(Ck) P(x|Ck).
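The theorem can be applied numerically; the short sketch below chooses the most likely of three classes, with priors and likelihoods invented purely for illustration.

```python
# Bayes' theorem for k classes: P(Ci|x) = P(Ci) P(x|Ci) / sum_j P(Cj) P(x|Cj).
# The priors and likelihoods below are made-up illustrative numbers.

priors = {"C1": 0.5, "C2": 0.3, "C3": 0.2}          # P(Ci)
likelihoods = {"C1": 0.10, "C2": 0.40, "C3": 0.25}  # P(x|Ci) for the observed x

evidence = sum(priors[c] * likelihoods[c] for c in priors)   # P(x)
posteriors = {c: priors[c] * likelihoods[c] / evidence for c in priors}

for c, p in posteriors.items():
    print(f"P({c}|x) = {p:.3f}")

best = max(posteriors, key=posteriors.get)
print("Bayes decision:", best)   # choose the most likely class -> C2
```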
Nonparametric Decision Making
Nearest Neighbor Classification Techniques
The single Nearest Neighbor Technique
• Bypassing the problem of estimating probability densities, the single nearest neighbor technique
simply classifies an unknown sample as belonging to the same class as the most similar or
"nearest" sample point in the training set of data, which is often called a reference set.
• Nearest can mean the smallest Euclidean distance in n-dimensional feature space, which is the
distance between two points a = (a1, ..., an) and b = (b1, ..., bn), defined by

d_E(a, b) = sqrt( (a1 - b1)^2 + ... + (an - bn)^2 ),

where n is the number of features.
• Although Euclidean distance is the most commonly used measure of dissimilarity/similarity
between feature vectors, it is not always the best metric. Squaring the differences before summation
places emphasis on features with large dissimilarity.
• A more moderate approach is simply the sum of the absolute differences in each feature, which also
saves computing time. The distance metric would then be

d_CB(a, b) = |a1 - b1| + ... + |an - bn|.

• The sum of absolute distances is sometimes called the city block distance, the Manhattan metric, or
the taxi-cab distance.
• The name comes from the distance between two locations in a city: on a rectangular grid of two-way
streets, the number of blocks north (or south) plus the number of blocks east (or west) equals the
total distance traveled.
• An extreme metric which considers only the most dissimilar pair of features is the maximum
distance metric

d_max(a, b) = max over i = 1, ..., n of |ai - bi|.

• A generalization of the three distances is the Minkowski distance, defined by

d_r(a, b) = ( |a1 - b1|^r + ... + |an - bn|^r )^(1/r),

where r is an adjustable parameter (r = 1 gives the city block distance and r = 2 the Euclidean distance).
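To make the four metrics concrete, the sketch below implements them and uses one in a single nearest neighbor classifier; the reference set and query point are invented for illustration.

```python
import math

def d_euclid(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def d_cityblock(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def d_max(a, b):
    return max(abs(ai - bi) for ai, bi in zip(a, b))

def d_minkowski(a, b, r):
    # r = 1 gives city block, r = 2 Euclidean; large r approaches d_max
    return sum(abs(ai - bi) ** r for ai, bi in zip(a, b)) ** (1.0 / r)

def nearest_neighbor(x, reference_set, dist=d_euclid):
    """Classify x as the class of its nearest sample in the reference set."""
    nearest, label = min(reference_set, key=lambda s: dist(x, s[0]))
    return label

# Invented reference set: (feature vector, class label) pairs
refs = [((4, 4), "A"), ((8, 4), "A"), ((24, 4), "B"), ((24, 12), "B")]
print(nearest_neighbor((10, 5), refs))   # -> "A"
```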
Clustering
• Clustering refers to the process of grouping samples so that the samples are similar within each
group. The groups are called clusters.
• Clustering can be classified into two major types, Hierarchical and Partitioned clustering.
Hierarchical clustering algorithms can be further divided into agglomerative and divisive.
• Hierarchical clustering refers to a process that organizes data into large groups, which contain
smaller groups, and so on.
• A hierarchical clustering is usually drawn pictorially as a tree or dendrogram, in which the finest
grouping is at the bottom, where each sample forms its own cluster. (Figure: example dendrogram.)
• Hierarchical clustering algorithms are called agglomerative if they build the dendrogram from the
bottom up and they are called divisive if they build the dendrogram from the top down.
• The agglomerative clustering algorithm for n samples is as follows:
1. Begin with n clusters, each consisting of one sample.
2. Repeat step 3 a total of n-1 times.
3. Find the most similar clusters Ci and Cj and merge them into one cluster. If there is a tie, merge
the first pair found.
Hierarchical Clustering
• One way to measure the similarity between clusters is to define a function that measures the distance
between clusters.
• In cluster analysis nearest neighbor techniques are used to measure the distance between pairs of
samples.
The Single-Linkage Algorithm
• It is also known as the minimum method or the nearest neighbor method.
• The Single-Linkage Algorithm is obtained by defining the distance between two clusters to be the
smallest distance between two points such that one point is in each cluster.
• Formally, if Ci and Cj are clusters, the distance between them is defined as

D(Ci, Cj) = min over a in Ci and b in Cj of d(a, b),

where d(a, b) denotes the distance between the samples a and b.
Hierarchical Clustering: The Single-Linkage Algorithm Example
• Perform hierarchical clustering of five samples with two features; use Euclidean distance for the
distance between two samples.
Sample   x    y
1        4    4
2        8    4
3       15    8
4       24    4
5       24   12
• The pairwise Euclidean distances between the five samples are:

      1     2     3     4     5
1     -    4.0  11.7  20.0  21.5
2    4.0    -   8.1  16.0  17.9
3   11.7   8.1    -   9.8   9.8
4   20.0  16.0   9.8    -   8.0
5   21.5  17.9   9.8   8.0    -

• The smallest distance is 4.0, between clusters {1} and {2}, so they are merged. Now the number of
clusters becomes four: {1,2}, {3}, {4}, {5}
{1,2} 3 4 5
{1,2} - 8.1 16.0 17.9
3 8.1 - 9.8 9.8
4 16.0 9.8 - 8.0
5 17.9 9.8 8.0 -
• The distances d(1,3)=11.7 and d(2,3)=8.1; thus for the single-linkage algorithm the distance between
clusters {1,2} and {3} is the minimum, 8.1, and so on.
• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged.
• At this level there are three clusters: {1,2}, {3}, {4,5}
{1,2} 3 {4,5}
{1,2} - 8.1 16.0
3 8.1 - 9.8
{4,5} 16.0 9.8 -
• Since the minimum value in this step is 8.1, clusters {1,2} and {3} are merged. Now there are
two clusters: {1,2,3} and {4,5}.
• The next step merges the two remaining clusters at a distance of 9.8, which completes the
dendrogram.
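The merging process above can be reproduced with a short sketch; the linkage argument selects the single (minimum), complete (maximum), or average cluster distance, so the same loop also covers the next two algorithms. This is a teaching sketch, not an optimized implementation.

```python
import math

points = {1: (4, 4), 2: (8, 4), 3: (15, 8), 4: (24, 4), 5: (24, 12)}

def d(a, b):
    return math.dist(points[a], points[b])   # Euclidean distance between samples

def cluster_dist(ci, cj, linkage):
    pair_dists = [d(a, b) for a in ci for b in cj]
    if linkage == "single":
        return min(pair_dists)      # nearest neighbor method
    if linkage == "complete":
        return max(pair_dists)      # farthest neighbor method
    return sum(pair_dists) / len(pair_dists)   # average over all pairs

def agglomerate(linkage):
    clusters = [{i} for i in points]           # begin with n singleton clusters
    while len(clusters) > 1:                   # n-1 merges in total
        # find the pair of clusters with the smallest cluster distance
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: cluster_dist(clusters[p[0]], clusters[p[1]], linkage))
        print(f"merge {sorted(clusters[i])} and {sorted(clusters[j])} "
              f"at {cluster_dist(clusters[i], clusters[j], linkage):.1f}")
        clusters = ([c for k, c in enumerate(clusters) if k not in (i, j)]
                    + [clusters[i] | clusters[j]])

agglomerate("single")   # merges at 4.0, 8.0, 8.1, 9.8 as in the example above
```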
Hierarchical Clustering
The Complete-Linkage Algorithm
• It is also known as the maximum method or the farthest neighbor method.
• It is obtained by defining the distance between two clusters to be the largest distance between a
sample in one cluster and a sample in the other cluster.
• Formally, if Ci and Cj are clusters, we define

D(Ci, Cj) = max over a in Ci and b in Cj of d(a, b).
Hierarchical Clustering: The Complete-Linkage Algorithm Example
• Perform hierarchical clustering of five samples with two features; use Euclidean distance for the
distance between two samples.
Sample   x    y
1        4    4
2        8    4
3       15    8
4       24    4
5       24   12
• The pairwise distances are the same as in the previous example. The smallest distance is 4.0, between
clusters {1} and {2}, so they are merged. Now the number of clusters becomes four: {1,2}, {3}, {4}, {5}
{1,2} 3 4 5
{1,2} - 11.7 20.0 21.5
3 11.7 - 9.8 9.8
4 20.0 9.8 - 8.0
5 21.5 9.8 8.0 -
• The distances d(1,3)=11.7 and d(2,3)=8.1; thus for the complete-linkage algorithm the distance between
clusters {1,2} and {3} is the maximum, 11.7, and so on.
• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged.
• At this level there are three clusters: {1,2}, {3}, {4,5}
{1,2} 3 {4,5}
{1,2} - 11.7 21.5
3 11.7 - 9.8
{4,5} 21.5 9.8 -
• Since the minimum value in this step is 9.8, clusters {3} and {4,5} are merged. Now there are
two clusters: {1,2} and {3,4,5}.
• The next step merges the last two clusters at a distance of 21.5.
The Average-Linkage Algorithm
• The Average-Linkage Algorithm is a compromise between the extremes of the single- and complete-
linkage algorithms.
• It is also known as the unweighted pair-group method using arithmetic averages (UPGMA).
• It is obtained by defining the distance between two clusters to be the average distance between a
sample in one cluster and a sample in the other cluster.
• Formally, if Ci with ni members and Cj with nj members are clusters, we define

D(Ci, Cj) = (1 / (ni * nj)) * (sum of d(a, b) over all a in Ci and b in Cj).

• After the first table of the previous example, the clusters in the second step were {1,2}, {3}, {4}, {5}.
In this step, for the average-linkage algorithm, the distance between clusters {1,2} and {3} is the
average of the distances d(1,3)=11.7 and d(2,3)=8.1, namely 9.9, and so on.
{1,2} 3 4 5
{1,2} - 9.9 18.0 19.7
3 9.9 - 9.8 9.8
4 18 9.8 - 8.0
5 19.7 9.8 8.0 -
• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged. The clusters are now
{1,2}, {3}, {4,5}
{1,2} 3 {4,5}
{1,2} - 9.9 18.9
3 9.9 - 9.8
{4,5} 18.9 9.8 -
• Since the minimum value in this step is 9.8, clusters {3} and {4,5} are merged. Now there are
two clusters: {1,2} and {3,4,5}.
• The next step merges the last two clusters at a distance of 14.4.
Hierarchical Clustering: Ward’s Method
• Ward’s Method is also called the minimum-variance method. It begins with one cluster for each
sample.
• At each iteration, among all cluster pairs, it merges the pair that produces the smallest squared error
for the resulting set of clusters. The squared error for each cluster is defined as follows:
• Let a cluster contain m samples x1, ..., xm, where xi is the feature vector (xi1, ..., xid).
• The vector composed of the means of each feature,

u = (u1, ..., ud) with uj = (x1j + ... + xmj) / m,

is called the mean vector or centroid of the cluster.
• The squared error for a cluster is the sum of the squared distances, in each feature, from the cluster
members to their mean:

E = sum over i = 1, ..., m and j = 1, ..., d of (xij - uj)^2.

• The squared error is thus equal to the total variance of the cluster times the number of
samples in the cluster m, where the total variance is defined to be
the sum of the variances of each feature. The squared error for a set of clusters is defined to be the
sum of the squared errors for the individual clusters.
Sample   x    y
1        4    4
2        8    4
3       15    8
4       24    4
5       24   12
• Example: Begin with five clusters, one sample in each. The squared error is 0. There are 10 possible
ways to merge a pair of clusters: merge {1} and {2}, merge {1} and {3}, and so on.
• Consider merging {1} and {2}: the feature vector of sample 1 is (4,4) and that of sample 2 is (8,4), so
the feature means are 6 and 4. The squared error for cluster {1,2} is

E = (4 - 6)^2 + (8 - 6)^2 + (4 - 4)^2 + (4 - 4)^2 = 8.
• The squared error for each of the clusters {3}, {4}, {5} is 0. Thus the total squared error for the
clustering {1,2}, {3}, {4}, {5} is 8 + 0 + 0 + 0 = 8.
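The squared error computation can be checked with a short helper; it reproduces E = 8 for cluster {1,2} and 68.5 for {1,3}, and can be reused for the other candidate merges in the tables below.

```python
def squared_error(cluster):
    """Sum of squared distances, over every feature, from members to their mean."""
    n, dims = len(cluster), len(cluster[0])
    centroid = [sum(x[j] for x in cluster) / n for j in range(dims)]
    return sum((x[j] - centroid[j]) ** 2 for x in cluster for j in range(dims))

print(squared_error([(4, 4), (8, 4)]))    # 8.0  for cluster {1,2}
print(squared_error([(4, 4), (15, 8)]))   # 68.5 for cluster {1,3}
```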
Clusters                 Squared Error, E
{1,2},{3},{4},{5} 8.0
{1,3},{2},{4},{5} 68.5
{1,4},{2},{3},{5} 200.0
{1,5},{2},{3},{4} 232.0
{2,3},{1},{4},{5} 32.5
{2,4},{1},{3},{5} 128.0
{2,5},{1},{3},{4} 160.0
{3,4},{1},{2},{5} 48.5
{3,5},{1},{2},{4} 48.5
{4,5},{1},{2},{3} 32.0
• Since the minimum squared error is 8, the merge producing {1,2}, {3}, {4}, {5} is accepted. There are
6 possible sets of clusters resulting from {1,2}, {3}, {4}, {5}:
Clusters                 Squared Error, E
{1,2,3},{4},{5} 72.7
{1,2,4},{3},{5} 224.0
{1,2,5},{3},{4} 266.7
{1,2},{3,4},{5} 56.5
{1,2},{3,5},{4} 56.5
{1,2},{4,5},{3} 40.0
• From the table shown, the minimum squared error is 40.0, and it is for {1,2}, {4,5}, {3}, so that merge
is accepted.
• There are 3 possible sets of clusters resulting from {1,2}, {4,5}, {3}; they are shown in the next table.
The minimum squared error there is 94.0, for {1,2}, {3,4,5}.
• At last, the two remaining clusters are merged and the hierarchical clustering is complete.
Clusters                 Squared Error, E
{1,2,3},{4,5} 104.7
{1,2,4,5},{3} 380.0
{1,2},{3,4,5} 94.0
• The resulting dendrogram is shown in the figure (omitted here).
Partitional Clustering
• In partitional clustering, the goal is usually to create one set of clusters that partitions the data into
similar groups.
• Samples close to one another are assumed to be similar, and the task is to group data that are close
together.
• In many cases, the number of clusters to be constructed is specified in advance.
• If a partitional clustering algorithm divides the data set into two groups, and each of these is further
divided into two parts, and so on, a hierarchical dendrogram can be produced from the top down.
• The hierarchy produced by this divisive technique is more general than the bottom-up hierarchies,
because a group can be divided into more than two subgroups in one step.
• Another advantage of partitional techniques is that often only the top part of the tree, which shows the
main groups and possibly their subgroups, is required, so there may be no need to complete the
dendrogram.
Partitional Clustering: Forgy’s Algorithm
• Besides the data, input to the algorithm consists of k, the number of clusters to be constructed, and k
samples called seed points. The seed points could be chosen randomly, or some knowledge of the
desired cluster structure could be used to guide their selection.
• Step 1. Initialize the cluster centroids to the seed points.
• Step 2. For each sample, find the cluster centroid nearest it. Put the sample in the cluster identified
with this nearest cluster centroid.
• Step 3. If no samples changed clusters in step 2, stop.
• Step 4. Compute the centroids of the resulting clusters and go to step 2.
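A direct transcription of these four steps might look like the following sketch, assuming Euclidean distance and samples given as tuples; the helper names are invented.

```python
import math

def forgy(samples, seeds):
    """Forgy's algorithm: iterate assignment (step 2) and centroid update (step 4)."""
    centroids = list(seeds)                           # step 1: seed points
    assignment = None
    while True:
        # step 2: assign each sample to the nearest cluster centroid
        new_assignment = [min(range(len(centroids)),
                              key=lambda k: math.dist(s, centroids[k]))
                          for s in samples]
        if new_assignment == assignment:              # step 3: no changes -> stop
            return centroids, assignment
        assignment = new_assignment
        # step 4: recompute the centroid of each resulting cluster
        for k in range(len(centroids)):
            members = [s for s, a in zip(samples, assignment) if a == k]
            if members:
                centroids[k] = tuple(sum(x) / len(members) for x in zip(*members))

data = [(4, 4), (8, 4), (15, 8), (24, 4), (24, 12)]
print(forgy(data, seeds=[(4, 4), (8, 4)]))
# -> centroids (6, 4) and (21, 8); clusters {(4,4),(8,4)} and {(15,8),(24,4),(24,12)}
```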
Forgy’s Algorithm: Example
Sample   x    y
1        4    4
2        8    4
3       15    8
4       24    4
5       24   12
• Set k=2 which will produce two clusters, and use the first two samples (4,4) and (8,4) in the list as
seed points.
• In this algorithm, the samples will be denoted by their feature vectors rather than their sample
numbers, to aid in the computation.
• For step 2, find the nearest cluster centroid for each sample.
Sample     Nearest cluster centroid
(4,4)      (4,4)
(8,4)      (8,4)
(15,8)     (8,4)
(24,4)     (8,4)
(24,12)    (8,4)
• The clusters {(4,4)} and {(8,4), (15,8), (24,4), (24,12)} are produced.
• For step 4, compute the centroids of the clusters. The centroids of the first and second clusters are
(4,4) and (17.75, 7), since (8+15+24+24)/4 = 17.75 and (4+8+4+12)/4 = 7.
• Returning to step 2, find the cluster centroid nearest each sample. The table shows the results:

Sample     Nearest cluster centroid
(4,4)      (4,4)
(8,4)      (4,4)
(15,8)     (17.75, 7)
(24,4)     (17.75, 7)
(24,12)    (17.75, 7)

• The clusters {(4,4), (8,4)} and {(15,8), (24,4), (24,12)} are produced.
• Again for step 4, compute the centroids (6,4) and (21,8) of the clusters. Since the sample (8,4)
changed clusters, return to step 2.
Sample     Nearest cluster centroid
(4,4)      (6,4)
(8,4)      (6,4)
(15,8)     (21,8)
(24,4)     (21,8)
(24,12)    (21,8)
• Find the cluster centroid nearest each sample. The table shows the results.
• The clusters {(4,4), (8,4)} and {(15,8), (24,4), (24,12)} are obtained.
• For step 4, compute the centroids (6,4) and (21,8) of the clusters.
• Since no sample changed clusters, the algorithm terminates.
Partitional Clustering: k-means Algorithm
• An alternative version, the k-means algorithm, iterates step 2. Specifically, step 2 is replaced by the
following steps 2 through 4:
• 2. For each sample, find the centroid nearest it. Put the sample in the cluster identified with this
nearest centroid.
• 3. If no samples changed clusters, stop.
• 4. Recompute the centroids of altered clusters and go to step 2.
K-means Algorithm: Example
• Set k = 2 and assume that the data are ordered so that the first two samples are (8,4) and (24,4).
• For step 1, begin with two clusters {(8,4)} and {(24,4)}, which have centroids at (8,4) and (24,4). For
each of the remaining three samples, find the centroid nearest it, put the sample in this cluster, and
recompute the centroid of this cluster.
• The next sample (15,8) is nearest the centroid (8,4), so it joins cluster {(8,4)}.
• At this point, the clusters are {(8,4), (15,8)} and {(24,4)}. The centroid of the first cluster is
updated to (11.5, 6), since (8+15)/2 = 11.5 and (4+8)/2 = 6.
• The next sample (4,4) is nearest the centroid (11.5, 6), so it joins cluster {(8,4), (15,8)}. At this point,
the clusters are {(8,4), (15,8), (4,4)} and {(24,4)}. The centroid of the first cluster is updated to (9,
5.3).
• The next sample (24,12) is nearest the centroid (24,4), so it joins cluster {(24,4)}. At this point, the
clusters are {(8,4), (15,8), (4,4)} and {(24,12), (24,4)}. The centroid of the second cluster is
updated to (24,8). At this point, step 1 of the algorithm is complete.
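The incremental first pass just described (updating a centroid as soon as a sample joins its cluster) can be sketched as follows; it reproduces the intermediate centroids of the example. The helper name is invented.

```python
import math

def kmeans_first_pass(samples, k):
    """Incremental pass: seed with the first k samples, then add the rest,
    updating the centroid of the receiving cluster after each addition."""
    clusters = [[s] for s in samples[:k]]
    centroids = [s for s in samples[:k]]
    for s in samples[k:]:
        j = min(range(k), key=lambda i: math.dist(s, centroids[i]))
        clusters[j].append(s)
        centroids[j] = tuple(sum(x) / len(clusters[j]) for x in zip(*clusters[j]))
    return clusters, centroids

data = [(8, 4), (24, 4), (15, 8), (4, 4), (24, 12)]
print(kmeans_first_pass(data, 2))
# clusters {(8,4),(15,8),(4,4)} and {(24,4),(24,12)};
# centroids (9, 5.33) and (24, 8), as in the example
```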
• For step 2, examine the samples one by one and put each one in the cluster identified with the
nearest centroid. As the table shows, in this case no sample changes clusters.
• The resulting clusters are {(8,4), (15,8), (4,4)} and {(24,12), (24,4)}.
Sample     Distance to centroid (9, 5.3)     Distance to centroid (24, 8)
(8,4)      1.6                               16.5
(24,4)     15.1                              4.0
(15,8)     6.6                               9.0
(4,4)      5.2                               20.4
(24,12)    16.4                              4.0
• The goal of Forgy's algorithm and the k-means algorithm is to minimize the squared error for a fixed
number of clusters. These algorithms assign samples to clusters so as to reduce the squared error
and, in the iterative versions, they stop when no further reduction occurs.
• However, to achieve reasonable computation time, they do not consider all possible clusterings. For
this reason, they sometimes terminate with a clustering that achieves only a local minimum of the
squared error.
• Furthermore, in general, the clusterings that these algorithms generate depend on the choice of the
seed points.
• If Forgy's algorithm is applied to the original data using (8,4) and (24,4) as seed points, the
algorithm terminates with the clusters {(4,4), (8,4), (15,8)} and {(24,4), (24,12)}.
• This is different from the clustering produced earlier. This clustering has a squared error of
104.7, whereas the earlier Forgy clustering has a squared error of 94.
• The clustering above is only a local minimum; the earlier clustering can be shown to produce
the global minimum.
• For a given set of seed points, the resulting clusters may also depend on the order in which the points
are checked.
Neural Network: Introduction
• More than 2000 years ago, our ancestors started to explore the architecture and behavior
of the human brain.
• Ramón y Cajal and Hebb continued the work of Aristotle and tried to build an artificial "thinking
machine".
• Based on information about the functions of the brain, and the quest for a mathematical
model of our learning habits, a new technology, Artificial Neural Networks, was started.
• Our brain can process information quickly and accurately. You can recognize your friend's voice in a
noisy railway station. How is the brain able to process the voice signal mixed with noise and
retrieve the original signal?
• Can we duplicate this amazing process with a machine? Can we make a machine duplicate
some learning habits of a human? Can a machine be made to learn from experience?
• We will get answers during the study of neural networks.
Neural Network: Definition
• An artificial neural network is an information processing system that has been developed as a
generalization of the mathematical model of human cognition (the sense of knowing).
• A neural network is a network of interconnected neurons, inspired by studies of the biological
nervous system. In other words, a neural network functions in a way similar to the human brain.
• The function of a neural network is to produce an output pattern when presented with an input
pattern.
• The field of neural networks studies networks of nodes connected by adaptable weights, which
store experiential knowledge from task examples through a process of learning.
• The nodes are adaptable; they acquire knowledge through changes in the connection weights
as they are exposed to samples.
Neural Network: Biological Neural Net.
• Neural network architectures are motivated by models of the human brain and nerve cells. Our
current knowledge of the human brain is limited to its anatomical and physiological information.
• Neuron (from Greek, meaning nerve cell) is the fundamental unit of the brain. The neuron is a
complex biochemical and electrical signal processing unit that receives and combines signals from
many other neurons through filamentary input paths, the dendrites (Greek: tree links).
• A biological neuron has three types of components namely dendrites, soma and axon. Dendrites are
bunched into highly complex "dendritic trees", which have an enormous total surface area. The
dendrites receive signals from other neurons.
• Dendritic trees are connected with the main body of the neuron called the soma (Greek: body).
• The soma has a pyramidal or cylindrical shape. The soma sums the incoming signals. When
sufficient input is received, the cell fires.
• The output area of the neuron is a long fiber called the axon. The impulse signal triggered by the cell is
transmitted over the axon to other cells.
• The connecting point between a neuron's axon and another neuron's dendrite is called a synapse
(Greek: contact). The impulse signals are transmitted across the synaptic gap by means of a
chemical process.
• A single neuron may have 1000 to 10000 synapses and may be connected with around 1000 neurons.
There are 100 billion neurons in our brain, and each neuron has 1000 dendrites.
Neural Network: Artificial Neuron
• The artificial neuron (also called a processing element or node) mimics the characteristics of the
biological neuron. A processing element possesses a local memory and carries out localized
information processing operations.
• The artificial neuron has a set of n inputs xi, each representing the output of another neuron.
• The subscript i in xi takes values between 1 and n and indicates the source of the input signal.
• The inputs are collectively referred to as X.
• Each input is weighted before it reaches the main body of the processing element by the connection
strength, or weight factor (or simply weight), analogous to the synaptic strength.
• The information about the input that is required to solve a problem is stored in the form of
weights. Each signal is multiplied by an associated weight w1, w2, w3, ..., wn before it is applied to
the summing block.
• In addition, the artificial neuron has a bias term w0, a threshold value θ that has to be reached or
exceeded for the neuron to produce a signal, a nonlinear function F that acts on the produced signal
net, and an output y after the nonlinearity function.
• The following relations describe the transfer function of the basic neuron model:

y = F(net),

where

net = w0 + x1 w1 + x2 w2 + x3 w3 + ... + xn wn,

or equivalently, with x0 = 1,

net = x0 w0 + x1 w1 + ... + xn wn = sum of xi wi for i = 0, ..., n.

• The neuron's output is then y = net when the activation function is linear, or y = F(net) when the
activation function is nonlinear.
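A single neuron with this transfer function takes only a few lines; the weights, bias, and step nonlinearity below are illustrative choices, not values from the text.

```python
def neuron(x, w, w0, F):
    """Basic neuron: net = w0 + sum_i x_i * w_i, output y = F(net)."""
    net = w0 + sum(xi * wi for xi, wi in zip(x, w))
    return F(net)

step = lambda net: 1 if net >= 0 else 0     # illustrative nonlinearity F

# Invented weights: this neuron fires when x1 + x2 exceeds 1.5
print(neuron([1, 1], w=[1, 1], w0=-1.5, F=step))   # -> 1
print(neuron([1, 0], w=[1, 1], w0=-1.5, F=step))   # -> 0
```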
Neural Network: Classification
• Artificial neural networks can be classified on the basis of
1. Pattern of connection between neurons, (architecture of the network)
2. Activation function applied to the neurons
3. Method of determining weights on the connection (training method)
Neural Network: ARCHITECTURE
• The neurons are assumed to be arranged in layers, and the neurons in the same layer behave in the
same manner.
• All the neurons in a layer usually have the same activation function. Within each layer, the neurons
are either fully interconnected or not connected at all.
• The neurons in one layer can be connected to neurons in another layer.
• The arrangement of neurons into layers and the connection pattern within and between layers is
known as network architecture.
Input layer:
• The neurons in this layer receive the external input signals and perform no computation, but simply
transfer the input signals to the neurons in another layer.
Output layer:
• The neurons in this layer receive signals from neurons in either the input layer or the hidden layer.
Hidden layer:
• The layer of neurons that are connected in between the input layer and the output layer is known as
hidden layer.
• Neural nets are often classified as single layer networks or multilayer networks.
• The number of layers in a net can be defined as the number of layers of weighted interconnection
links between various layers.
• While determining the number of layers, the input layer is not counted as a layer, because it does not
perform any computation.
• The architectures of a single layer and a multilayer neural network are described below. (Figures omitted.)
Single Layer Network
• A single layer network consists of one layer of connection weights. The net consists of a layer of
units called input layer, which receive signals from the outside world and a layer of units called
output layer from which the response of the net can be obtained.
• This type of network can be used for pattern classification problems
Multilayer Network:
• A multilayer network consists of one or more layers of units (called hidden layers) between the input
and output layers. Multilayer networks may be formed by simply cascading a group of layers; the
output of one layer provides the input to the subsequent layer.
• A multilayer net with a nonlinear activation function can solve any type of problem.
• However, training a multilayer neural network is very difficult.
Neural Network: ACTIVATION FUNCTIONS
• The purpose of a nonlinear activation function is to ensure that the neuron's response is bounded; that
is, the actual response of the neuron is conditioned, or damped, as a result of large or small activating
stimuli and is thus controllable.
• Further, in order to achieve the advantages of multilayer nets compared with the limited capabilities
of single layer networks, nonlinear functions are required.
• Different nonlinear functions are used, depending upon the paradigm and the algorithm used for
training the network.
• The various activation functions are:
• Identity function (linear function): the identity function can be expressed as

f(x) = x for all x.

• Binary step function: the binary step function is defined as

f(x) = 1 if x >= θ, and f(x) = 0 if x < θ,

where θ is the threshold value.
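The two activation functions named above can be written directly; theta is the threshold value from the neuron model.

```python
def identity(x):
    # Identity (linear) function: f(x) = x for all x
    return x

def binary_step(x, theta=0.0):
    # Binary step: f(x) = 1 if x >= theta, else 0
    return 1 if x >= theta else 0

print(identity(0.7))           # 0.7
print(binary_step(0.7, 0.5))   # 1
print(binary_step(0.3, 0.5))   # 0
```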
Training an Artificial Neural Network
• The most important characteristic of an artificial neural network is its ability to learn.
• Generally, learning is a process by which a neural network adapts itself to a stimulus by properly
making parameter adjustments and producing a desired response.
• Learning (training) is a process in which the network adjusts its parameters (the synaptic weights) in
response to input stimuli so that the actual output response converges to the desired output response.
• When the actual output response is the same as the desired one, the network has completed the
learning phase and the network has acquired knowledge.
• Learning or training algorithms can be categorized as:
 Supervised training
 Unsupervised training
 Reinforced training
Supervised Training:
• Supervised training requires the pairing of each input vector with a target vector representing the
desired output. These two vectors together are termed a training pair.
• During the training session an input vector is applied to the net, and it results in an output vector.
• This response is compared with the target response. If the actual response differs from the target, the
net will generate an error signal.
• This error signal is then used to calculate the adjustment that should be made in the synaptic weights
so that the actual output matches the target output.
• The error minimization in this kind of training requires a supervisor or teacher, hence the name
supervised training.
• In artificial neural networks, the calculation required to minimize the error depends on the
algorithm used, which is normally based on optimization techniques.
• Supervised training methods are used to perform nonlinear mappings in pattern classification nets,
pattern association nets, and multilayer neural nets.
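As one concrete (and here hypothetical) instance of such error-driven weight adjustment, the sketch below applies a delta-style update w <- w + eta * (t - y) * x for a single training pair; the learning rate and data are invented.

```python
# One supervised update step: adjust weights in proportion to the error (t - y).
# Learning rate and training pair are invented for illustration.

x = [1.0, 0.5, -0.3]      # input vector
t = 1.0                   # target output
w = [0.2, -0.1, 0.4]      # current synaptic weights
eta = 0.1                 # learning rate

y = sum(xi * wi for xi, wi in zip(x, w))             # actual (linear) response
error = t - y                                        # error signal
w = [wi + eta * error * xi for wi, xi in zip(w, x)]  # weight adjustment

print(f"output {y:.3f}, error {error:.3f}, new weights {w}")
```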
Unsupervised Training:
• Unsupervised training is employed in self-organizing nets and it does not require a teacher.
• In this method, the input vectors of similar types are grouped without the use of training data to
specify how a typical member of each group looks or to which group a member belongs.
• During training the neural network receives input patterns and organizes them into
categories. When a new input pattern is applied, the neural network provides an output response
indicating the class to which the input pattern belongs.
• If a class cannot be found for the input pattern, a new class is generated.
• Even though unsupervised training does not require a teacher, it requires certain guidelines to form
groups.
• Grouping can be done based on color, shape, or any other property of the object. If no guidelines are
given, grouping may or may not be successful.
Reinforced Training
• Reinforced training is similar to supervised training. In this method, the teacher does not indicate
how close the actual output is to the desired output, but yields only a pass or fail indicator. Thus,
the error signal generated during reinforced training is binary.
McCulloch-Pitts Neuron Model
Warren McCulloch and Walter Pitts presented the first mathematical model of a single biological neuron
in 1943. This model is known as the McCulloch-Pitts model.
• This model does not require learning or adaptation, and its neurons are binary: if a neuron
fires, it has an activation of 1; otherwise, it has an activation of 0.
• The neurons are connected by excitatory or inhibitory weights. Excitatory connections have positive
weights, and inhibitory connections have negative weights.
• All the excitatory connections into a particular neuron have the same weight. Each neuron has a fixed
threshold such that if the net input to the neuron is greater than the threshold, the neuron fires.
• The threshold is set such that inhibition is absolute: any nonzero inhibitory input
will prevent the neuron from firing.
Implementation of McCulloch-Pitts Networks for logic functions
1. AND Function
2. OR Function
3. NOT Function
4. AND NOT Function
5. XOR Function
(The network diagrams from the original slides are omitted; a sketch of the usual constructions follows.)
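Since the original network diagrams are not reproduced here, the sketch below uses weights and thresholds commonly chosen in textbook presentations of these networks (an assumption, not the figures from the slides); XOR is built from two AND NOT units feeding an OR unit, since a single McCulloch-Pitts neuron cannot compute XOR.

```python
def mp_neuron(inputs, weights, theta):
    """McCulloch-Pitts neuron: fires (1) iff the net input reaches the threshold."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= theta else 0

# Commonly used weights/thresholds (assumed, not from the original figures):
AND     = lambda x1, x2: mp_neuron([x1, x2], [1, 1],  theta=2)
OR      = lambda x1, x2: mp_neuron([x1, x2], [2, 2],  theta=2)
NOT     = lambda x1:     mp_neuron([x1],     [-1],    theta=0)
AND_NOT = lambda x1, x2: mp_neuron([x1, x2], [2, -1], theta=2)  # x1 AND (NOT x2)

def XOR(x1, x2):
    # XOR as a two-layer network: (x1 AND NOT x2) OR (x2 AND NOT x1)
    return OR(AND_NOT(x1, x2), AND_NOT(x2, x1))

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", AND(x1, x2), OR(x1, x2), AND_NOT(x1, x2), XOR(x1, x2))
```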
Applications of Neural Networks
• There have been many impressive demonstrations of artificial neural networks. A few of the areas
where neural networks are applied are mentioned below.
Classification
• Classification is an important aspect of, for example, image analysis. Neural networks have been used
successfully in a large number of classification tasks, including:
(a) Recognition of printed or handwritten characters.
(b) Classification of SONAR and RADAR signals.
Signal Processing
• In digital communication systems, distorted signals cause intersymbol interference.
• One of the first commercial applications of ANNs was noise suppression (adaptive noise
cancellation), implemented by Widrow using the ADALINE.
• The ADALINE is trained to remove the noise from the telephone line signal.
Speech Recognition
• In recent years, speech recognition has received enormous attention.
• It involves three modules: the front end, which samples the speech signal and extracts the
data; the word processor, which finds the probability of words in the vocabulary; and the sentence
processor, which determines the sense of the sentence.
Other Application Areas
• Medicine
• Intelligent control
• Function Approximation
• Financial Forecasting
• Condition Monitoring
• Process Monitoring and Control
• Neuro Forecasting
• Pattern Analysis
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
TechSoup
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Excellence Foundation for South Sudan
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
Scholarhat
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
eBook.com.bd (প্রয়োজনীয় বাংলা বই)
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
Nicholas Montgomery
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
Celine George
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
History of Stoke Newington
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
chanes7
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Akanksha trivedi rama nursing college kanpur.
 

Recently uploaded (20)

Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
 

Pattern recognition

  • 1. 1 @ Ashek Mahmud Khan; Dept. of CSE (JUST); 01725-402592 01725-402592
  • 2. 2 @ Ashek Mahmud Khan; Dept. of CSE (JUST); 01725-402592 Pattern recognition is a branch of machine learning that focuses on the recognition of patterns and regularities in data, although it is in some cases considered to be nearly synonymous with machine learning. Pattern recognition systems are in many cases trained from labeled "training" data. Pattern recognition is the scientific discipline that concerns the description and classification of patterns.  Decision making  Object and pattern recognition. Pattern Recognition applications Build a machine that can recognize patterns:  Speech recognition  Fingerprint identification  OCR (Optical Character Recognition)  DNA sequence identification  Text Classification Basic Structure The task of the pattern recognition system is to classify an object into a correct class based on the measurements about the object. Note that possible classes are usually well-defined already before the design of the pattern recognition system. Many pattern recognition systems can be thought to consist of five stages: 1. Sensing (measurement); 2. Pre-processing and segmentation; 3. Feature extraction; 4. Classification; 5. Post-processing Sensing Sensing refers to some measurement or observation about the object to be classified. For example, the data can consist of sounds or images and sensing equipment can be a microphone array or a camera. Pre-processing Pre-processing refers to filtering the raw data for noise suppression and other operations performed on the raw data to improve its quality. In segmentation, the measurement data is partitioned so that each part represents exactly one object to be classified. For example in address recognition, an image of the whole address needs to be divided to images representing just one character.
  • 3. 3 @ Ashek Mahmud Khan; Dept. of CSE (JUST); 01725-402592 Feature extraction Feature extraction, especially when dealing with pictorial information the amount of data per one object can be huge. A high resolution facial photograph (for face recognition) can contain 1024*1024 pixels. Classification The classifier takes as an input the feature vector extracted from the object to be classified. It places then the feature vector (i.e. the object) to class that is the most appropriate one. In address recognition, the classifier receives the features extracted from the sub-image containing just one character and places it to one of the following classes: ‟A‟,‟B‟,‟C‟..., ‟0‟,‟1‟,...,‟9‟. The classifier can be thought as a mapping from the feature space to the set of possible classes. Post-processing A pattern recognition system rarely exists in a vacuum. The final task of the pattern recognition system is to decide upon an action based on the classification result(s). A simple example is a bottle recycling machine, which places bottles and cans to correct boxes for further processing. The Design Cycle • Data collection • Feature Choice • Model Choice • Training • Evaluation • Computational Complexity Data Collection How do we know when we have collected an adequately large and representative set of examples for training and testing the system? Feature Choice Depends on the characteristics of the problem domain. Simple to extract, invariant to irrelevant transformation insensitive to noise Model Choice Unsatisfied with the performance of our fish classifier and want to jump to another class of model. Training Use data to determine the classifier. Many different procedures for training classifiers and choosing models
Evaluation
 Measure the error rate
 Try different feature sets
 Try different training methods
 Try different training and test data sets

Computational Complexity
What is the trade-off between computational ease and performance?

Statistical Decision Making
Parametric Decision Making
We know, or are willing to assume, the general form of the probability distribution function or density function for each class, but not the values of its parameters, such as the mean or variance.
Non-Parametric Decision Making
We do not have a sufficient basis for assuming even the general form of the relevant densities.

Bayes' Theorem
• Bayesian decision making refers to choosing the most likely class, given the value of the feature or features.
• The probability of class membership is calculated from Bayes' Theorem.
• Let the feature value be x and the class of interest be C. Then:
• P(x) is the probability distribution of x in the entire population.
• P(C) is the prior probability that a random sample is a member of class C.
• P(x|C) is the conditional probability of obtaining x given that the sample is from class C.
• We want to estimate the probability P(C|x) that a sample belongs to class C, given that it has the feature value x.

Conditional Probability
• The probability that A occurs given that B has occurred is denoted by P(A|B), and is read as "P of A given B".
• Since we know in advance that B has occurred, P(A|B) is the fraction of B in which A also occurs. Thus
P(A|B) = P(A and B) / P(B)
• Similarly, P(B|A) = P(A and B) / P(A), so that
P(A and B) = P(B) P(A|B) = P(A) P(B|A)
• The conditional probability that a sample comes from class C and has the feature value x is therefore
P(C and x) = P(C) P(x|C) = P(x) P(C|x)
• Rearranging gives
P(C|x) = P(C) P(x|C) / P(x)
which is known as Bayes' Theorem. The variable x can represent a single feature or a feature vector.

Bayes' Theorem for k classes
• Let C1, ..., Ck be mutually exclusive classes, i.e. they do not overlap and every sample belongs to exactly one of the classes.
• If a sample may belong to one of the classes A or B, to both, or to neither, then four new mutually exclusive classes C1, C2, C3 and C4 can be defined by
C1 = A and B, C2 = A and not B, C3 = not A and B, C4 = not A and not B
• Thus k non-exclusive classes can define up to 2^k mutually exclusive classes.
• Bayes' Theorem for multiple features is obtained by replacing the value of the single feature x by the value of a feature vector x.
• In the discrete case, if there are k mutually exclusive classes, P(x) expands by the law of total probability and we obtain
P(Ci|x) = P(Ci) P(x|Ci) / [ P(C1) P(x|C1) + ... + P(Ck) P(x|Ck) ]
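A minimal numeric sketch of this calculation, with made-up priors and class-conditional probabilities (the class names and numbers below are illustrative, not from the notes):

```python
# Bayes' theorem for k classes: P(Ci|x) = P(Ci) * P(x|Ci) / sum_j P(Cj) * P(x|Cj)

priors = {"C1": 0.6, "C2": 0.3, "C3": 0.1}       # P(Ci), assumed for illustration
likelihoods = {"C1": 0.2, "C2": 0.5, "C3": 0.9}  # P(x|Ci) for the observed value x

evidence = sum(priors[c] * likelihoods[c] for c in priors)  # P(x)
posteriors = {c: priors[c] * likelihoods[c] / evidence for c in priors}

print(posteriors)                            # {'C1': 0.333.., 'C2': 0.416.., 'C3': 0.25}
print(max(posteriors, key=posteriors.get))   # Bayesian decision: most likely class, C2
```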
Nonparametric Decision Making
Nearest Neighbor Classification Techniques
The Single Nearest Neighbor Technique
• The single nearest neighbor technique bypasses the problem of probability densities entirely: it simply classifies an unknown sample as belonging to the same class as the most similar, or "nearest", sample point in the training set of data, which is often called a reference set.
• "Nearest" can mean the smallest Euclidean distance in n-dimensional feature space. For two points a = (a1, ..., an) and b = (b1, ..., bn) it is defined by
d_E(a, b) = sqrt( (a1 - b1)^2 + ... + (an - bn)^2 )
where n is the number of features.
• Although Euclidean distance is the most commonly used measure of dissimilarity/similarity between feature vectors, it is not always the best metric: squaring the differences before summation places extra emphasis on features with large dissimilarity.
• A more moderate approach is simply the sum of the absolute differences in each feature, which also saves computing time:
d_CB(a, b) = |a1 - b1| + ... + |an - bn|
• The sum of absolute distances is sometimes called the city block distance, the Manhattan metric, or the taxi-cab distance, because it resembles the distance between two locations in a city: on a rectangular street grid, the number of blocks north (or south) plus the number of blocks east (or west) equals the total distance traveled.
• An extreme metric, which considers only the most dissimilar pair of features, is the maximum distance metric
d_M(a, b) = max over i of |ai - bi|
• A generalization of the three distances is the Minkowski distance, defined by
d_r(a, b) = ( |a1 - b1|^r + ... + |an - bn|^r )^(1/r)
where r is an adjustable parameter: r = 1 gives the city block distance, r = 2 the Euclidean distance, and r approaching infinity the maximum metric. The four metrics are compared in the sketch below.
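A short sketch of these four metrics (the function and variable names are my own, not from the notes):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def city_block(a, b):            # Manhattan / taxi-cab distance
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def maximum(a, b):               # considers only the most dissimilar feature
    return max(abs(ai - bi) for ai, bi in zip(a, b))

def minkowski(a, b, r):          # r=1: city block, r=2: Euclidean, r->inf: maximum
    return sum(abs(ai - bi) ** r for ai, bi in zip(a, b)) ** (1.0 / r)

# Samples 1 and 3 from the clustering example below: d_E((4,4),(15,8)) = 11.7
print(round(euclidean((4, 4), (15, 8)), 1))   # 11.7
print(city_block((4, 4), (15, 8)))            # 15
print(maximum((4, 4), (15, 8)))               # 11
```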
Clustering
• Clustering refers to the process of grouping samples so that the samples are similar within each group; the groups are called clusters.
• Clustering can be divided into two major types, hierarchical and partitional clustering. Hierarchical clustering algorithms can be further divided into agglomerative and divisive.
• Hierarchical clustering refers to a process that organizes data into large groups, which contain smaller groups, and so on.
• A hierarchical clustering is usually drawn as a tree, or dendrogram, in which the finest grouping, where each sample forms its own cluster, is at the bottom.
• Hierarchical clustering algorithms are called agglomerative if they build the dendrogram from the bottom up, and divisive if they build the dendrogram from the top down.
• Agglomerative clustering of n samples proceeds as follows:
1. Begin with n clusters, each consisting of one sample.
2. Repeat step 3 a total of n-1 times.
3. Find the most similar clusters Ci and Cj and merge them into one cluster. If there is a tie, merge the first pair found.

Hierarchical Clustering
• One way to measure the similarity between clusters is to define a function that measures the distance between clusters.
• In cluster analysis, nearest neighbor techniques are used to measure the distance between pairs of samples.

The Single-Linkage Algorithm
• It is also known as the minimum method or the nearest neighbor method.
• The single-linkage algorithm is obtained by defining the distance between two clusters as the smallest distance between two points such that one point is in each cluster. Formally, if Ci and Cj are clusters, the distance between them is defined as
D(Ci, Cj) = min d(a, b) over all a in Ci and b in Cj
where d(a, b) denotes the distance between the samples a and b.

Hierarchical Clustering: The Single-Linkage Algorithm Example
• Perform hierarchical clustering of five samples with two features, using Euclidean distance as the distance between two samples.

 Sample   x    y
 1        4    4
 2        8    4
 3        15   8
 4        24   4
 5        24   12
• The Euclidean distances between all pairs of samples give the initial distance matrix:

        1      2      3      4      5
 1      -     4.0   11.7   20.0   21.5
 2     4.0     -     8.1   16.0   17.9
 3    11.7    8.1     -     9.8    9.8
 4    20.0   16.0    9.8     -     8.0
 5    21.5   17.9    9.8    8.0     -

• The smallest distance is 4.0, between clusters {1} and {2}, so they are merged. The number of clusters becomes four: {1,2}, {3}, {4}, {5}.
• The distances are d(1,3) = 11.7 and d(2,3) = 8.1; for the single-linkage algorithm the distance between clusters {1,2} and {3} is the minimum of the two, 8.1, and so on:

        {1,2}    3      4      5
 {1,2}    -     8.1   16.0   17.9
 3       8.1     -     9.8    9.8
 4      16.0    9.8     -     8.0
 5      17.9    9.8    8.0     -

• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged. At this level there are three clusters: {1,2}, {3}, {4,5}.

        {1,2}    3    {4,5}
 {1,2}    -     8.1   16.0
 3       8.1     -     9.8
 {4,5}  16.0    9.8     -

• Since the minimum value in this step is 8.1, clusters {1,2} and {3} are merged. Now there are two clusters: {1,2,3} and {4,5}.
• The final step merges the two remaining clusters at a distance of 9.8, which completes the dendrogram.
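This clustering can be reproduced with SciPy's hierarchical clustering routines; the sketch below is my own illustration and assumes SciPy (and, for plotting, matplotlib) is available:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# The five samples from the worked example
X = np.array([[4, 4], [8, 4], [15, 8], [24, 4], [24, 12]])

# Single-linkage agglomerative clustering on Euclidean distances
Z = linkage(X, method="single", metric="euclidean")
print(Z)  # one row per merge; the merge distances are 4.0, 8.0, 8.1, 9.8 as above

# dendrogram(Z) would draw the tree described in the notes (requires matplotlib)
```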
Hierarchical Clustering: The Complete-Linkage Algorithm
• It is also known as the maximum method or the farthest neighbor method.
• It is obtained by defining the distance between two clusters as the largest distance between a sample in one cluster and a sample in the other cluster. Formally, if Ci and Cj are clusters, we define
D(Ci, Cj) = max d(a, b) over all a in Ci and b in Cj

Hierarchical Clustering: The Complete-Linkage Algorithm Example
• Perform hierarchical clustering of the same five samples with two features, again using Euclidean distance as the distance between two samples.

 Sample   x    y
 1        4    4
 2        8    4
 3        15   8
 4        24   4
 5        24   12
• The initial distance matrix is the same as before. The smallest distance is 4.0, between clusters {1} and {2}, so they are merged, leaving four clusters: {1,2}, {3}, {4}, {5}.
• The distances are d(1,3) = 11.7 and d(2,3) = 8.1; for the complete-linkage algorithm the distance between clusters {1,2} and {3} is the maximum of the two, 11.7, and so on:

        {1,2}    3      4      5
 {1,2}    -    11.7   20.0   21.5
 3      11.7     -     9.8    9.8
 4      20.0    9.8     -     8.0
 5      21.5    9.8    8.0     -

• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged. At this level there are three clusters: {1,2}, {3}, {4,5}.

        {1,2}    3    {4,5}
 {1,2}    -    11.7   21.5
 3      11.7     -     9.8
 {4,5}  21.5    9.8     -

• Since the minimum value in this step is 9.8, clusters {3} and {4,5} are merged. Now there are two clusters: {1,2} and {3,4,5}.
• The final step merges the last two clusters at a distance of 21.5.
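The linkage rules differ only in how they aggregate the pairwise distances between two clusters. A small sketch (helper names are my own) that reproduces the cluster distances between {1,2} and {3} for the single-, complete-, and the average-linkage rule introduced next:

```python
from itertools import product

def single_link(Ci, Cj, d):    # smallest pairwise distance
    return min(d(a, b) for a, b in product(Ci, Cj))

def complete_link(Ci, Cj, d):  # largest pairwise distance
    return max(d(a, b) for a, b in product(Ci, Cj))

def average_link(Ci, Cj, d):   # mean of all ni*nj pairwise distances
    return sum(d(a, b) for a, b in product(Ci, Cj)) / (len(Ci) * len(Cj))

d = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
Ci, Cj = [(4, 4), (8, 4)], [(15, 8)]       # clusters {1,2} and {3}
print(round(single_link(Ci, Cj, d), 1))    # 8.1
print(round(complete_link(Ci, Cj, d), 1))  # 11.7
print(round(average_link(Ci, Cj, d), 1))   # 9.9
```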
The Average-Linkage Algorithm
• The average-linkage algorithm is a compromise between the extremes of the single- and complete-linkage algorithms.
• It is also known as the unweighted pair-group method using arithmetic averages (UPGMA).
• It is obtained by defining the distance between two clusters as the average distance between a sample in one cluster and a sample in the other cluster.
• Formally, if Ci with ni members and Cj with nj members are clusters, we define
D(Ci, Cj) = (1 / (ni nj)) * sum of d(a, b) over all a in Ci and b in Cj
• After the first table of the previous example, the clusters at the second step were {1,2}, {3}, {4}, {5}. At this step, for the average-linkage algorithm, the distance between clusters {1,2} and {3} is the average of the distances d(1,3) = 11.7 and d(2,3) = 8.1, namely 9.9, and so on:

        {1,2}    3      4      5
 {1,2}    -     9.9   18.0   19.7
 3       9.9     -     9.8    9.8
 4      18.0    9.8     -     8.0
 5      19.7    9.8    8.0     -

• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged. The clusters are now {1,2}, {3}, {4,5}.

        {1,2}    3    {4,5}
 {1,2}    -     9.9   18.9
 3       9.9     -     9.8
 {4,5}  18.9    9.8     -

• Since the minimum value in this step is 9.8, clusters {3} and {4,5} are merged. Now there are two clusters: {1,2} and {3,4,5}.
• The final step merges the last two clusters at a distance of 14.4.

Hierarchical Clustering: Ward's Method
• Ward's method is also called the minimum-variance method. It begins with one cluster for each sample.
• At each iteration, among all cluster pairs, it merges the pair that produces the smallest squared error for the resulting set of clusters. The squared error for each cluster is defined as follows:
• Let a cluster contain m samples x1, ..., xm, where xi is the feature vector (xi1, ..., xid).
• The vector composed of the means of each feature is called the mean vector, or centroid, of the cluster.
• The squared error for a cluster is the sum of the squared distances, in each feature, from the cluster members to their mean.
• The squared error is thus equal to the total variance of the cluster times the number of samples m in the cluster, where the total variance is defined as the sum of the variances of each feature. The squared error for a set of clusters is defined as the sum of the squared errors for the individual clusters.

 Sample   x    y
 1        4    4
 2        8    4
 3        15   8
 4        24   4
 5        24   12

• Example: Begin with five clusters, one sample in each; the squared error is 0. There are 10 possible ways to merge a pair of clusters: merge {1} and {2}, merge {1} and {3}, and so on.
• Consider merging {1} and {2}. The feature vector of sample 1 is (4,4) and the feature vector of sample 2 is (8,4), so the feature means are 6 and 4. The squared error for cluster {1,2} is
(4-6)^2 + (8-6)^2 + (4-4)^2 + (4-4)^2 = 8
• The squared error for each of the clusters {3}, {4}, {5} is 0. Thus the total squared error for the clustering {1,2}, {3}, {4}, {5} is 8 + 0 + 0 + 0 = 8.

 Clusters                 Squared error, E
 {1,2},{3},{4},{5}          8.0
 {1,3},{2},{4},{5}         68.5
 {1,4},{2},{3},{5}        200.0
 {1,5},{2},{3},{4}        232.0
 {2,3},{1},{4},{5}         32.5
 {2,4},{1},{3},{5}        128.0
 {2,5},{1},{3},{4}        160.0
 {3,4},{1},{2},{5}         48.5
 {3,5},{1},{2},{4}         48.5
 {4,5},{1},{2},{3}         32.0

• Since the minimum squared error is 8, the merge giving {1,2}, {3}, {4}, {5} is accepted.
• There are 6 possible sets of clusters resulting from {1,2}, {3}, {4}, {5}:

 Clusters               Squared error, E
 {1,2,3},{4},{5}          72.7
 {1,2,4},{3},{5}         224.0
 {1,2,5},{3},{4}         266.7
 {1,2},{3,4},{5}          56.5
 {1,2},{3,5},{4}          56.5
 {1,2},{4,5},{3}          40.0

• From the table, the minimum squared error is 40, obtained for {1,2}, {4,5}, {3}.
• There are 3 possible sets of clusters resulting from {1,2}, {4,5}, {3}; the minimum squared error is 94, obtained for {1,2}, {3,4,5}.
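The squared-error computation is easy to check in code. This sketch (function names are my own) reproduces the values 8 and 94 from the tables above:

```python
import numpy as np

samples = {1: (4, 4), 2: (8, 4), 3: (15, 8), 4: (24, 4), 5: (24, 12)}

def squared_error(cluster):
    """Sum of squared distances, per feature, from each member to the centroid."""
    pts = np.array([samples[i] for i in cluster], dtype=float)
    return ((pts - pts.mean(axis=0)) ** 2).sum()

def total_error(clustering):
    return sum(squared_error(c) for c in clustering)

print(total_error([{1, 2}, {3}, {4}, {5}]))   # 8.0
print(total_error([{1, 2}, {4, 5}, {3}]))     # 40.0
print(total_error([{1, 2}, {3, 4, 5}]))       # 94.0
```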
• At last, the two remaining clusters are merged and the hierarchical clustering is complete.

 Clusters            Squared error, E
 {1,2,3},{4,5}        104.7
 {1,2,4,5},{3}        380.0
 {1,2},{3,4,5}         94.0

Partitional Clustering
• In partitional clustering, the goal is usually to create one set of clusters that partitions the data into similar groups.
• Samples close to one another are assumed to be similar, and the task is to group data that are close together.
• In many cases, the number of clusters to be constructed is specified in advance.
• If a partitional clustering algorithm divides the data set into two groups, then divides each of these further into two parts, and so on, a hierarchical dendrogram can be produced from the top down.
• The hierarchy produced by this divisive technique is more general than the bottom-up hierarchies, because the groups can be divided into more than two subgroups in one step.
• Another advantage of partitional techniques is that often only the top part of the tree, which shows the main groups and possibly their subgroups, is required, so there may be no need to compute the complete dendrogram.

Partitional Clustering: Forgy's Algorithm
• Besides the data, input to the algorithm consists of k, the number of clusters to be constructed, and k samples called seed points. The seed points can be chosen randomly, or some knowledge of the desired cluster structure can be used to guide their selection.
• Step 1. Initialize the cluster centroids to the seed points.
• Step 2. For each sample, find the cluster centroid nearest to it; put the sample in the cluster identified with this nearest centroid.
• Step 3. If no samples changed clusters in step 2, stop.
• Step 4. Compute the centroids of the resulting clusters and go to step 2.

Forgy's Algorithm: Example

 Sample   x    y
 1        4    4
 2        8    4
 3        15   8
 4        24   4
 5        24   12

• Set k = 2, which will produce two clusters, and use the first two samples in the list, (4,4) and (8,4), as seed points.
• In this example the samples are denoted by their feature vectors rather than their sample numbers, to aid in the computation.
• For step 2, find the nearest cluster centroid for each sample:

 Sample     Nearest cluster centroid
 (4,4)      (4,4)
 (8,4)      (8,4)
 (15,8)     (8,4)
 (24,4)     (8,4)
 (24,12)    (8,4)

• The clusters {(4,4)} and {(8,4), (15,8), (24,4), (24,12)} are produced.
• For step 4, compute the centroids of the clusters. The centroids of the first and second clusters are (4,4) and (17.75,7), since (8+15+24+24)/4 = 17.75 and (4+8+4+12)/4 = 7.
• Reassigning each sample to its nearest centroid gives:

 Sample     Nearest cluster centroid
 (4,4)      (4,4)
 (8,4)      (4,4)
 (15,8)     (17.75,7)
 (24,4)     (17.75,7)
 (24,12)    (17.75,7)

• Some samples changed clusters, so return to step 2.
• The table shows the results: the clusters {(4,4), (8,4)} and {(15,8), (24,4), (24,12)} are produced.
• Again for step 4, compute the centroids of the clusters, (6,4) and (21,8). Since the sample (8,4) changed clusters, return to step 2.

 Sample     Nearest cluster centroid
 (4,4)      (6,4)
 (8,4)      (6,4)
 (15,8)     (21,8)
 (24,4)     (21,8)
 (24,12)    (21,8)

• Find the cluster centroid nearest to each sample; the table shows the results.
• The clusters {(4,4), (8,4)} and {(15,8), (24,4), (24,12)} are obtained.
• For step 4, compute the centroids of the clusters, again (6,4) and (21,8).
• Since no sample will change clusters, the algorithm terminates.
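A compact sketch of Forgy's algorithm as described above (assumes NumPy; the function name is my own, and for simplicity the sketch assumes no cluster ever becomes empty):

```python
import numpy as np

def forgy(X, seeds):
    """Forgy's algorithm: batch reassignment, then centroid update, until stable."""
    centroids = np.array(seeds, dtype=float)
    while True:
        # Step 2: assign every sample to the nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: recompute centroids; stop (step 3) once nothing changes,
        # since unchanged assignments imply unchanged centroids
        new_centroids = np.array([X[labels == j].mean(axis=0)
                                  for j in range(len(centroids))])
        if np.allclose(new_centroids, centroids):
            return labels, centroids
        centroids = new_centroids

X = np.array([[4, 4], [8, 4], [15, 8], [24, 4], [24, 12]], dtype=float)
labels, centroids = forgy(X, seeds=[[4, 4], [8, 4]])
print(labels)     # [0 0 1 1 1] -> {(4,4),(8,4)} and {(15,8),(24,4),(24,12)}
print(centroids)  # [[ 6.  4.] [21.  8.]]
```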
Partitional Clustering: k-Means Algorithm
• An alternative version, the k-means algorithm, iterates within step 2 of Forgy's algorithm. Specifically, step 2 is replaced by the following steps 2 through 4:
• Step 2. For each sample, find the centroid nearest to it; put the sample in the cluster identified with this nearest centroid.
• Step 3. If no samples changed clusters, stop.
• Step 4. Recompute the centroids of the altered clusters and go to step 2.

k-Means Algorithm: Example
• Set k = 2 and assume that the data are ordered so that the first two samples are (8,4) and (24,4).
• For step 1, begin with the two clusters {(8,4)} and {(24,4)}, which have centroids at (8,4) and (24,4). For each of the remaining three samples, find the centroid nearest to it, put the sample in that cluster, and recompute the centroid of that cluster.
• The next sample, (15,8), is nearest to the centroid (8,4), so it joins the cluster {(8,4)}.
• At this point, the clusters are {(8,4), (15,8)} and {(24,4)}. The centroid of the first cluster is updated to (11.5,6), since (8+15)/2 = 11.5 and (4+8)/2 = 6.
• The next sample, (4,4), is nearest to the centroid (11.5,6), so it joins the cluster {(8,4), (15,8)}. At this point, the clusters are {(8,4), (15,8), (4,4)} and {(24,4)}. The centroid of the first cluster is updated to (9, 5.3).
• The next sample, (24,12), is nearest to the centroid (24,4), so it joins the cluster {(24,4)}. At this point, the clusters are {(8,4), (15,8), (4,4)} and {(24,12), (24,4)}, and the centroid of the second cluster is updated to (24,8). Step 1 of the algorithm is now complete.
• For step 2, examine the samples one by one and put each one in the cluster identified with the nearest centroid. As the table shows, in this case no sample changes clusters.
• The resulting clusters are {(8,4), (15,8), (4,4)} and {(24,12), (24,4)}.

 Sample    Distance to centroid (9, 5.3)   Distance to centroid (24, 8)
 (8,4)       1.6                            16.5
 (24,4)     15.1                             4.0
 (15,8)      6.6                             9.0
 (4,4)       5.2                            20.4
 (24,12)    16.4                             4.0

• The goal of Forgy's algorithm and the k-means algorithm is to minimize the squared error for a fixed number of clusters. These algorithms assign samples to clusters so as to reduce the squared error and, in the iterative versions, they stop when no further reduction occurs.
• However, to achieve reasonable computation time, they do not consider all possible clusterings. For this reason, they sometimes terminate with a clustering that achieves only a local minimum of the squared error.
• Furthermore, in general, the clusterings that these algorithms generate depend on the choice of the seed points.
• If Forgy's algorithm is applied to the original data using (8,4) and (24,4) as seed points, the algorithm terminates with the clusters {(4,4), (8,4), (15,8)} and {(24,4), (24,12)}.
• This differs from the clustering produced by Forgy's algorithm earlier: this clustering has a squared error of 104.7, whereas the earlier Forgy clustering has a squared error of 94.
• The clustering above produces a local minimum, whereas the earlier Forgy clustering can be shown to produce the global minimum.
• For a given set of seed points, the resulting clusters may also depend on the order in which the samples are examined.
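A sketch of the incremental build-up phase (step 1) of the k-means variant described in the example above; the helper name is my own, and the subsequent reassignment pass is omitted for brevity:

```python
import numpy as np

def kmeans_buildup(X, k):
    """Seed with the first k samples, then add one sample at a time,
    updating the centroid of the receiving cluster immediately."""
    clusters = [[x] for x in X[:k]]
    centroids = [np.array(x, dtype=float) for x in X[:k]]
    for x in X[k:]:
        j = int(np.argmin([np.linalg.norm(x - c) for c in centroids]))
        clusters[j].append(x)
        centroids[j] = np.mean(clusters[j], axis=0)  # incremental update
    return clusters, centroids

# Data ordered as in the example, with (8,4) and (24,4) first
X = np.array([[8, 4], [24, 4], [15, 8], [4, 4], [24, 12]], dtype=float)
clusters, centroids = kmeans_buildup(X, k=2)
print(centroids)  # approximately [9, 5.33] and [24, 8], as in the notes
```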
Neural Network: Introduction
• More than 2000 years ago, our ancestors began to explore the architecture and behavior of the human brain.
• Ramón y Cajal and Hebb continued the work of Aristotle and tried to build an artificial "thinking machine".
• Based on information about the functions of the brain, and the quest for a mathematical model of our learning habits, a new technology, Artificial Neural Networks, was born.
• Our brain can process information quickly and accurately. You can recognize your friend's voice in a noisy railway station. How is the brain able to process the voice signal mixed with noise and retrieve the original signal?
• Can we duplicate this amazing process with a machine? Can we make a machine duplicate some learning habits of a human? Can a machine be made to learn from experience?
• We will answer these questions during the study of neural networks.

Neural Network: Definition
• An artificial neural network is an information processing system that has been developed as a generalization of mathematical models of human cognition (the sense of knowing).
• A neural network is a network of interconnected neurons, inspired by studies of the biological nervous system. In other words, a neural network functions in a way similar to the human brain.
• The function of a neural network is to produce an output pattern when presented with an input pattern.
• Neural networks consist of nodes connected by adaptable weights, which store experiential knowledge from task examples through a process of learning.
• The nodes are adaptable; they acquire knowledge through changes in the node weights as they are exposed to samples.

Neural Network: Biological Neural Network
• Neural network architectures are motivated by models of the human brain and nerve cells. Our current knowledge of the human brain is limited to anatomical and physiological information.
• The neuron (from the Greek for nerve cell) is the fundamental unit of the brain. The neuron is a complex biochemical and electrical signal processing unit that receives and combines signals from many other neurons through filamentary input paths, the dendrites (Greek: tree links).
• A biological neuron has three types of components: dendrites, soma and axon. Dendrites are bunched into highly complex "dendritic trees", which have an enormous total surface area. The dendrites receive signals from other neurons.
• Dendritic trees are connected to the main body of the neuron, called the soma (Greek: body).
• The soma has a pyramidal or cylindrical shape. The soma sums the incoming signals; when sufficient input is received, the cell fires.
• The output area of the neuron is a long fiber called the axon. The impulse signal triggered by the cell is transmitted over the axon to other cells.
• The connecting point between a neuron's axon and another neuron's dendrite is called a synapse (Greek: contact). The impulse signals are transmitted across the synaptic gap by means of a chemical process.
• A single neuron may have 1000 to 10000 synapses and may be connected to around 1000 other neurons. There are about 100 billion neurons in our brain, and each neuron has on the order of 1000 dendrites.

Neural Network: Artificial Neuron
• The artificial neuron (also called a processing element or node) mimics the characteristics of the biological neuron. A processing element possesses a local memory and carries out localized information processing operations.
• The artificial neuron has a set of n inputs xi, each representing the output of another neuron. The subscript i in xi takes values between 1 and n and indicates the source of the input signal. The inputs are collectively referred to as X.
• Each input is weighted before it reaches the main body of the processing element by the connection strength, or weight factor (or simply weight), analogous to the synaptic strength. The amount of information about the input that is required to solve a problem is stored in the form of weights. Each signal is multiplied by an associated weight w1, w2, w3, ..., wn before it is applied to the summing block.
• In addition, the artificial neuron has a bias term w0, a threshold value θ that has to be reached or exceeded for the neuron to produce a signal, a nonlinear function F that acts on the produced signal net, and an output y after the nonlinearity function.
• The following relations describe the transfer function of the basic neuron model:
y = F(net)
where
net = w0 + x1 w1 + x2 w2 + x3 w3 + ... + xn wn
or, taking x0 = 1,
net = sum of xi wi for i = 0 to n
• The neuron firing condition is
sum of xi wi >= 0 (for a linear activation function, with x0 = 1)
or
y = F(net) (for a nonlinear activation function)
• A code sketch of this basic neuron model is given below, after the following classification list.

Neural Network: Classification
• Artificial neural networks can be classified on the basis of:
1. The pattern of connection between neurons (the architecture of the network)
2. The activation function applied to the neurons
3. The method of determining the weights on the connections (the training method)
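A minimal sketch of the basic neuron model above; the example inputs, weights and the choice of activation functions are illustrative, not from the notes:

```python
import math

def neuron(x, w, F):
    """Basic neuron: net = w0 + sum_i x_i * w_i, output y = F(net)."""
    net = w[0] + sum(xi * wi for xi, wi in zip(x, w[1:]))
    return F(net)

step = lambda net: 1 if net >= 0 else 0              # linear/threshold firing rule
sigmoid = lambda net: 1.0 / (1.0 + math.exp(-net))   # a common nonlinear F

x = [0.5, -1.0, 2.0]          # inputs x1..x3 (illustrative)
w = [0.1, 0.4, 0.3, 0.2]      # bias w0 followed by weights w1..w3
print(neuron(x, w, step))     # net = 0.1 + 0.2 - 0.3 + 0.4 = 0.4 >= 0 -> fires, 1
print(neuron(x, w, sigmoid))  # 0.598...
```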
Neural Network: Architecture
• The neurons are assumed to be arranged in layers, and the neurons in the same layer behave in the same manner.
• All the neurons in a layer usually have the same activation function. Within each layer, the neurons are either fully interconnected or not connected at all.
• The neurons in one layer can be connected to neurons in another layer.
• The arrangement of neurons into layers, and the connection pattern within and between layers, is known as the network architecture.

Input layer:
• The neurons in this layer receive the external input signals and perform no computation; they simply transfer the input signals to the neurons in the next layer.
Output layer:
• The neurons in this layer receive signals from neurons in either the input layer or the hidden layer.
Hidden layer:
• A layer of neurons connected between the input layer and the output layer is known as a hidden layer.

• Neural nets are often classified as single layer networks or multilayer networks.
• The number of layers in a net is defined as the number of layers of weighted interconnection links between the layers.
• When counting the number of layers, the input layer is not counted, because it performs no computation.

Single Layer Network
• A single layer network consists of one layer of connection weights. The net consists of a layer of units called the input layer, which receives signals from the outside world, and a layer of units called the output layer, from which the response of the net can be obtained.
• This type of network can be used for pattern classification problems.
Multilayer Network
• A multilayer network consists of one or more layers of units (called hidden layers) between the input and output layers. Multilayer networks may be formed by simply cascading a group of layers: the output of one layer provides the input to the subsequent layer.
• A multilayer net with a nonlinear activation function can solve any type of problem.
• However, training a multilayer neural network is very difficult.

Neural Network: Activation Functions
• The purpose of a nonlinear activation function is to ensure that the neuron's response is bounded; that is, the actual response of the neuron is conditioned, or damped, for large or small activating stimuli, and is thus controllable.
• Further, in order to achieve the advantages of multilayer nets over the limited capabilities of single layer networks, nonlinear functions are required.
• Different nonlinear functions are used, depending upon the paradigm and the algorithm used for training the network.
• The various activation functions include:
• Identity function (linear function):
f(x) = x for all x
• Binary step function, with threshold θ:
f(x) = 1 if x >= θ, and f(x) = 0 if x < θ
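A sketch of the activation functions defined above, plus the sigmoid commonly used in multilayer nets (the sigmoid is my addition; the original slide's plots of these functions are not reproduced here):

```python
import math

def identity(x):                 # linear: f(x) = x
    return x

def binary_step(x, theta=0.0):   # fires (1) once the threshold theta is reached
    return 1 if x >= theta else 0

def sigmoid(x):                  # smooth, bounded in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

for net in (-2.0, 0.0, 2.0):
    print(net, identity(net), binary_step(net), round(sigmoid(net), 3))
    # -2.0 -> 0, 0.119 ; 0.0 -> 1, 0.5 ; 2.0 -> 1, 0.881
```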
Training an Artificial Neural Network
• The most important characteristic of an artificial neural network is its ability to learn.
• Generally, learning is a process by which a neural network adapts itself to a stimulus by making the proper parameter adjustments and producing the desired response.
• Learning (training) is a process in which the network adjusts its parameters (the synaptic weights) in response to input stimuli, so that the actual output response converges to the desired output response.
• When the actual output response is the same as the desired one, the network has completed the learning phase and has acquired knowledge.
• Learning or training algorithms can be categorized as:
 Supervised training
 Unsupervised training
 Reinforced training

Supervised Training:
• Supervised training requires the pairing of each input vector with a target vector representing the desired output; together, these two vectors are termed a training pair.
• During the training session, an input vector is applied to the net, and it results in an output vector.
• This response is compared with the target response. If the actual response differs from the target, the net generates an error signal.
• This error signal is then used to calculate the adjustment that should be made to the synaptic weights, so that the actual output matches the target output.
• The error minimization in this kind of training requires a supervisor, or teacher, hence the name supervised training.
• In artificial neural networks, the calculation required to minimize the error depends on the algorithm used, which is normally based on optimization techniques.
• Supervised training methods are used to perform nonlinear mappings in pattern classification nets, pattern association nets and multilayer neural nets.

Unsupervised Training:
• Unsupervised training is employed in self-organizing nets, and it does not require a teacher.
• In this method, input vectors of similar types are grouped without the use of training data that specify how a typical member of each group looks or to which group a member belongs.
• During training, the neural network receives input patterns and organizes these patterns into categories. When a new input pattern is applied, the neural network provides an output response indicating the class to which the input pattern belongs.
• If a class cannot be found for the input pattern, a new class is generated.
• Even though unsupervised training does not require a teacher, it requires certain guidelines to form groups. Grouping can be done based on color, shape or any other property of the object; if no guidelines are given, grouping may or may not be successful.

Reinforced Training
• Reinforced training is similar to supervised training. In this method, however, the teacher does not indicate how close the actual output is to the desired output, but yields only a pass or fail indicator. Thus, the error signal generated during reinforced training is binary.

McCulloch-Pitts Neuron Model
Warren McCulloch and Walter Pitts presented the first mathematical model of a single biological neuron in 1943. This model is known as the McCulloch-Pitts model.
• This model requires no learning or adaptation, and the neurons are binary activated: if the neuron fires, it has an activation of 1; otherwise, it has an activation of 0.
• The neurons are connected by excitatory or inhibitory weights. An excitatory connection has a positive weight, and an inhibitory connection has a negative weight.
• All the excitatory connections into a particular neuron have the same weight. Each neuron has a fixed threshold such that if the net input to the neuron is greater than the threshold, the neuron fires.
• The threshold is set such that the inhibition is absolute: any non-zero inhibitory input will prevent the neuron from firing.
Implementation of McCulloch-Pitts Networks for Logic Functions
The original slides give the network diagrams for the following logic functions (the diagrams are not reproduced here):
1. AND function
2. OR function
3. NOT function
4. AND NOT function
5. XOR function
A code sketch of these networks is given below.
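A sketch of McCulloch-Pitts units for these functions. The weights and thresholds are my own standard textbook choices, since the slide diagrams are not reproduced here; XOR cannot be computed by a single unit, so it is built as a two-layer net from two AND NOT units and an OR unit:

```python
def mp_neuron(inputs, weights, theta):
    """McCulloch-Pitts unit: fires (1) iff the weighted sum reaches threshold theta."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= theta else 0

def AND(x1, x2):      return mp_neuron([x1, x2], [1, 1], theta=2)
def OR(x1, x2):       return mp_neuron([x1, x2], [1, 1], theta=1)
def NOT(x):           return mp_neuron([x], [-1], theta=0)
def AND_NOT(x1, x2):  return mp_neuron([x1, x2], [1, -1], theta=1)  # x1 and not x2
def XOR(x1, x2):      return OR(AND_NOT(x1, x2), AND_NOT(x2, x1))   # two-layer net

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", AND(a, b), OR(a, b), AND_NOT(a, b), XOR(a, b))
```

Note how the inhibitory weight -1 in AND_NOT makes the inhibition absolute: whenever x2 = 1, the net input cannot reach the threshold, so the unit cannot fire.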
Applications of Neural Networks
• There have been many impressive demonstrations of artificial neural networks. A few areas where neural networks are applied are mentioned below.
Classification
• Classification is an important aspect of many applications, such as image processing. Neural networks have been used successfully in a large number of classification tasks, which include:
(a) Recognition of printed or handwritten characters.
(b) Classification of SONAR and RADAR signals.

Signal Processing
• In digital communication systems, distorted signals cause inter-signal interference.
• One of the first commercial applications of ANNs was noise suppression and cancellation, implemented by Widrow using the ADALINE.
• The ADALINE is trained to remove the noise from the telephone line signal.

Speech Recognition
• In recent years, speech recognition has received enormous attention.
• It involves three modules: the front end, which samples the speech signal and extracts the data; the word processor, which finds the probability of words in the vocabulary; and the sentence processor, which determines the sense of the sentence.

Other application areas include:
• Medicine
• Intelligent control
• Function approximation
• Financial forecasting
• Condition monitoring
• Process monitoring and control
• Neuro-forecasting
• Pattern analysis